Difference between Gaussian Random Walk and Gaussian Process

Hey, Team!

Can you please elaborate on the difference between Gaussian Random Walks and Gaussian Processes?

While they seem to serve more or less the same purpose (with use cases like modelling correlated effects in space/time), I understand that they are specified differently in PyMC.

For example, unlike Gaussian Processes, GRWs require an initial distribution and do not require specifying a covariance function.

Can you please help me understand when one would use each of them?

Thanks in advance!


Gaussian Random Walk is a particular case of Gaussian process


Yes, you can view a GRW as a special case of a GP. Its Markov structure corresponds to a banded (tridiagonal) precision matrix, so a point x_i is only directly influenced by its immediate neighbors, not by data points further away. GPs therefore give you more flexibility to include higher-order effects.
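A minimal numpy sketch of this equivalence (not PyMC-specific, and the σ = 1 value is just an illustration): a standard GRW at time t is the sum of the first t innovations, so viewed as a GP its covariance function is Cov(x_s, x_t) = σ² min(s, t). Sampling the walk directly reproduces that covariance matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
sigma = 1.0
n = 5
t = np.arange(1, n + 1)  # times 1..n

# GRW written as a GP kernel: Cov(x_s, x_t) = sigma^2 * min(s, t)
K = sigma**2 * np.minimum.outer(t, t)

# Empirical covariance from simulating the walk directly as a cumulative sum
draws = np.cumsum(rng.normal(0.0, sigma, size=(200_000, n)), axis=1)
K_emp = np.cov(draws, rowvar=False)

print(np.round(K, 2))
print(np.round(K_emp, 2))  # close to K, up to Monte Carlo error
```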

At PyMC Labs, we almost never use GRWs anymore, but frequently use GPs.


Thanks a lot everybody!
Makes a lot of sense!

A Gaussian Random Walk is a time series model, where the next observation in an ordered sequence depends only on the previous observation:

\begin{align} x_{t+1} &= x_t + \varepsilon_t \\ \varepsilon_t &\sim N(0, \sigma_{\varepsilon}) \\ x_0 &\sim N(\mu_0, \sigma_0) \end{align}

So the initial distribution tells you where the process begins, x_0; from there it marches on into the future.
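The three equations above can be sketched in plain numpy (the values for \mu_0, \sigma_0, and \sigma_\varepsilon here are arbitrary illustrations, not defaults from any library):

```python
import numpy as np

rng = np.random.default_rng(42)
mu0, sigma0 = 0.0, 1.0   # initial distribution: x_0 ~ N(mu0, sigma0)
sigma_eps = 0.5          # innovation scale sigma_eps
T = 100

x = np.empty(T + 1)
x[0] = rng.normal(mu0, sigma0)                    # draw the starting point
for t in range(T):
    x[t + 1] = x[t] + rng.normal(0.0, sigma_eps)  # x_{t+1} = x_t + eps_t

# Equivalently, vectorized: the walk is the start plus a cumulative sum of innovations
x_vec = x[0] + np.concatenate([[0.0], np.cumsum(rng.normal(0.0, sigma_eps, T))])
```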

A Gaussian Process, on the other hand, is something like a prior over functions. It uses a “covariance function” to define a notion of similarity on the space \mathbb R^n, and this notion of similarity is used to generate smooth functions that interpolate between datapoints. I think this video is a nice high-level introduction.
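A hedged sketch of that “prior over functions” idea, using a hand-rolled squared-exponential (ExpQuad-style) kernel rather than any particular library's API: each draw from the resulting multivariate normal is one smooth random function.

```python
import numpy as np

rng = np.random.default_rng(0)

def exp_quad(xa, xb, lengthscale=1.0, eta=1.0):
    """Squared-exponential kernel: similarity decays smoothly with distance."""
    d2 = (xa[:, None] - xb[None, :]) ** 2
    return eta**2 * np.exp(-0.5 * d2 / lengthscale**2)

x = np.linspace(0, 10, 50)
K = exp_quad(x, x) + 1e-9 * np.eye(len(x))  # small jitter for numerical stability

# Each row of f is one sample function from the GP prior, evaluated at x
f = rng.multivariate_normal(np.zeros(len(x)), K, size=3)
```

Nearby inputs get highly correlated outputs under this kernel, which is exactly what makes the sampled functions smooth.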

As noted above, GPs are very flexible, so a GRW can be written as a GP, but the reverse is not true. It would also be overkill to use a GP to model a GRW – GPs are far more computationally intensive, especially for high-dimensional problems (e.g. if you wanted time-varying priors over a large number of parameters).


To add to this discussion, here’s a small extract from @ferrine’s seminar, where he mentions that Gaussian Random Walks (unlike many Gaussian Processes) are not stationary, so one cannot interpret the series as an ‘average point’ with deviations around it.


Perhaps a bit pedantic, but a GP can also be non-stationary – it depends on the kernel function. See this discussion.

In general, though, I agree that this is an important difference, since non-stationarity is the defining feature of a GRW.
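One quick way to see the kernel-dependence: a stationary kernel depends only on the distance |s − t|, so shifting both inputs leaves it unchanged, while a non-stationary kernel (e.g. a simple linear/dot-product kernel) does not. A toy check, with hand-rolled kernels rather than any library's:

```python
import numpy as np

def exp_quad(s, t, ell=1.0):
    # Stationary: depends only on the distance |s - t|
    return np.exp(-0.5 * (s - t) ** 2 / ell**2)

def linear(s, t, c=0.0):
    # Non-stationary: depends on the input locations themselves
    return (s - c) * (t - c)

# Shifting both inputs by 4 leaves the stationary kernel unchanged...
print(exp_quad(1.0, 2.0), exp_quad(5.0, 6.0))  # equal
# ...but changes the non-stationary one
print(linear(1.0, 2.0), linear(5.0, 6.0))      # different
```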

Just want to add that a GRW doesn’t revert to its mean; it can wander off in any direction. A GP (usually with a Matern-family covariance like ExpQuad), on the other hand, will revert to its mean as fast as the lengthscale allows. These differences in behavior are pretty important for forecasting, and you may often prefer a GRW.
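This difference shows up directly in the marginal variances: a GRW's variance grows linearly with t (paths wander off), while a stationary ExpQuad-style GP has constant marginal variance (paths keep getting pulled back toward the mean). A simulation sketch in plain numpy, with an arbitrary lengthscale of 5:

```python
import numpy as np

rng = np.random.default_rng(1)
n_draws, T = 20_000, 50
sigma = 1.0

# GRW: marginal variance at time t is t * sigma^2, so paths wander off
grw = np.cumsum(rng.normal(0.0, sigma, size=(n_draws, T)), axis=1)
print(grw.var(axis=0)[[0, 24, 49]])  # roughly [1, 25, 50]

# Stationary GP (squared-exponential kernel): marginal variance stays
# constant, so paths revert toward the mean at the lengthscale's pace
t = np.arange(T, dtype=float)
K = np.exp(-0.5 * (t[:, None] - t[None, :]) ** 2 / 5.0**2) + 1e-9 * np.eye(T)
gp = rng.multivariate_normal(np.zeros(T), K, size=n_draws)
print(gp.var(axis=0)[[0, 24, 49]])  # all roughly 1
```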


Which kernels have no or less strong tendencies for mean reversion?

There are some trivial examples (like Periodic or Cosine), but I don’t think those are what you’re after. Could you maybe be more specific, Bryan? Then I could try to give a better answer.