Rolling Regression in PyMC


I’m wondering if there’s an equivalent way in PyMC of doing Rolling OLS as described in the statsmodels package:

Rolling OLS applies OLS across a fixed window of observations and then rolls (moves or slides) the window across the data set. The key parameter is window, which determines the number of observations used in each OLS regression. By default, RollingOLS drops missing values in the window and so will estimate the model using the available data points.

Estimated values are aligned so that models estimated using data points i+1, i+2, ..., i+window are stored in location i+window.

Overall, instead of modeling the parameters of a linear model with a Gaussian Random Walk (like here), we fit several linear models on consecutive fixed-length windows of data and keep track of the parameters estimated on each of these windows.

Any help is appreciated.

Sounds like an Autoregressive process? pymc.AR — PyMC 5.0.2 documentation

It’s not necessarily an autoregressive process: we are fitting a model of the form y_W = X_W\beta_W + \epsilon that only covers a window W of the data, and the design matrix does not necessarily contain lags of y. As the window moves, a new vector \beta_W is computed and stored.
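For concreteness, here is what rolling OLS computes, sketched in plain NumPy (a toy re-implementation for illustration, not the statsmodels code): each position stores the coefficients of an ordinary least-squares fit on the trailing window ending there.

```python
import numpy as np

def rolling_ols(y, x, window):
    """Fit y ~ alpha + beta * x on each trailing window of `window` points.
    Entry i holds the fit on observations i-window+1, ..., i (statsmodels'
    alignment); positions before the first full window stay NaN."""
    n = len(y)
    alphas = np.full(n, np.nan)
    betas = np.full(n, np.nan)
    for end in range(window, n + 1):
        X = np.column_stack([np.ones(window), x[end - window:end]])
        (a, b), *_ = np.linalg.lstsq(X, y[end - window:end], rcond=None)
        alphas[end - 1], betas[end - 1] = a, b
    return alphas, betas

# Noise-free toy data: every full window recovers alpha=1, beta=2 exactly
x = np.arange(10.0)
y = 1.0 + 2.0 * x
alphas, betas = rolling_ols(y, x, window=4)
```

With noise-free data every window returns the same coefficients; on real data each window gives its own \beta_W, which is exactly the time-varying output the thread is after.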

Rolling regression basically just runs a bunch of independent regressions and collects the results. You can do that easily in PyMC:

from functools import reduce
import pymc as pm
import arviz as az

# Other imports + data preparation as from the statsmodels example

def make_window_data(series, window_size):
    n = series.shape[0]
    col_name = series.name
    windowed_list = [series.reset_index(drop=True).to_frame()
                           .iloc[i:window_size + i]
                           .rename(columns={col_name: i})
                     for i in range(n)]
    # Join the shifted windows side by side; column i holds the window starting at i
    windowed_df = reduce(lambda left, right: left.join(right.reset_index(drop=True)), windowed_list)
    return windowed_df
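As a sanity check on the reshaping, here is the helper run on a toy six-point series with a window of 3 (the function is repeated so the snippet stands alone):

```python
from functools import reduce
import pandas as pd

def make_window_data(series, window_size):
    n = series.shape[0]
    col_name = series.name
    windowed_list = [series.reset_index(drop=True).to_frame()
                           .iloc[i:window_size + i]
                           .rename(columns={col_name: i})
                     for i in range(n)]
    return reduce(lambda left, right: left.join(right.reset_index(drop=True)),
                  windowed_list)

s = pd.Series(range(6), name='y')
w = make_window_data(s, 3).dropna(axis=1, how='any')
# Column i is the window starting at observation i; shape is (window, n - window + 1)
print(w.shape)   # (3, 4)
```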

windowed_endog = make_window_data(endog, 60)
windowed_exog = make_window_data(exog['Mkt-RF'], 60)

# Statsmodels drops the ends (because there's missing data) but in principle you don't have to do this
windowed_endog.dropna(axis=1, how='any', inplace=True)
windowed_exog.dropna(axis=1, how='any', inplace=True)

This gives you a window_size × (n - window_size + 1) array of data that you’ll use to do a bunch of regressions. Then just run them all on the same MCMC chain:

coords = {'regression': windowed_endog.columns}

with pm.Model(coords=coords) as mod:
    betas = pm.Normal('beta', dims=['regression'])
    alphas = pm.Normal('alpha', dims=['regression'])
    sigmas = pm.HalfCauchy('sigmas', beta=0.8, dims=['regression'])
    mus = alphas + betas * windowed_exog.values
    obs = pm.Normal('obs', mu=mus, sigma=sigmas, observed=windowed_endog.values)
    idata = pm.sample()

The “key move” is that I use broadcasting to compute mu: the betas will be broadcast over the columns, so each column of mu corresponds to an independent regression. PyMC will also interpret this as a collection of independent normal distributions for you when you hand it to pm.Normal to compute the likelihood.
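The broadcasting at play, in plain NumPy with made-up numbers: a length-n_regressions vector of slopes multiplies a (window, n_regressions) matrix column-wise.

```python
import numpy as np

alphas = np.array([0.0, 1.0])          # one intercept per regression
betas = np.array([2.0, -1.0])          # one slope per regression
X = np.arange(6.0).reshape(3, 2)       # (window=3, n_regressions=2)

# (2,) broadcasts against (3, 2): column j gets alphas[j] + betas[j] * X[:, j]
mu = alphas + betas * X
```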

Results for the first 25 regressions (with statsmodels estimates as red triangles):


Thanks @jessegrabowski, this is super helpful. I ran the code and it works nicely. I’m trying to replicate the plot that you shared. I managed to plot a subset of the parameters using this code:

az.plot_forest(
    idata.posterior.stack(sample=("chain", "draw")).sel(regression=[0, 1, 2, 3, 4]),
    var_names=["alpha", "beta"],
)

However, I do not know how you added the parameters from the RollingOLS models and reduced the number of chains to a single one. Can you give me a hint on this?

All the arviz plot functions return an array of matplotlib Axes objects, so if you save it you can continue plotting on them as usual. I get the y-tick locations using yaxis.get_major_ticks()[i].get_loc(), which, combined with the parameter value itself, lets you do a scatterplot:

import numpy as np

ax = az.plot_forest(idata, var_names=['alpha', 'beta'],
                    coords={'regression': np.arange(25)}, combined=True)
y_vals = reversed(list(map(lambda i: ax[0].yaxis.get_major_ticks()[i].get_loc(), np.arange(50))))
for i, y in enumerate(y_vals):
    # params holds the RollingOLS estimates from the statsmodels example
    data = params.dropna().iloc[i % 25, i // 25]
    ax[0].scatter(data, y, marker='^', zorder=100, color='tab:red')
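The tick-location trick in isolation, on a toy horizontal bar chart (no arviz involved):

```python
import matplotlib
matplotlib.use('Agg')  # headless backend so this runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.barh(['a', 'b', 'c'], [1, 2, 3])
fig.canvas.draw()  # make sure tick positions are finalized

# One y location per tick label, reusable for overlaying scatter points
y_locs = [t.get_loc() for t in ax.yaxis.get_major_ticks()]
for y, val in zip(y_locs, [1.0, 2.0, 3.0]):
    ax.scatter(val, y, marker='^', color='tab:red', zorder=100)
```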

I see. Awesome! One last question out of curiosity: the current model definition says that alpha and beta are independent, but in theory it is possible to write them as a multivariate normal distribution, i.e.

coords = {
    'regression': windowed_endog.columns,
    'params': ['alpha', 'beta'],
}
coefs = pm.MvNormal('coefs', mu=mu, chol=chol, dims=['regression', 'params'])

Overall, this should enable us to provide some more prior information to these parameters. Does this make sense at all even though the columns in the data matrix represent independent regressions?

Yeah, you can definitely get creative with the priors. The regressions then aren’t independent anymore, but that should be seen as an advantage of the Bayesian method over vanilla OLS (why should they be independent?)

Thanks, Jesse! Really appreciate your help. It has been very insightful.
