I’ll preface this by saying that I struggle to use pm.AR in general, which is a personality flaw. Therefore, my approach here might be shooting a fly with a bazooka.
That said, parallel AR is a special case of a VAR, with all the cross-equation terms set to zero. A VAR can be represented as 2D convolution, which aesara has nice built-in tools to handle “easily”, once your head is wrapped around the filter shape stuff. I go over the details of this here, if anyone is interested.
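To make the "VAR as 2D convolution" claim concrete, here's a pure-NumPy sketch (a stand-in for the aesara op; all names here are mine) showing that sliding a lag window over the data and applying a diagonal filter reproduces the explicit VAR one-step predictions:

```python
import numpy as np

rng = np.random.default_rng(0)
n_series, n_lags, T = 2, 2, 10
rho = np.array([0.3, 0.1])  # one shared coefficient per lag
data = rng.normal(size=(T, n_series))

# Diagonal VAR filter (no cross-series terms), one slice per lag:
# filt[l] is the coefficient matrix applied to the (l+1)-th lag.
filt = np.eye(n_series)[None] * rho[:, None, None]  # (n_lags, n_series, n_series)

# Explicit one-step predictions: mu[t] = sum_l A_l @ y[t - 1 - l]
mu_loop = np.stack([
    sum(filt[l] @ data[t - 1 - l] for l in range(n_lags))
    for t in range(n_lags, T)
])

# The same thing as a "valid" correlation over sliding (n_lags, n_series)
# windows. Each window holds the lags oldest-first, so the filter has to be
# flipped -- the same flip applied to VAR_filters in the model below.
windows = np.stack([data[t - n_lags:t] for t in range(n_lags, T)])
mu_conv = np.einsum("tls,lds->td", windows, filt[::-1])

assert np.allclose(mu_loop, mu_conv)
```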
Here’s how I attacked your problem as posed:
```python
import aesara.tensor as at
import pymc as pm

# Assumed setup (not shown here): df is a 2-column DataFrame of the simulated
# series, with n_series = 2 and order = 1, matching the sampling output below.
with pm.Model(coords={"time": df.index.values[1:], "series": [0, 1]}) as m:
    rho = pm.Normal("rho", shape=order)

    # Filter shape is (output_channels, input_channels, filter_rows, filter_cols).
    # For timeseries stuff:
    #   output_channels is n_series (one VAR equation per series)
    #   input_channels is always 1 (the series live in the columns, not the channels)
    #   filter_rows is n_lags
    #   filter_cols is n_series
    VAR_filters = at.expand_dims(at.eye(n_series) * rho, axis=(1, -2))

    init = pm.Normal("x0", mu=0, sigma=1, shape=n_series)
    sigma = pm.HalfNormal("sigma", size=1)
    data = pm.Data("data", df.values[order:, :], mutable=True)
    extended_data = at.concatenate([init[None], data], axis=0)

    # Each convolution produces the prediction for time t+1. As a result we have
    # to drop the last row of data, since we don't have an observation that goes
    # with it. For example, the first 4 x 3 block of data makes a prediction for
    # t=5 (4 lags). Also flip the lags so the first row of the filter is the 1st
    # lag, rather than the last. I think it's more intuitive that way.
    VAR_equations = at.nnet.conv2d(
        input=at.expand_dims(extended_data[:-1], axis=(0, 1)),  # add batch_size = channels = 1
        filters=VAR_filters[:, :, ::-1, :],
        border_mode="valid",
        filter_flip=False,
    )
    mean = VAR_equations[0, :, :, 0].T
    obs = pm.Normal("obs", mu=mean, sigma=sigma, observed=data, dims=["time", "series"])

with m:
    trace = pm.sample()
```
In this case I’ve restricted both time series to a shared AR parameter, but this framework lets you estimate an arbitrary number of series and parameters. It could even handle heterogeneous AR orders across the series, if you had the patience to build the filter tensor.
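On the heterogeneous-order point, the trick is just zero-padding: pad every series' lag polynomial to the maximum order and put it on the diagonal of its own filter slice. A minimal NumPy sketch (a hypothetical mixed-order setup; names and values are mine, not from the post):

```python
import numpy as np

# Hypothetical mixed orders: series 0 is AR(2), series 1 is AR(1).
rho_per_series = [np.array([0.3, 0.1]),  # series 0: two lags
                  np.array([0.5])]       # series 1: one lag
n_series = len(rho_per_series)
max_lags = max(len(r) for r in rho_per_series)

# Zero-pad each lag polynomial to max_lags; the zeros mean the
# shorter-order series simply ignores the extra lags.
filt = np.zeros((max_lags, n_series, n_series))
for i, r in enumerate(rho_per_series):
    filt[: len(r), i, i] = r
```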
Here’s the output to show that it recovers the true parameters:
```
           mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
rho[0]    0.293  0.006   0.281    0.305      0.000    0.000    3291.0    1730.0    1.0
x0[0]     0.408  0.330  -0.195    1.033      0.006    0.005    3380.0    1503.0    1.0
x0[1]    -0.546  0.325  -1.131    0.090      0.006    0.005    2669.0    1435.0    1.0
sigma[0]  0.100  0.001   0.099    0.101      0.000    0.000    3142.0    1526.0    1.0
```