AR dims/coords/shape in v4

I have multiple time series of the same length that share the same AR parameters, and I’m struggling to figure out the shapes/dims situation in v4.

Here is a toy example:

import pandas as pd
import pymc as pm
import numpy as np
import arviz as az

def create_ar_series(length, rho, sigma):
    res = []
    lastval = 0
    for i in np.arange(length):
        nextval = lastval * rho + np.random.normal(loc=0, scale=sigma)
        lastval = nextval
        res.append(lastval)
    return res

# multiple series, same AR params
df = pd.DataFrame({'res1': create_ar_series(10000, .3, .1),
                   'res2': create_ar_series(10000, .3, .1), })

with pm.Model(coords={"time": df.index.values, "series": [0, 1]}) as m:
    rho = pm.Normal("rho", shape=1)
    init = pm.Normal.dist(0, shape=1)
    sigma = pm.HalfNormal('sigma', size=1)
    ar1 = pm.AR("ar1", rho=rho, sigma=sigma, init_dist=init,
                constant=False, observed=df[['res1', 'res2']].values, dims=("time", "series",))

with m:
    trace = pm.sample()

… results in this error:

~/.virtualenvs/m6_v3/lib/python3.9/site-packages/aesara/tensor/type.py in filter_variable(self, other, allow_convert)
    261                 return other2
    262 
--> 263         raise TypeError(
    264             f"Cannot convert Type {other.type} "
    265             f"(of Variable {other}) into Type {self}. "

TypeError: Cannot convert Type TensorType(float64, (10000, 2)) (of Variable ar1{[[-0.08402..06180672]]}) into Type TensorType(float64, (1, None)). You can try to manually convert ar1{[[-0.08402..06180672]]} into a TensorType(float64, (1, None)).

I’ve read the dimensionality docs but still couldn’t solve this.

time has to be the last dimension, but it seems you added it as the first?
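
For reference, a minimal single-series model following that convention might look something like this (an untested sketch; the variable names are mine):

with pm.Model(coords={"time": df.index.values}) as m_single:
    rho = pm.Normal("rho")
    sigma = pm.HalfNormal("sigma")
    init = pm.Normal.dist(0, 1)
    # single series: "time" is the only, and therefore last, dimension
    ar1 = pm.AR("ar1", rho=rho, sigma=sigma, init_dist=init,
                observed=df["res1"].values, dims=("time",))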

Thank you, Ricardo, for the reply. When I put time as the last dimension, I get a different error. (Note that I’m transposing the observed data, so its shape is now (2, 10000).) Using the same imports and data-generating code as above:

with pm.Model(coords={"time": df.index.values, "series": [0, 1]}) as m:
    rho = pm.Normal("rho", shape=1)
    init = pm.Normal.dist(0, shape=1)
    sigma = pm.HalfNormal('sigma', size=1)
    ar1 = pm.AR("ar1", rho=rho, sigma=sigma, init_dist=init,
                constant=False, observed=df[['res1', 'res2']].values.T, dims=("series", "time",))

with m:
    trace = pm.sample()

results in

TypeError: Cannot convert Type TensorType(float64, (2, 10000)) (of Variable ar1{[[-0.12119..07470894]]}) into Type TensorType(float64, (1, None)). You can try to manually convert ar1{[[-0.12119..07470894]]} into a TensorType(float64, (1, None)).

I am pretty naive when it comes to time series modeling, but I think AR is expecting parameters for each series you are observing. If you want to use a single set of parameters for all observed series and do it in a single likelihood, then maybe something like this?

import aesara.tensor as at

with pm.Model(coords={"time": df.index.values, "series": [0, 1]}) as m:
    rho = pm.Normal("rho")
    init = pm.Normal.dist(shape=2)
    sigma = pm.HalfNormal('sigma')
    ar1 = pm.AR("ar1",
                rho=at.stack([rho, rho]),
                sigma=at.stack([sigma, sigma]),
                init_dist=init,
                observed=df[['res1', 'res2']].to_numpy().T,
                dims=("series", "time")
    )

Of course, you should also be able to build 2 separate likelihoods, which should do the same thing, but then you wouldn’t have everything in one giant data structure with series as a dimension.

Thank you for the suggestion. I just tried that, and pymc interprets it as an AR2 model (the length of rho sets the lag order, so stacking two copies of rho reads as two lags).

Hm, I suppose that makes sense. Then I would vote for 2 likelihoods if you need to get something running in the meantime. @RavinKumar might have further suggestions.
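
If it helps, a two-likelihood version might look something like this (an untested sketch; the names are mine):

with pm.Model(coords={"time": df.index.values}) as m:
    rho = pm.Normal("rho")
    sigma = pm.HalfNormal("sigma")
    # one AR likelihood per series, sharing rho and sigma;
    # each AR gets its own unnamed init distribution
    ar_res1 = pm.AR("ar_res1", rho=rho, sigma=sigma, init_dist=pm.Normal.dist(0, 1),
                    observed=df["res1"].values, dims=("time",))
    ar_res2 = pm.AR("ar_res2", rho=rho, sigma=sigma, init_dist=pm.Normal.dist(0, 1),
                    observed=df["res2"].values, dims=("time",))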

I’ll preface this by saying that I struggle to use pm.AR in general, which is a personality flaw. Therefore, my approach here might be shooting a fly with a bazooka.

That said, parallel AR is a special case of a VAR, with all the cross-equation terms set to zero. A VAR can be represented as a 2D convolution, which aesara has nice built-in tools to handle “easily”, once you’ve wrapped your head around the filter shape stuff. I go over the details of this here, if anyone is interested.

Here’s how I attacked your problem as posed:

order = 1      # AR lag order (this example is AR(1))
n_series = 2   # number of parallel series

with pm.Model(coords={"time": df.index.values[order:], "series": [0, 1]}) as m:

    rho = pm.Normal("rho", shape=order)

    # conv2d filter shape is (output_channels, input_channels, filter_rows, filter_cols)
    # for this timeseries setup:
    #   output_channels is n_series (one output per equation)
    #   input_channels is 1 (the whole data matrix enters as a single channel)
    #   filter_rows is n_lags
    #   filter_cols is n_series
    # A diagonal filter (zero cross-equation terms) makes this parallel ARs rather than a full VAR.
    VAR_filters = at.expand_dims(at.eye(n_series) * rho, axis=(1, -2))
    
    init = pm.Normal('x0', mu=0, sigma=1, shape=n_series)
    sigma = pm.HalfNormal('sigma', size=1)
    
    data = pm.Data('data', df.values[order:, :], mutable=True)
    extended_data = at.concatenate([init[None], data], axis=0)
    
    # Each convolution produces the prediction for time t+1, so we have to drop the last row of the
    # data, since we don't have an observation that goes with it. For example, with 4 lags and 3
    # series, the first 4 x 3 block of data makes a prediction for t=5.

    # Flip the lag axis of the filter so its first row is the 1st lag, rather than the last.
    # I think it's more intuitive that way.
    VAR_equations = at.nnet.conv2d(input=at.expand_dims(extended_data[:-1], axis=(0, 1)),  # add batch and input-channel dims (both 1)
                                   filters=VAR_filters[:, :, ::-1, :],
                                   border_mode='valid', 
                                   filter_flip=False)
    
    mean = VAR_equations[0, :, :, 0].T
    obs = pm.Normal('obs', mu=mean, sigma=sigma, observed=data, dims=['time', 'series'])
    
with m:
    trace = pm.sample()

In this case I’ve restricted both time series to a shared AR parameter, but this framework would let you estimate an arbitrary number of series and parameters. It could even handle heterogeneous AR order between the series, if you had the patience to build the filter tensor.
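
For example, the heterogeneous-order case could zero-pad the shorter lag polynomials (a numpy sketch with made-up coefficients; the lag ordering would have to match the flipping convention used in the model above):

import numpy as np

max_lag, n_series = 2, 2
rho_s0 = np.array([0.3, 0.1])  # hypothetical AR(2) coefficients for series 0
rho_s1 = np.array([0.5])       # hypothetical AR(1) coefficient for series 1

# filter shape: (output_channels=n_series, input_channels=1, filter_rows=max_lag, filter_cols=n_series)
filters = np.zeros((n_series, 1, max_lag, n_series))
filters[0, 0, :len(rho_s0), 0] = rho_s0  # series 0 loads only on its own lags
filters[1, 0, :len(rho_s1), 1] = rho_s1  # series 1 likewise, zero-padded up to max_lag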

Here’s the output to show that it recovers the true parameters:

           mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
rho[0]    0.293  0.006   0.281    0.305      0.000    0.000    3291.0    1730.0    1.0
x0[0]     0.408  0.330  -0.195    1.033      0.006    0.005    3380.0    1503.0    1.0
x0[1]    -0.546  0.325  -1.131    0.090      0.006    0.005    2669.0    1435.0    1.0
sigma[0]  0.100  0.001   0.099    0.101      0.000    0.000    3142.0    1526.0    1.0

I think you were seeing this bug: Implied dimensions of RVs are not broadcasted by dims or observed · Issue #5993 · pymc-devs/pymc · GitHub

If you pass shape explicitly (shape=(2, 1000)) in addition to dims, I think it will work.

Alternatively, if you remove all the shape=1 arguments (except rho?) from the inputs, it might work without specifying shape at all… basically, without dims and observed, the shape of the AR would be (2000,) and not (1, 2000).

That didn’t seem to work for me.

Did specifying shape work?

Specifying shape=2 worked, but that’s a different model.

I assume you mean shape=(2, 1000).

It is not a different model; why do you say that?

Oh, no, I didn’t try explicitly specifying the shape of the observed data. If ditching the coords were an option, I would just use 2 AR likelihoods, each with the same parameters.

You can use dims; the problem is that, due to the linked bug, shape is not being properly inferred from them, so you need to be redundant and specify shape as well.

Code will benefit from vectorization if you use a single likelihood, so I would advise it.

Specifying both dims=("series", "time") and shape=(2, 1000), I get: AssertionError: Steps do not match last shape dimension

That’s because I missed a zero. That works!

with pm.Model(coords={"time": df.index.values, "series": [0, 1]}) as m:
    rho = pm.Normal("rho")
    init = pm.Normal.dist()
    sigma = pm.HalfNormal('sigma')
    ar1 = pm.AR("ar1",
                rho=rho,
                sigma=sigma,
                init_dist=init,
                observed=df[['res1', 'res2']].to_numpy().T,
                dims=("series", "time"),
                shape=(2, 10000)
    )
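
Sampling and checking the recovered parameters as before (az is the arviz import from the top of the thread):

with m:
    trace = pm.sample()

az.summary(trace, var_names=["rho", "sigma"])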

Thanks everyone.

In the thread for this bug, @michaelosthege writes: "As a general advice to everybody: The best way to avoid shape issues (like the one this issue originally was about) is to not rely on “smart” broadcasting, but do all broadcasts/resizes intentionally."

I’m trying to grok what that would mean in this case.

Well, it’s a PyMC bug; I disagree with the statement.

Users shouldn’t have to overthink things just to work around bugs. In the meantime, using an explicit shape will fix it, but in the future your original example will be perfectly valid.