PyMC3 vs Statsmodels AR(1) model parameters estimation

Hello everyone 🙂

I am struggling to wrap my head around a discrepancy between the estimates produced by statsmodels and by a model I wrote in PyMC3 for an AR(1) process. Given this toy simulated data:

import numpy as np

n = 300

# Each series adds 0.6 times the previous draw to iid normal noise
x = np.random.normal(31, 1, n)
x_ar = np.array([x[i] if i == 0 else x[i] + x[i-1] * .6 for i in range(n)])

y = np.random.normal(30, 1, n)
y_ar = np.array([y[i] if i == 0 else y[i] + y[i-1] * .6 for i in range(n)])

delta = x_ar - y_ar

I would like to reliably estimate the expected value of the difference of the two AR processes, delta (which I know is 1). Calculating the empirical median gives, as expected, a biased estimate.
Statsmodels' AR, on the other hand, retrieves the right values for all the parameters with reasonable success.
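
For reference, the fit looks something like this (a sketch using the current ARIMA interface; order=(1, 0, 0) with trend="c" is an AR(1) with a constant, though the exact call may have differed):

from statsmodels.tsa.arima.model import ARIMA

# Fit an AR(1) with a constant term to delta
res = ARIMA(delta, order=(1, 0, 0), trend="c").fit()
print(res.params)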


On the contrary, my PyMC3 model

import pymc3 as pm

with pm.Model() as ar_model:
    
    const = pm.Normal("const", 0, 1)
    rhos = pm.Normal("rho", 0, .3, shape=1)
    sigma = pm.HalfNormal("sigma", 1)    
    obs = pm.AR(
        "ar", 
        rhos, 
        sigma=sigma, 
        observed=delta - const,  # the constant is subtracted from the data, not modeled inside the AR
        shape=len(delta)
    )
    
    trace = pm.sample()

seems to converge to a biased estimate for const.

Am I missing something in the specification of my model / my understanding of how AR works in PyMC3?

This has been solved by using the implementation of AR in PyMC V4, specifying constant=True, and having rho include the value for the constant (see the sketch below).

I assume (please correct me if I am wrong) the model specification above is fundamentally different and the constant is not included in the AR process.
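
For anyone landing here, a minimal sketch of the working V4 model (my reconstruction of the fix described above, not verbatim from the thread):

import pymc as pm

with pm.Model() as ar_model_v4:
    # With constant=True, rho needs one extra element:
    # rho[0] is the constant, rho[1] is the AR(1) coefficient
    rho = pm.Normal("rho", 0, 1, shape=2)
    sigma = pm.HalfNormal("sigma", 1)
    pm.AR("ar", rho=rho, sigma=sigma, constant=True, observed=delta)
    trace = pm.sample()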

The PyMC V4 AR distribution indeed includes the constant as the 0th rho. Here is some relevant code from the AR distribution that shows how the constant is handled:

        def step(*args):
            *prev_xs, reversed_rhos, sigma, rng = args
            if constant_term:
                mu = reversed_rhos[-1] + at.sum(prev_xs * reversed_rhos[:-1], axis=0)
            else:
                mu = at.sum(prev_xs * reversed_rhos, axis=0)
            next_rng, new_x = Normal.dist(mu=mu, sigma=sigma, rng=rng).owner.outputs
            return new_x, {rng: next_rng}

You can see in the first branch of the if statement that, when there is a constant term, the last “reversed_rho” (or the first regular rho) is used as the constant value.
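
In other words, with constant=True and rho = [c, \rho_1], each step draws x_t from Normal(c + \rho_1 x_{t-1}, \sigma), so the constant enters the mean of every innovation.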

There is a slight difference between the PyMC implementation and the Statsmodels one, because Statsmodels uses the “Regression with ARIMA errors” formulation, as explained here. Rob Hyndman writes about the difference here. PyMC’s implementation is an example of what Hyndman calls the ARMAX model, where \beta is the constant coefficient, x is a vector of ones, and there are no MA terms. So you have to do a bit of algebra to convert from one to the other.
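
Concretely, for an AR(1) the conversion is the standard mean/intercept identity: the intercept form y_t = c + \rho y_{t-1} + \epsilon_t and the mean-deviation form (y_t - \mu) = \rho (y_{t-1} - \mu) + \epsilon_t describe the same process when \mu = c / (1 - \rho). So the constant PyMC estimates is c, while the constant statsmodels reports is the mean \mu.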

In general it’s advisable to difference away the constant term when working with ARIMA-class models, if you can. For one thing, it avoids these headaches. Just centering and scaling the data will also let you omit the constant term.
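
For instance, a sketch of the centering route (my addition, reusing the V4 API from the sketch above):

# Center the data so the AR model no longer needs a constant term
delta_centered = delta - delta.mean()

with pm.Model() as centered_model:
    rho = pm.Normal("rho", 0, 1, shape=1)
    sigma = pm.HalfNormal("sigma", 1)
    pm.AR("ar", rho=rho, sigma=sigma, constant=False, observed=delta_centered)
    trace = pm.sample()

Keep in mind that delta.mean() is a plug-in point estimate, so the uncertainty about the mean is not propagated into the posterior.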


This is extremely insightful, thanks!

However, in this specific case I wanted a reliable estimate of the mean of the process, and I thought an AR(1) model with a constant term was the way to go.

Do you have any suggestions for a more appropriate approach here?

Using the AR distribution with constant=True is then a fine way to go. Just be cautious in your interpretation, as noted in the blog post I linked.
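
If what you ultimately want is the process mean, one option (a sketch, assuming the constant=True model above) is to transform the posterior draws of rho:

rho_draws = trace.posterior["rho"].values  # shape (chains, draws, 2)
c, rho1 = rho_draws[..., 0], rho_draws[..., 1]
mu = c / (1 - rho1)  # implied stationary mean of the AR(1)
print(mu.mean())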
