How to implement vector autoregression with interdependencies?

Hello, I’m hoping someone can help me implement a vector autoregression in pymc3 that includes interdependencies between time series.

Vector autoregression is described here: https://en.wikipedia.org/wiki/Vector_autoregression

I understand I can use pm.AR to model multiple time series simultaneously; however, it seems the rho parameter is limited to two dimensions, which implies the time series are modeled as independent of one another.

I’d like to set up a model that estimates weights on a series’s own lags as well as the lagged values of other series in the model. For example, an AR(2) model with three interdependent series would require 18 (2 x 3 x 3) rho parameters. It seems rho would need to be a three dimensional tensor.
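For concreteness, here is a small NumPy sketch of the process I have in mind (the variable names and the simulation scaffolding are my own, not part of pymc3). A (p, s, s) coefficient tensor drives the interdependencies, and its size matches the 18-parameter count above:

```python
import numpy as np

rng = np.random.default_rng(0)
p, s, T = 2, 3, 200  # 2 lags, 3 series, 200 time steps

# rho[i, j, k] is the weight of lag-(i+1) of series j in the
# equation for series k, giving p * s * s = 18 coefficients here.
rho = rng.normal(scale=0.1, size=(p, s, s))

y = np.zeros((T, s))
for t in range(p, T):
    # Each series' mean is a weighted sum over all series' lagged values.
    mean = sum(y[t - (i + 1)] @ rho[i] for i in range(p))
    y[t] = mean + rng.normal(scale=0.1, size=s)

print(rho.size)  # 18
```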

What’s the simplest way to implement a model like this?

Thank you!

I think this slight tweak to the AR code might do it. It expects a three-dimensional tensor for rho, where the first dimension indexes the p lags, and the second and third dimensions each have length equal to the number of series being modeled. The constant, if used, must be passed as a separate random variable with length equal to the number of series being modeled.

import numpy as np
import theano.tensor as tt
from pymc3.distributions.continuous import get_tau_sigma, Normal, Flat
from pymc3.distributions import distribution

class VAR(distribution.Continuous):
    """
    Vector Autoregressive process with p lags for s series with estimated interdependencies between series
    
    Parameters
    ----------
    rho : tensor
       shape is [p lags, s series, s_series]
    
    constant: tensor
        shape is s series

    """
    
    @staticmethod
    def as_array_with_shape(x):
        if isinstance(x, list):
            x = np.array(x)
        # For symbolic variables, read the shape from the Theano test value;
        # otherwise fall back to the plain .shape attribute.
        try:
            shape = x.shape.tag.test_value
        except AttributeError:
            shape = x.shape
        return x, shape
        

    def __init__(self, rho, sigma=None, tau=None,
                 constant=False, init=Flat.dist(),
                 sd=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if sd is not None:
            sigma = sd

        tau, sigma = get_tau_sigma(tau=tau, sigma=sigma)
        self.sigma = self.sd = tt.as_tensor_variable(sigma)
        self.tau = tt.as_tensor_variable(tau)

        self.mean = tt.as_tensor_variable(0.)
        
        rho, rho_shape_ = self.as_array_with_shape(rho)
        assert len(rho_shape_) == 3, 'rho must be three dimensional array or RV'
        assert rho_shape_[1] == rho_shape_[2], 'rho must contain same len in dim 2 and 3'
        
        self.p = rho_shape_[0]
        self.nseries = rho_shape_[1]
        
        if constant is not False:
            constant, constant_shape_ = self.as_array_with_shape(constant)
            assert len(constant_shape_) == 1, 'constant must be one dimensional array or RV'
            assert constant_shape_[0] == self.nseries, 'constant must have the same length as rho dims 2 and 3'
            constant = tt.as_tensor_variable(constant)

        self.constant = constant
        self.rho = rho = tt.as_tensor_variable(rho)
        self.init = init

    def logp(self, value):
        """
        Calculate log-probability of VAR distribution at specified value.
        Parameters
        ----------
        value : numeric
            Value for which log-probability is calculated.
        Returns
        -------
        TensorVariable
        """
 
        # For each source series s, accumulate its lagged contributions to
        # every target series, using rho[i, s, :] as the lag-(i+1) weights.
        x_s = []
        for s in range(self.nseries):
            # Replicate series s across nseries columns so it can be
            # weighted by rho[i, s, :] via broadcasting.
            s_stacked = tt.stack([value[:, s] for _ in range(self.nseries)], axis=1)
            x_s.append(tt.add(*[self.rho[i, s, :] * s_stacked[self.p - (i + 1):-(i + 1)]
                                for i in range(self.p)]))
        x = tt.add(*x_s)
        
        if self.constant is not False:
            x += self.constant
            
        eps = value[self.p:] - x

        innov_like = Normal.dist(mu=0.0, tau=self.tau).logp(eps)
        init_like = self.init.logp(value[:self.p])

        return tt.sum(innov_like) + tt.sum(init_like)
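As a sanity check on the lag alignment used in logp, here is the same slicing in plain NumPy (the slicing mirrors the method above; the rest is scaffolding I made up for the check). The vectorized slice `value[p - (i + 1):-(i + 1)]` should pick out exactly the lag-(i+1) values for time steps p through T-1:

```python
import numpy as np

rng = np.random.default_rng(1)
p, s, T = 2, 3, 50
rho = rng.normal(size=(p, s, s))
value = rng.normal(size=(T, s))

# Vectorized prediction using the same slicing as logp:
# lag-(i+1) values for rows p..T-1 are value[p - (i + 1):-(i + 1)]
x = sum(value[p - (i + 1):-(i + 1)] @ rho[i] for i in range(p))

# Naive per-timestep loop for comparison
x_loop = np.array([
    sum(value[t - (i + 1)] @ rho[i] for i in range(p))
    for t in range(p, T)
])

print(np.allclose(x, x_loop))  # True
```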

I have the same question about implementing a Bayesian VAR(p) model using pm.AR. @junpenglao, would you mind providing an example of VAR(p)? It would be greatly appreciated.

I’ve got a bare-bones implementation of a vector-ready AR distribution on my fork. You can see it here: [WIP] VAR update by ckrapu · Pull Request #4666 · pymc-devs/pymc3 · GitHub.

All you would have to do is make sure your rho is three-dimensional, with the last two dimensions equal in length (i.e. square).

Hi,
this looks very interesting. Is there a guide somewhere on how to write model components like this? Is there a standardized template or a set of methods that need to be implemented, and are there any rules to follow when doing it?

I am interested in trying out some similar things, and it would be nice to be able to create components like this that encapsulate the modelling details of other types of multivariate processes.

A