I understand that pm.AR can model multiple time series simultaneously; however, its rho parameter appears to be limited to two dimensions, which implies the series are modeled as independent.
I’d like to set up a model that estimates weights on a series’s own lags as well as on the lagged values of the other series in the model. For example, an AR(2) model with three interdependent series would require 18 (2 x 3 x 3) rho parameters, so rho would need to be a three-dimensional tensor.
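As a concrete sketch of the shape involved (all names here are illustrative, not from any library), the one-step-ahead prediction under such a rho tensor would look like:

```python
import numpy as np

p, s = 2, 3                        # AR(2) with three interdependent series
rng = np.random.default_rng(0)
rho = rng.normal(size=(p, s, s))   # 2 x 3 x 3 = 18 coefficients: [lag, source series, target series]
y = rng.normal(size=(10, s))       # ten observations of the three series

# One-step-ahead prediction at time t: sum contributions over lags,
# with each lagged observation matrix-multiplied by that lag's coefficient matrix
t = 5
pred = sum(y[t - (i + 1)] @ rho[i] for i in range(p))
print(pred.shape)  # one predicted value per series
```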
What’s the simplest way to implement a model like this?
I think this slight tweak to the AR code might do it. It expects a three-dimensional tensor for rho, where the first dimension holds the p lags and the second and third dimensions each have the same length as the number of series being modeled. The constant, if used, must be passed as a separate random variable with length equal to the number of series.
import numpy as np
import theano.tensor as tt
from pymc3.distributions.continuous import get_tau_sigma, Normal, Flat
from pymc3.distributions import distribution


class VAR(distribution.Continuous):
    """
    Vector autoregressive process with p lags for s series, with estimated
    interdependencies between series.

    Parameters
    ----------
    rho : tensor
        shape is [p lags, s series, s series]
    constant : tensor
        shape is [s series]
    """

    @staticmethod
    def as_array_with_shape(x):
        if isinstance(x, list):
            x = np.array(x)
        try:
            shape = x.shape.tag.test_value
        except AttributeError:
            shape = x.shape
        return x, shape

    def __init__(self, rho, sigma=None, tau=None,
                 constant=False, init=Flat.dist(),
                 sd=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        if sd is not None:
            sigma = sd
        tau, sigma = get_tau_sigma(tau=tau, sigma=sigma)
        self.sigma = self.sd = tt.as_tensor_variable(sigma)
        self.tau = tt.as_tensor_variable(tau)
        self.mean = tt.as_tensor_variable(0.)

        rho, rho_shape_ = self.as_array_with_shape(rho)
        assert len(rho_shape_) == 3, 'rho must be a three-dimensional array or RV'
        assert rho_shape_[1] == rho_shape_[2], 'rho must have the same length in dims 2 and 3'
        self.p = rho_shape_[0]
        self.nseries = rho_shape_[1]

        if constant is not False:
            constant, constant_shape_ = self.as_array_with_shape(constant)
            assert len(constant_shape_) == 1, 'constant must be a one-dimensional array or RV'
            assert constant_shape_[0] == self.nseries, 'constant must match the length of rho dims 2 and 3'
            constant = tt.as_tensor_variable(constant)
        self.constant = constant

        self.rho = rho = tt.as_tensor_variable(rho)
        self.init = init

    def logp(self, value):
        """
        Calculate log-probability of VAR distribution at specified value.

        Parameters
        ----------
        value : numeric
            Value for which log-probability is calculated.

        Returns
        -------
        TensorVariable
        """
        # Accumulate the contribution of each lagged source series s to all target series
        x_s = []
        for s in range(self.nseries):
            s_stacked = tt.stack([value[:, s] for i in range(self.nseries)], axis=1)
            x_s.append(tt.add(*[self.rho[i, s, :] * s_stacked[self.p - (i + 1):-(i + 1)]
                                for i in range(self.p)]))
        x = tt.add(*x_s)
        if self.constant is not False:
            x += self.constant
        eps = value[self.p:] - x
        innov_like = Normal.dist(mu=0.0, tau=self.tau).logp(eps)
        init_like = self.init.logp(value[:self.p])
        return tt.sum(innov_like) + tt.sum(init_like)
I have the same question about implementing a Bayesian VAR(p) model using pm.AR. @junpenglao, would you mind providing an example of VAR(p)? That would be greatly appreciated.
Hi,
this looks very interesting. Is there a guide somewhere on how to produce model components like this? Is there a standardized template, or a set of methods that need to be implemented, and are there any rules that need to be followed when doing it?
I am interested in trying out some similar things, and it would be nice to be able to create components like this that encapsulate the modelling details of other types of multivariate processes.