Combining AR/Negative Binomial with Gaussian Random Walk

Hello,

First of all, thank you to everyone who spends their time developing and maintaining this fantastic library! I’ve been trying to implement my first Bayesian model and I’ve gotten stuck. The original idea was to implement the paper cited in this post: Vectorized autoregressive model

However, my data is non-stationary and strongly time-dependent, and it is also divided into categories, so I thought I’d use GaussianRandomWalk as shown here: https://twiecki.io/blog/2017/03/14/random-walk-deep-net/

Here is my best attempt, drawing heavily from those two sources:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as stochastic_model:
    # global hyperprior for the alpha parameter of the Negative Binomial
    alpha = pm.Uniform("alpha", lower=0.001, upper=0.1)
    # attempting to make coefficients that depend on time
    step_size = pm.HalfNormal("step_size", sd=np.ones(df.shape[1]), shape=df.shape[1])
    w = pm.GaussianRandomWalk("w", sd=step_size, shape=(df.shape[0], df.shape[1]))
    # hyperpriors for the precision, tau
    kappa_tau = pm.Uniform("kappa_tau", lower=5, upper=10)
    beta_tau = pm.Uniform("beta_tau", lower=2, upper=25)
    tau_l = pm.Gamma("tau_l", kappa_tau, beta_tau, shape=df.shape[1])
    alpha_l = pm.Exponential("alpha_l", alpha, shape=df.shape[1])
    # AR2d comes from the first link above; it lets pm.AR accept a multidimensional rho
    eta_l = AR2d("eta_l", rho=w, tau=tau_l, constant=False, shape=(df.shape[0], df.shape[1]))
    y_t_l = pm.NegativeBinomial("y_t_l", tt.exp(eta_l), alpha_l, observed=df)

    trace = pm.sample(5000, cores=4)

My dataframe is 101 rows and 5 columns (a time series of 101 observations of count data for 5 different categories), and I hope to bring in more categories eventually. The sampling did not converge (“The chain contains only diverging samples. The model is probably misspecified”) and the acceptance probabilities were rounding errors from 0. I’m trying to understand whether the problem is in the ranges I chose for the hyperpriors, whether the priors should be different distributions, whether I’m getting a shape wrong somewhere, or whether the model is conceptually flawed.
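One thing worth checking before touching the sampler: what the priors above imply about w. Here is a tiny NumPy sketch (hypothetical draws, not your actual model) of the random-walk coefficients under a HalfNormal(sd=1) step size. Once |w| drifts past 1, an AR(1)-style recursion becomes explosive and exp(eta) in the likelihood overflows, which is a classic source of wall-to-wall divergences:

```python
import numpy as np

rng = np.random.default_rng(0)
T, K = 101, 5  # time steps, categories (matching df.shape)

# one hypothetical draw of HalfNormal(sd=1) step sizes, as in the model above
step_size = np.abs(rng.normal(scale=1.0, size=K))

# GaussianRandomWalk coefficients: cumulative sum of Gaussian steps
w = np.cumsum(rng.normal(scale=step_size, size=(T, K)), axis=0)

# An AR(1) recursion eta[t] = w[t] * eta[t-1] + noise is explosive
# wherever |w[t]| > 1, and exp(eta) then overflows in the likelihood.
frac_explosive = np.mean(np.abs(w) > 1.0)
print(f"fraction of coefficients with |w| > 1: {frac_explosive:.2f}")
```

If most of the prior mass on w sits outside (-1, 1), shrinking step_size (or putting the walk on a transformed scale) may be a better first move than retuning the other hyperpriors. pm.sample_prior_predictive in PyMC3 gives the same diagnostic without leaving the model.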

Thanks to anyone who can lend advice! If there’s more information I can provide, please let me know.

I realized the way the AR2d model takes in rho and uses len(rho) as the number of lags to regress on isn’t compatible with how I’m defining the coefficients. Going to try the simple AR1 and see how it works!
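For anyone following along, here’s a small NumPy sketch (hypothetical values) of the mismatch I mean: in the AR construction, rho has one entry per lag, whereas my w has one entry per time step:

```python
import numpy as np

rng = np.random.default_rng(42)
T = 101

# AR interprets rho as lag coefficients: len(rho) == number of lags
rho = np.array([0.5, 0.3])  # an AR(2) process
y = np.zeros(T)
for t in range(2, T):
    y[t] = rho[0] * y[t - 1] + rho[1] * y[t - 2] + rng.normal(scale=0.1)

# What I actually want: one AR(1) coefficient *per time step*,
# with the coefficients themselves following a random walk
phi = 0.5 + np.cumsum(rng.normal(scale=0.02, size=T))
z = np.zeros(T)
for t in range(1, T):
    z[t] = phi[t] * z[t - 1] + rng.normal(scale=0.1)
```

So passing a (T, K)-shaped w as rho makes AR2d treat the time axis as a stack of lags, which is not the intended model.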

Hi Samuel, how did trying AR1 work out? Interested in this, cheers

Still trying to figure it out! I’ll post when I have something interesting to report

At this point, because my model is hierarchical with several levels of priors and hyperpriors, I’m using a GaussianRandomWalk at the lowest level to model an AR1 coefficient, and ‘inheriting’ from the higher levels by adding a draw from a normal distribution to the GRW. If anyone knows a better way for a GRW to inherit from a higher level, I’d be grateful to hear it!
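To make the ‘inheriting’ concrete, here’s a NumPy sketch of the generative structure I mean (all names and values hypothetical): the lowest-level coefficient is the sum of a higher-level draw and a zero-centred random walk, which is essentially the non-centred version of a hierarchical GRW:

```python
import numpy as np

rng = np.random.default_rng(1)
T, K = 101, 5  # time steps, categories

# one hypothetical draw of the higher-level (global) coefficient
mu_global = rng.normal(loc=0.5, scale=0.1)

# per-category random walks around zero: the 'local' deviations
deviations = np.cumsum(rng.normal(scale=0.02, size=(T, K)), axis=0)

# 'inherited' time-varying AR1 coefficients: global level + local walk
phi = mu_global + deviations
```

In PyMC3 the same addition works on random variables directly (e.g. a Normal mu_global plus a zero-centred GaussianRandomWalk), and keeping the walk zero-centred tends to sample better than centring the walk on mu_global itself.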