Multivariate Time Series very slow with sampling

Hi, I was hoping for some help figuring out why this model is so slow. It only samples about 2 draws per second, so it takes a very long time to generate anything useful.

Here is the code. There are 30 teams, each with an index in the dataset, and the observed values are integer valued. This is supposed to be a state space model, which is why the latent variable is modeled as an AR(1) (though because it’s multivariate I had to use AR rather than AR1).

import pymc3 as pm

# fg3m is the game-level DataFrame (team indices, game numbers, observed made threes)
with pm.Model() as fg3m_AR:
    # Per-team AR coefficients for offense and defense (30 teams)
    comb_off_coef = pm.Normal('comb_off_coef', mu=1, sigma=20, shape=(1, 30))
    comb_def_coef = pm.Normal('comb_def_coef', mu=1, sigma=20, shape=(1, 30))

    # Latent offensive/defensive strength over 81 games, constrained to be non-negative
    comb_off_ar1 = pm.Bound(pm.AR, lower=0.0)('comb_off_ar1', rho=comb_off_coef, sigma=10, shape=(81, 30))
    comb_def_ar1 = pm.Bound(pm.AR, lower=0.0)('comb_def_ar1', rho=comb_def_coef, sigma=10, shape=(81, 30))

    # Per-team observation noise
    comb_sd_star = pm.Gamma('comb_sd_star', alpha=1/3, beta=1/9, shape=30)

    # Expected made threes for the home and away team in each game
    home_theta = pm.Deterministic('home_theta', comb_off_ar1[fg3m.GAME_NUMBER_x.values, fg3m.i_home.values]
                                  + comb_def_ar1[fg3m.GAME_NUMBER_y.values, fg3m.i_away.values])

    away_theta = pm.Deterministic('away_theta', comb_off_ar1[fg3m.GAME_NUMBER_y.values, fg3m.i_away.values]
                                  + comb_def_ar1[fg3m.GAME_NUMBER_x.values, fg3m.i_home.values])

    y_home = pm.Normal('y_home', mu=home_theta, sd=comb_sd_star[fg3m.i_home.values], observed=fg3m.FG3M_x.values)
    y_away = pm.Normal('y_away', mu=away_theta, sd=comb_sd_star[fg3m.i_away.values], observed=fg3m.FG3M_y.values)

Hey, Aditya_Kaushik. A full explanation would take a while, but I used to wonder the same thing myself, so I will save you a lot of time by just letting you know that PyMC3 does not handle time series very well at the moment.

There is no inherent reason for this, other than that (A) reparametrizing time series efficiently is a difficult thing, and I believe Stan has much better time-series functionality largely because it handles this specific aspect better; and (B) the samplers in use were not built for time series to begin with (unlike SMC, which PyMC3 recently incorporated, but which still needs more work to handle time series as well as it hypothetically could). Reparametrizing is only a stop-gap for not having a dedicated sampler.
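To make the reparametrization point concrete, here is a minimal sketch of the non-centered approach for a latent series, using a Gaussian random walk (the rho = 1 special case of an AR(1)) so it can be vectorized with a cumulative sum. The shapes match the 81-game, 30-team setup above, but the names and priors are illustrative, not taken from your model.

import pymc3 as pm
import theano.tensor as tt

T, n_teams = 81, 30  # games x teams, as in the model above

with pm.Model() as noncentered_walk:
    sigma = pm.HalfNormal('sigma', 1.0)

    # Non-centered parameterization: sample standardized innovations...
    eps = pm.Normal('eps', mu=0.0, sigma=1.0, shape=(T, n_teams))

    # ...and build the latent random walk deterministically from them.
    latent = pm.Deterministic('latent', tt.cumsum(sigma * eps, axis=0))

For a general AR(1) coefficient you would still need a scan (or pm.AR itself), which is part of why this is awkward in PyMC3 right now.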

These are just some things I picked up over time, so I may be wrong, and I hope other users/developers can lend their opinions on the topic ^^; .

There are other threads on this forum, such as this one, that report similar findings. I wish you the best of luck!


Hi Gon_F, thank you very much for this response. You have saved me a lot of future headache. Would you recommend Stan in Python/R for the type of analysis I want to perform, then? Or are there other libraries that you think are better?

For Bayesian time-series analysis, I would really only recommend Stan at the moment; there is not much else that is as flexible and also fast enough to actually work. Stan takes more effort to work in, but there are enough examples of what you want to accomplish.

I haven’t actually played with the time series components in Stan; do they have some special API?

Otherwise, Bayesian state space modeling in TensorFlow Probability is really good in terms of design (you can express almost every kind of time series with it); however, it is currently a bit slow for large time series.
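For reference, the structural time series module in question is tfp.sts. A minimal sketch of the kind of setup it allows might look like the following, where the observed series and the choice of a single AR(1) component are purely illustrative.

import numpy as np
import tensorflow_probability as tfp

# Illustrative observed series, e.g. one team's made threes per game.
observed = np.random.poisson(lam=12.0, size=82).astype(np.float32)

# One AR(1) structural component; tfp.sts lets you sum several such components.
ar1 = tfp.sts.Autoregressive(order=1, observed_time_series=observed, name='ar1')
model = tfp.sts.Sum([ar1], observed_time_series=observed)

# Fit with HMC (this is the part that can be slow for long series, as noted above).
samples, kernel_results = tfp.sts.fit_with_hmc(model, observed_time_series=observed)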

Stan does not have any special API for them, but from what I’ve read, people seem to get better results with it for more complex models.

Not to hijack the thread here, but I wonder how much faster this would be if you were to write it in Stan. To me the problem is more the priors: more informative priors would likely help.
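As an illustration of what tighter priors could look like in the original model, something like the sketch below would be a starting point; the specific values are just a guess to show the idea, not a recommendation fitted to the data.

with pm.Model() as fg3m_AR_tighter:
    # AR coefficients concentrated near stationarity instead of Normal(1, 20)
    comb_off_coef = pm.Normal('comb_off_coef', mu=0.5, sigma=0.5, shape=(1, 30))
    comb_def_coef = pm.Normal('comb_def_coef', mu=0.5, sigma=0.5, shape=(1, 30))

    # Much smaller innovation scale than sigma=10 for per-game changes in team strength
    comb_off_ar1 = pm.Bound(pm.AR, lower=0.0)('comb_off_ar1', rho=comb_off_coef, sigma=1.0, shape=(81, 30))
    comb_def_ar1 = pm.Bound(pm.AR, lower=0.0)('comb_def_ar1', rho=comb_def_coef, sigma=1.0, shape=(81, 30))
    # ... rest of the model as above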