Hello.
I’m running a model on a fairly large dataset. I recently started a job with unlimited use of Google Cloud Platform and assumed that choosing a higher-performing CPU and more memory would noticeably speed up sampling.
So far I haven’t seen any speedup. The data I’m sampling has around 500,000 observations. It’s a time series of daily data covering five years, with multiple items forecasted per year.
I’m currently using a 56-core, 112 GB RAM setup with no GPU, as I’ve never run PyMC with NumPyro/JAX on a GPU. Would a GPU help?
Do any of you have suggestions to try?
Thanks.
For reference, here is my model and the versions of the software I’m using:

    import aesara.tensor as at
    import pymc as pm
    import pymc.sampling_jax  # the JAX samplers live in a submodule that must be imported explicitly

    with pm.Model(coords = coords) as model:
        # item_idx = pm.Data('item_idx', items, dims = "obs_id", mutable = False)

        # Piecewise-linear trend: base growth k, base offset m, changepoint adjustments delta
        k = pm.Normal('k', 0, 1)
        m = pm.Normal('m', 0, 5)
        delta = pm.Laplace('delta', 0, 0.1, shape = n_changepoints)

        growth = k + at.dot(A, delta)
        offset = m + at.dot(A, -s * delta)
        trend = growth * t + offset

        # Seasonality terms (currently disabled)
        # beta_weekly = pm.Normal('beta_weekly_seasonality', 0, 1, shape = weekly_n_components * 2)
        # seasonality_weekly = at.dot(fourier(t, p = 7), beta_weekly)
        # beta_monthly = pm.Normal('beta_monthly_seasonality', 0, 1, shape = monthly_n_components * 2)
        # seasonality_monthly = at.dot(fourier(t, p = 30.5), beta_monthly)
        # beta_yearly = pm.Normal('beta_yearly_seasonality', 0, 1, shape = yearly_n_components * 2)
        # seasonality_yearly = at.dot(fourier(t, p = 365.25), beta_yearly)

        # Observation noise and likelihood
        error = pm.HalfCauchy('sigma', .5)
        pm.Normal("predicted_sales",
                  trend,
                  error,
                  observed = train_y)

        # Sample inside the model context so the sampler can find the model
        trace = pymc.sampling_jax.sample_numpyro_nuts(tune=1000, chains = 4)

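
The model above uses `t`, `s`, `A`, and `n_changepoints` without showing how they are defined. As a rough sketch only, this is how such inputs are commonly built for a Prophet-style piecewise-linear trend, written in plain NumPy so the trend expression can be checked outside the sampler. The helper names `make_trend_inputs` and `piecewise_linear_trend`, the `changepoint_range` parameter, the single-series scaled time axis, and the example values of `k`, `m`, and `delta` are all assumptions for illustration, not the inputs actually used in this post.

```python
import numpy as np

def make_trend_inputs(n_obs, n_changepoints, changepoint_range=0.8):
    """Hypothetical helper: build scaled time t, changepoints s, and matrix A."""
    # Scaled time in [0, 1], one entry per observation (assumes a single sorted series).
    t = np.linspace(0.0, 1.0, n_obs)
    # Changepoints evenly spaced over the first `changepoint_range` of the series.
    s = np.linspace(0.0, changepoint_range, n_changepoints + 1)[1:]
    # A[i, j] is 1.0 when observation i lies after changepoint j, else 0.0.
    A = (t[:, None] > s[None, :]).astype(float)
    return t, s, A

def piecewise_linear_trend(t, s, A, k, m, delta):
    """NumPy version of the growth/offset/trend expressions in the model."""
    growth = k + A @ delta          # slope in effect at each observation
    offset = m + A @ (-s * delta)   # intercept correction that keeps the trend continuous
    return growth * t + offset

# Example values (assumed, for illustration only).
t, s, A = make_trend_inputs(n_obs=1000, n_changepoints=5)
delta = np.array([0.5, -0.2, 0.0, 0.3, -0.1])
trend = piecewise_linear_trend(t, s, A, k=1.0, m=2.0, delta=delta)
```

The `-s * delta` term in the offset is what keeps the trend continuous: each slope change at a changepoint is matched by an intercept shift, so the line bends instead of jumping.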
- PyMC/PyMC3 Version: 4.0.0b6
- Aesara/Theano Version: 2.5.1
- Python Version: 3.7.12
- Operating system: Debian via Google Cloud Platform
- How did you install PyMC/PyMC3: pip install pymc --pre (Installation Guide (Linux) · pymc-devs/pymc Wiki · GitHub)