Hello everyone, I have some questions concerning the implementation of pymc3 in statistical forecasting and numerical integration. Here is the problem:
I want to construct a hierarchical model with Chinese provinces as different levels, and also with some numerical integration during the modeling process, I followed the original radon example on https://docs.pymc.io/notebooks/GLM-hierarchical.html?highlight=radon, but I got stuck on two problems: the first is forecasting and the latter is integration. Following the forecast example on https://docs.pymc.io/notebooks/multilevel_modeling.html?highlight=radon, I hard-coded the pymc3 model with a for loop running through different provinces of China, I was able to sample the model but the whole process take ages to complete, and I donât know how to get around the problem. I know it is not good to use for loop in pymc3 model, but I cannot find another way to deal with the problem. Second, I also want to embed my model with a numerical integration process, I donât know how it could be done on pymc3, I have tried scipy.integrate for the sake of integration, the integration equation works fine. I am stuck. Really want to ask for your insights on the problems.
Thank you very much for the help.
P.S. I also attached the pm.model codes below:
with pm.Model() as model_logistic:
sigma = pm.Bound(pm.HalfNormal, lower=0.0)('sigma', sigma=5) # pm.HalfNormal('sigma', sigma=5)
alpha = pm.Bound(pm.Normal, lower=0.0)('alpha', mu=1, sigma=1,shape=len(np.unique(indexer))) # pm.Normal('alpha',mu=1, sigma=1) # 1,1 # make boundary: https://docs.pymc.io/api/bounds.html
beta = pm.Bound(pm.Normal, lower=0.0)('beta', mu=1, sigma=1,shape=len(np.unique(indexer))) #pm.Normal('beta',mu=1, sigma=1) # 1,1
t0 = pm.Bound(pm.Normal, lower=0.0)('t0', mu=10, sigma=10,shape=len(np.unique(indexer))) # pm.Normal('t0',mu=10, sigma=10) #normal(10,10);
mu = alpha[indexer] / (1 + pm.math.exp(-(beta[indexer]*(df_res[~df_res.index.isin(t_pred_idx)].t.values-t0[indexer]))))
# observed stochastic
Y_obs = pm.Normal('Y_obs', mu=mu, sigma=sigma, observed=df_res[~df_res.index.isin(t_pred_idx)].y.values)
# ------------- The above codes work for me ------------------------------ $
# prediction
pred_set=df_res.loc[t_pred_idx]
But I also want to do some forecast, based on the sampled parameters in the model
But the model is hierarchical model, so I set different level for parameters using [indexer] to constraint the part of model only on the specific level of the data.
But without a loop, the pm.Normal() could not run.
with a loop, the process is too slowâŚ
for i in np.unique(indexer): # this is slow --> https://discourse.pymc.io/t/very-parallel-mcmc-sampling/3747
Y_pred_ls = pm.Normal('Y_pred_p'+str(i),\
mu=alpha[i] / (1 + pm.math.exp(-(beta[i]*(pred_set.loc[pred_set[prov_field+'_encoded']==indexer[i]].t.values-t0[i])))) ,\
sigma=sigma,\
shape=len(pred_set.loc[pred_set[prov_field+'_encoded']==indexer[i]].t) ) # use shape instead of a for-loop: https://docs.pymc.io/notebooks/api_quickstart.html