I’m back to working with pymc3 after a couple of years without a chance to use it, which is great. But I’m still a noob when it comes to this, and I feel that I get stuck on the same issues I had when I last used it: most models that are more advanced than the basic tutorial examples, when fitted to real data, have some issue with sampling (divergent transitions, a low number of effective samples, “acceptance probability does not match…”). For hierarchical GLMs the non-centered parameterisation is awesome and solves a lot of problems, but unfortunately not all of them.
So my first question is a general one: how do you go about fixing sampling issues? Are there good tutorials on how to work around common issues in common models? I’ve found quite a lot of material explaining which pathologies might occur while sampling, but apart from the re-parameterisation mentioned above, not a lot on how to actually fix them. Maybe that is a sign that it is not possible to give general advice?
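To be concrete about the re-parameterisation I mean, here is a minimal plain-Python sketch (not PyMC3 code; the function names and the values of mu and sigma are made up for illustration). Instead of drawing a parameter directly from Normal(mu, sigma), you draw a standard normal z and set theta = mu + sigma * z, which has the same distribution but usually gives the sampler an easier geometry:

```python
import random
import statistics


def centered_draws(mu, sigma, n, rng):
    """Centered form: draw theta ~ Normal(mu, sigma) directly."""
    return [rng.gauss(mu, sigma) for _ in range(n)]


def non_centered_draws(mu, sigma, n, rng):
    """Non-centered form: draw z ~ Normal(0, 1), then shift and scale."""
    return [mu + sigma * rng.gauss(0.0, 1.0) for _ in range(n)]


rng = random.Random(42)
mu, sigma = 1.5, 0.3  # hypothetical group-level mean and scale

centered = centered_draws(mu, sigma, 20_000, rng)
non_centered = non_centered_draws(mu, sigma, 20_000, rng)

# Both forms should give roughly the same sample mean and standard deviation.
print(statistics.mean(centered), statistics.stdev(centered))
print(statistics.mean(non_centered), statistics.stdev(non_centered))
```

In a real model the sampler explores z, mu and sigma rather than theta itself, which is where the benefit comes from; the sketch only shows that the two forms describe the same distribution.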
To get more specific, I’m currently trying to implement the rolling regression model (https://docs.pymc.io/notebooks/GLM-rolling-regression.html) on my own data. The model looks like this:
import pymc3 as pm

# df is assumed to be a pandas DataFrame with columns 'za' (predictor) and 'zs' (response)
with pm.Model() as model_randomwalk:
    # Step sizes of the two random walks
    sigma_alpha = pm.Exponential('sigma_alpha', lam=5.)
    sigma_beta = pm.Exponential('sigma_beta', lam=5.)
    # Time-varying intercept and slope, one value per row of df
    alpha = pm.GaussianRandomWalk('alpha', sigma=sigma_alpha, shape=len(df))
    beta = pm.GaussianRandomWalk('beta', sigma=sigma_beta, shape=len(df))
    # Observation noise
    sigma = pm.HalfNormal('sigma', sigma=1)
    regression = alpha + beta * df['za']
    likelihood = pm.Normal('y', mu=regression, sigma=sigma, observed=df['zs'])
    trace_rw = pm.sample(5000, tune=5000, target_accept=.99)
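Written out, and assuming I am reading the PyMC3 parameterisation correctly (Exponential takes a rate lam, and the sigma arguments are standard deviations), the model above corresponds roughly to the following, with t indexing the rows of df. The distribution of the first step of each random walk is left at the library default and is not written out here:

```latex
\begin{aligned}
\sigma_\alpha,\ \sigma_\beta &\sim \mathrm{Exponential}(\lambda = 5) \\
\sigma &\sim \mathrm{HalfNormal}(1) \\
\alpha_t &\sim \mathcal{N}(\alpha_{t-1},\ \sigma_\alpha^2), \qquad t = 2, \dots, T \\
\beta_t &\sim \mathcal{N}(\beta_{t-1},\ \sigma_\beta^2), \qquad t = 2, \dots, T \\
\mathrm{zs}_t &\sim \mathcal{N}(\alpha_t + \beta_t\, \mathrm{za}_t,\ \sigma^2), \qquad t = 1, \dots, T
\end{aligned}
```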
The data is here if anyone wants to play with it: https://drive.google.com/file/d/1KIEk5NrK7jR5q-uR-tcQExJGxcd_FfCO/view?usp=sharing
My sigma_alpha and sigma_beta parameters do not seem to sample well (fewer than 100 effective samples, etc.). I’ve tried different numbers of (tuning) steps, different target_accept values and different priors, but nothing seems to help. Any tips for what I can do to fix this? Or should I accept that, given this data, this model is not possible to fit?