I have built a third-order polynomial model.
Inspecting the trace plot with ArviZ, three of my parameters are not stabilizing: the traces are NOT mixing well.
I am too inexperienced to understand why this is happening and what I can do about it.
My model structure and priors are as follows.
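import pymc3 as pm  # the `sd=` keyword below is PyMC3 syntax

# x and y are the observed predictor and outcome arrays (not shown here)
with pm.Model() as model_poly:
    # Priors on the polynomial coefficients and the noise scale
    α = pm.Normal('α', mu=0, sd=1)
    β1 = pm.Normal('β1', mu=30, sd=5)
    β2 = pm.Normal('β2', mu=0.7, sd=1)
    β3 = pm.Normal('β3', mu=0.003, sd=1)
    ϵ = pm.HalfCauchy('ϵ', 5)
    # Likelihood --- how we think the data is generated
    # Note: μ is deterministic
    μ = α + β1*x + β2*x**2 + β3*x**3
    # This part is stochastic
    Y_pred = pm.Normal('Y_pred', mu=μ, sd=ϵ, observed=y)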
Interestingly though, the model fit is quite good, at least to the naked eye.
Appreciate any suggestions.
Your priors here are odd. Do you have a reason to expect that your 1st order term has a mean of 30 and your 2nd order term has a mean of 0.7? I would maybe try setting the mean of each to 0. Judging by the values here, I would guess that you haven’t standardized your predictors and outcome, and I would suggest doing so. Once you have, a standard deviation of 2 on each prior should probably be sufficient.
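To make the standardizing suggestion concrete, here is a minimal sketch in plain NumPy. The data are made-up stand-ins (the real x and y are not shown in this thread), and the helper names standardize and poly_corr are hypothetical. It also prints how correlated x, x**2 and x**3 are before and after. On a raw, all-positive x the polynomial terms are usually close to collinear, which is one common reason for poor mixing, though it is not guaranteed to be the cause here.

```python
import numpy as np

rng = np.random.default_rng(0)

# Made-up stand-in data: the thread does not show the real x and y.
x = rng.uniform(0, 100, size=200)
y = 5 + 30 * x + 0.7 * x**2 + 0.003 * x**3 + rng.normal(0, 500, size=200)


def standardize(a):
    """Center to mean 0 and scale to standard deviation 1."""
    return (a - a.mean()) / a.std()


def poly_corr(v):
    """Correlation matrix of the three polynomial terms v, v**2, v**3."""
    return np.corrcoef(np.vstack([v, v**2, v**3]))


x_s = standardize(x)
y_s = standardize(y)

raw_corr = poly_corr(x)    # terms built from the raw predictor
std_corr = poly_corr(x_s)  # terms built from the standardized predictor

print("raw x:\n", raw_corr.round(2))
print("standardized x:\n", std_corr.round(2))
```

The model would then be fit on x_s and y_s with zero-mean priors. Centering mostly removes the correlation between the linear and quadratic terms; the linear and cubic terms stay correlated, so this helps but is not a complete fix.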
Agree on the priors seeming strange. What is the motivation for setting these values?
One thing that could be happening is that the data is not informative enough and you are mostly getting back the prior, at least for some parameters. You can use az.plot_dist_comparison to easily compare the prior and posterior, for example.
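A rough sketch of that comparison, assuming the 0.x ArviZ API that goes with PyMC3. The draws below are synthetic stand-ins so the snippet runs on its own; they are not results from the model in this thread. With the real model you would build the InferenceData from your own samples instead, for example with pm.sample_prior_predictive() and az.from_pymc3(trace=..., prior=...).

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs without a display

import arviz as az
import numpy as np

rng = np.random.default_rng(1)

# Synthetic stand-ins, shaped (chain, draw). NOT output from the real model.
prior_draws = rng.normal(30, 5, size=(1, 1000))
posterior_draws = rng.normal(29, 4.5, size=(4, 500))

# InferenceData needs both a prior and a posterior group for this plot.
idata = az.from_dict(
    posterior={"β1": posterior_draws},
    prior={"β1": prior_draws},
)

# kind="latent" compares the prior and posterior groups for the named variable.
axes = az.plot_dist_comparison(idata, kind="latent", var_names=["β1"])
```

If the posterior panel looks almost identical to the prior panel, as it would for these made-up draws, the data are telling you little about that parameter.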
The posterior sigma is very high, ~2700. Do you have any outliers? If so you could try a StudentT rather than Normal likelihood.
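To illustrate why a StudentT likelihood tolerates outliers better, here is a small standard-library sketch comparing the log-density each likelihood gives to residuals of increasing size. The values nu = 4 and sigma = 1 are arbitrary choices for illustration, not recommendations for this dataset.

```python
import math


def normal_logpdf(y, mu, sigma):
    """Log-density of Normal(mu, sigma) at y."""
    z = (y - mu) / sigma
    return -0.5 * math.log(2 * math.pi) - math.log(sigma) - 0.5 * z * z


def student_t_logpdf(y, mu, sigma, nu):
    """Log-density of a location-scale Student-t with nu degrees of freedom."""
    z = (y - mu) / sigma
    return (
        math.lgamma((nu + 1) / 2)
        - math.lgamma(nu / 2)
        - 0.5 * math.log(nu * math.pi)
        - math.log(sigma)
        - (nu + 1) / 2 * math.log1p(z * z / nu)
    )


sigma = 1.0  # arbitrary scale for illustration
nu = 4       # arbitrary degrees of freedom for illustration

for resid in (0.5, 2.0, 5.0, 10.0):
    n_lp = normal_logpdf(resid, 0.0, sigma)
    t_lp = student_t_logpdf(resid, 0.0, sigma, nu)
    print(f"residual={resid:5.1f}  normal={n_lp:9.3f}  student_t={t_lp:9.3f}")
```

The Normal log-density falls off quadratically with the residual, while the StudentT one falls off only logarithmically, so a few extreme points pull the fit (and inflate sigma) much less. In the model above, the swap would mean replacing pm.Normal for Y_pred with pm.StudentT and supplying a nu argument.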
Thank you very much for your suggestions.
I have not standardized or centered my data or done any pre-treatment. I will try that next.
Priors are ‘strange’ because I was experimenting. I could see that although the traces were not mixing, they were tending towards the same area … so in an iterative manner I tried setting my priors to that area after each attempt to fit the model. Is that bad practice? (Should I go and stand in the ‘naughty corner’, and hang my head in shame? )
I have not looked at plot_dist_comparison before, so will investigate further.
The posterior sigma is actually quite small in relation to the overall scale of the data … but I will experiment with a StudentT likelihood.
But thank you all again for your suggestions and the quick replies !!