However, although the observed values are 100 times bigger than the prior mean, the posterior looks identical to the prior. This would not happen if, for example, the prior mean were 0 and the observed values were 100.
How do I deal with this small-values issue? I have tried centering the prior, without any result.
hi @nkaimcaudle,
Thank you for getting back to me. I have tried exactly that, but the inference still does not make sense to me. With the same code as above, with this "centering" of the distribution from prior to prior_raw, I get the following result:
What troubles me is that I believe the updated posterior mean should be close to that of the evidence, since the evidence is constant. I may need to scale everything up a lot and then scale it back down afterwards.
The problem is that you are assuming the std of the likelihood is 1 (the default value).
You are basically saying that you know exactly how precise your measurements are, and that they are incredibly imprecise compared to the mean, so they add no noticeable amount of information.
By the way, this very simple model can be solved exactly, with no sampler involved. See the first row of this table:
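For a Normal likelihood with known variance, that row gives a closed-form posterior: precisions add, and means are precision-weighted. A minimal sketch with made-up numbers at the scales discussed here (sigma, the known measurement sd, is an assumption):

```python
import numpy as np

mu0, sd0 = 1e-8, 1e-6        # prior mean and sd
sigma = 1e-7                 # assumed known measurement sd
obs = np.full(10, 1e-6)      # observations ~100x the prior mean

# conjugate update: precisions add, means are precision-weighted
post_prec = 1 / sd0**2 + len(obs) / sigma**2
post_mean = (mu0 / sd0**2 + obs.sum() / sigma**2) / post_prec
post_sd = post_prec ** -0.5
print(post_mean, post_sd)    # mean is pulled almost all the way to 1e-6
```

Set sigma = 1 in that sketch and the data term len(obs) / sigma**2 becomes tiny next to the prior precision 1 / sd0**2, so the posterior barely moves, which is exactly the behaviour you are seeing.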
I don't think prior_std = pm.Deterministic("std", prior_std_raw + std) is right; you'll end up with values much bigger than 1e-6 for the SD. I think I'd set it up something like this:
It's set up in a Bayesian way with priors for the parameters as well as for the model itself; I'm not sure if that's what you were trying to do in the last example.
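Something along these lines (a sketch pieced together from the variable names used later in the thread; the hyperprior scales and the placeholder data are assumptions):

```python
import numpy as np
import pymc3 as pm

obs = np.full(10, 1e-6)  # placeholder data at the scale discussed

with pm.Model() as model:
    # hyperpriors on the prior's parameters (scales are assumptions)
    prior_mean_scaled = pm.Normal("prior_mean_scaled", mu=1e-8, sigma=1e-8)
    prior_sd_scaled = pm.HalfNormal("prior_sd_scaled", sigma=1e-6)

    # the quantity of interest
    prior = pm.Normal("prior", mu=prior_mean_scaled, sigma=prior_sd_scaled)

    # note prior_sd_scaled doing double duty as the likelihood sd here
    likelihood = pm.Normal("likelihood", mu=prior, sigma=prior_sd_scaled,
                           observed=obs)

    trace = pm.sample()
```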
Thank you for getting back to me!
I tried running your example, and there were a lot of divergences. I also believe the issue of having a likelihood std of 1, which @aseyboldt identified as the main problem, is still present in this solution.
Ah, sorry, I copied a line across incorrectly - it was meant to be prior = pm.Normal('prior', prior_mean_scaled, prior_sd_scaled). However, I corrected a scaling error after I initially ran it, and trying it again now I'm getting chain failures.
I've incorporated @nkaimcaudle's suggestion, and it looks like it's working properly now.
Hmm, good question. Might depend on what you want to do.
I don't think it matters for pm.sample, because the likelihood takes the observations as input and doesn't appear in the trace.
If you want to use pm.sample_posterior_predictive then it looks like it might. Taking the mean of the posterior['likelihood'] samples gives ~0.0088, which is way too high. As you noted earlier, prior_sd_scaled is already used, so it might be best to define an extra prior. I just experimented a bit with a large SD and it looks like the posterior is updated for the likelihood prior even if it only appears in the likelihood (I wasn't sure if that worked, seeing as the initial samples are from the observations).
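For reference, this is roughly what I ran (a sketch; the model/trace names and var_names are assumed):

```python
# draw posterior predictive samples for the likelihood node
with model:
    ppc = pm.sample_posterior_predictive(trace, var_names=["likelihood"])

# the mean of these samples was ~0.0088 in the run described above
print(ppc["likelihood"].mean())
```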
This seems to me like the exact same model, just with less freedom due to the less hierarchical construction. However, the sd and mean of the posterior (former prior) are not noticeably affected by the sampling.
I'm still struggling to see exactly how the observed argument changes the distribution into a likelihood. What does the RV you named likely_prior do? It is passed as the SD argument, but what is it saying?
That way, the likelihood's standard deviation should be inferred by MCMC.
Also, in @Elenchus' last suggestion, I think that likely_prior = pm.Normal('likely_sd', 1) should actually be likely_prior = pm.HalfNormal('likely_sd', 1); otherwise you can end up with negative values for a standard deviation parameter.
No worries, I feel like I'm learning a lot too. These are good questions.
Good catch, thanks @AlexAndorra - all the standard deviation priors should be half-normal, possibly centred at 0 instead of 1.
I thought that with @nkaimcaudle's suggestion the problem was what @aseyboldt mentioned: the relative size of the standard deviation in the prior - my model has a small SD in the prior. Looking at @AlexAndorra's suggestion, though, the SD is in the likelihood and not the prior. What's the difference in how these affect the model, Alex? I gather it's needed in the likelihood for posterior predictive checks, but shouldn't it be specified in the prior as well?
For your question about the observed argument: the likelihood is P(E|H), the probability of the evidence given the hypothesis, which is how the observations get combined with the prior.
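In symbols:

$$P(H \mid E) = \frac{P(E \mid H)\,P(H)}{P(E)} \;\propto\; \underbrace{P(E \mid H)}_{\text{likelihood}}\;\underbrace{P(H)}_{\text{prior}}$$

Passing observed=obs is what pins E to your data: the pm.Normal stops being a sampled variable and instead contributes that P(E|H) term to the model's log-probability.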
If prior_sd_scaled has a mean of 1e-6, adding it to prior_mean_scaled will shift the mean to 1e-6 + 1e-8, which is no longer the desired prior mean of 1e-8. Also, the SD of the sum of the two distributions will no longer be that of the initial prior…
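For independent normals, the means and the variances add:

$$X \sim \mathcal{N}(\mu_1, \sigma_1^2),\quad Y \sim \mathcal{N}(\mu_2, \sigma_2^2) \;\Rightarrow\; X + Y \sim \mathcal{N}\!\left(\mu_1 + \mu_2,\; \sigma_1^2 + \sigma_2^2\right)$$

so with $\mu_1 = 10^{-6}$ and $\mu_2 = 10^{-8}$ the mean shifts as above, and the sd is inflated too.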
The way I proposed adds a prior on the sigma in the Normal likelihood, so this parameter will be updated from data during sampling.
Otherwise, I seem to remember that just specifying pm.Normal('likelihood', mu=prior, observed=obs) will set sigma=1 by default in PyMC3, i.e. the sigma in the Normal likelihood is assumed to be known and equal to 1.
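Concretely, the difference is something like this (a sketch; the prior scales and the HalfNormal scale are assumptions):

```python
import numpy as np
import pymc3 as pm

obs = np.full(10, 1e-6)

with pm.Model():
    prior = pm.Normal("prior", mu=1e-8, sigma=1e-6)

    # implicit sigma=1 -- hopelessly wide at the 1e-8 scale:
    # pm.Normal("likelihood", mu=prior, observed=obs)

    # explicit prior on sigma, so it is updated from the data:
    sigma = pm.HalfNormal("sigma", sigma=1e-6)  # scale is an assumption
    pm.Normal("likelihood", mu=prior, sigma=sigma, observed=obs)

    trace = pm.sample()
```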