In the tutorial: Introductory Overview of PyMC — PyMC dev documentation
then we run MCMC:
The MCMC says beta_1 mean is 1.82 with SE .01 when the true value is 2.5. Isn’t this kinda bad? Or am I misunderstanding something here?
This results table is a bit misleading. The notebook rapid-fire fits the model twice, using two sampling algorithms: first NUTS (the recommended default, which incorporates gradient information), then Slice, a more naive sampler. The NUTS results are never shown, and the table you quoted comes only from the Slice run.
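For reference, here's a minimal sketch of the two fits (parameter names, priors, and true values paraphrased from the tutorial rather than copied, so treat the details as approximate):

import numpy as np
import pymc as pm

rng = np.random.default_rng(42)  # arbitrary seed

# simulate data from the tutorial's linear model: true alpha = 1, beta = [1, 2.5], sigma = 1
size = 100
X1 = rng.normal(size=size)
X2 = rng.normal(size=size) * 0.2
Y = 1.0 + 1.0 * X1 + 2.5 * X2 + rng.normal(size=size)

with pm.Model():
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=10, shape=2)
    sigma = pm.HalfNormal("sigma", sigma=1)
    pm.Normal("Y_obs", mu=alpha + beta[0] * X1 + beta[1] * X2, sigma=sigma, observed=Y)

    idata_nuts = pm.sample()                  # first fit: NUTS, the default sampler
    idata_slice = pm.sample(step=pm.Slice())  # second fit: the Slice sampler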
If you call az.summary
on the NUTS result, you get this table, which I hope you agree is much nicer:
mean sd hdi_3% hdi_97% mcse_mean mcse_sd ess_bulk ess_tail r_hat
alpha 1.154 0.102 0.962 1.346 0.001 0.001 6225.0 3597.0 1.0
beta[0] 1.066 0.112 0.863 1.278 0.001 0.001 5674.0 3407.0 1.0
beta[1] 2.527 0.512 1.556 3.494 0.007 0.005 6189.0 2929.0 1.0
sigma 1.009 0.074 0.888 1.161 0.001 0.001 5901.0 2897.0 1.0
gotcha.
still though, aren't the SEs way too low? e.g. alpha is supposed to be 1, but here it says 1.154 with an SE of .001. I'm not familiar with how the SE is computed here, but surely we shouldn't be 154 SEs off on a simple simulated dataset?
Standard deviations are computed using the usual sample estimator from posterior samples.
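In code terms (reusing the hypothetical idata_nuts object from the sketch above), the sd column is essentially:

# pool the posterior draws for alpha across all chains and take their standard deviation
draws = idata_nuts.posterior["alpha"].values.ravel()
print(draws.std())  # roughly 0.10, matching the sd column in the summary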
It might be instructive to sample from the data generating process and fit the model many times. Plenty of combinations of these four parameters could have led to the same dataset, so it shouldn't be surprising that we still have some uncertainty about the true parameters. You should find that the true value lands inside the 94% HDI about 94% of the time, but the mean of the estimates will shift around due to sampling error.
You can also experiment with increasing/decreasing the number of data points you fit on to see how that impacts the estimates. This example shows how to do this, and would be a good place to start for building intuition.
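To make that concrete, here's a rough sketch of such a repeated-simulation (coverage) experiment; it's illustrative rather than taken from the tutorial, and it reuses the assumed data-generating setup from the sketch above:

import arviz as az
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
size, n_reps = 100, 20  # n_reps kept small since each rep runs a full MCMC fit
covered = 0

for _ in range(n_reps):
    # draw a fresh dataset from the same data generating process (true alpha = 1)
    X1 = rng.normal(size=size)
    X2 = rng.normal(size=size) * 0.2
    Y = 1.0 + 1.0 * X1 + 2.5 * X2 + rng.normal(size=size)

    with pm.Model():
        alpha = pm.Normal("alpha", mu=0, sigma=10)
        beta = pm.Normal("beta", mu=0, sigma=10, shape=2)
        sigma = pm.HalfNormal("sigma", sigma=1)
        pm.Normal("Y_obs", mu=alpha + beta[0] * X1 + beta[1] * X2, sigma=sigma, observed=Y)
        idata = pm.sample(progressbar=False)

    # check whether the true alpha falls inside this fit's 94% HDI
    lo, hi = az.hdi(idata, var_names=["alpha"], hdi_prob=0.94)["alpha"].values
    covered += int(lo <= 1.0 <= hi)

print(f"true alpha inside the 94% HDI in {covered} of {n_reps} fits")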
For the record, here are the OLS estimates for the same model/dataset:
coef std err t P>|t| [0.025 0.975]
------------------------------------------------------------------------------
const 1.1407 0.100 11.439 0.000 0.943 1.339
x1 0.8841 0.097 9.087 0.000 0.691 1.077
x2 3.3383 0.581 5.746 0.000 2.185 4.491
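(That table is standard statsmodels OLS output; something along these lines on the same simulated data would reproduce it, assuming the X1, X2, Y names from the sketch above:)

import numpy as np
import statsmodels.api as sm

# design matrix with an intercept column plus the two simulated predictors
X = sm.add_constant(np.column_stack([X1, X2]))
print(sm.OLS(Y, X).fit().summary())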
oh, I think I’m seeing where the misunderstanding is.
I thought mcse_mean was the Bayesian equivalent of a frequentist standard error on the estimate. Rather, it seems to measure the Monte Carlo error from the finite number of MCMC draws, and you'll notice the OLS "std err" vals actually line up with the stddevs of the posterior (the sd column).
if the above is correct, then it’s all resolved. thanks so much for the detailed responses! was helpful
Many diagnostics are described in this paper. MCSE is defined in Section 3.2. The "sd" column is the standard deviation of the posterior samples.
yeah that’s exactly right. mcse_mean will often be nearly 0 on simple-to-sample posteriors.
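For a back-of-envelope check: the MCSE of the mean is roughly the posterior sd divided by the square root of the effective sample size, so for the alpha row above:

import numpy as np

sd, ess = 0.102, 6225.0   # sd and ess_bulk for alpha in the NUTS summary above
print(sd / np.sqrt(ess))  # ~0.0013, i.e. the reported mcse_mean of 0.001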