PyMC tutorial page example: beta_1 is very off?

In the tutorial: Introductory Overview of PyMC — PyMC dev documentation

we define true parameters:
[screenshot of the tutorial code defining the true parameter values]
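For reference, since the screenshot didn’t come through, the relevant snippet looks roughly like this (reconstructed from the tutorial from memory; the exact seed and dataset size may differ):

```python
import numpy as np

# True parameter values used to simulate the data
alpha_true, sigma_true = 1, 1
beta_true = [1, 2.5]

# Simulated predictors and outcome (dataset size and X2 scaling as I recall them)
size = 100
rng = np.random.default_rng(8927)
X1 = rng.standard_normal(size)
X2 = rng.standard_normal(size) * 0.2
Y = alpha_true + beta_true[0] * X1 + beta_true[1] * X2 + rng.normal(size=size) * sigma_true
```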

then we run MCMC:
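The model in the notebook is roughly the following (again paraphrasing from memory; it’s the standard linear regression from the tutorial):

```python
import pymc as pm

with pm.Model() as basic_model:
    # Weakly informative priors
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta = pm.Normal("beta", mu=0, sigma=10, shape=2)
    sigma = pm.HalfNormal("sigma", sigma=1)

    # Expected value of the outcome
    mu = alpha + beta[0] * X1 + beta[1] * X2

    # Likelihood of the observed data
    Y_obs = pm.Normal("Y_obs", mu=mu, sigma=sigma, observed=Y)

    # Draw posterior samples
    idata = pm.sample()
```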

The MCMC says beta_1 mean is 1.82 with SE .01 when the true value is 2.5. Isn’t this kinda bad? Or am I misunderstanding something here?

That results table is a bit misleading. The notebook fits the model twice in quick succession, with two different sampling algorithms: first with NUTS (the recommended default, which uses gradient information), whose results are never shown, and then again with Slice, a more naive sampler. The table you’re quoting comes from the Slice run only.

If you call az.summary on the NUTS result, you get this table, which I hope you agree is much nicer:

          mean     sd  hdi_3%  hdi_97%  mcse_mean  mcse_sd  ess_bulk  ess_tail  r_hat
alpha    1.154  0.102   0.962    1.346      0.001    0.001    6225.0    3597.0    1.0
beta[0]  1.066  0.112   0.863    1.278      0.001    0.001    5674.0    3407.0    1.0
beta[1]  2.527  0.512   1.556    3.494      0.007    0.005    6189.0    2929.0    1.0
sigma    1.009  0.074   0.888    1.161      0.001    0.001    5901.0    2897.0    1.0
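If you want to reproduce both tables, a minimal sketch (assuming the `basic_model` and data defined above) would be:

```python
import arviz as az
import pymc as pm

with basic_model:
    idata_nuts = pm.sample()                  # NUTS, the gradient-based default
    idata_slice = pm.sample(step=pm.Slice())  # the Slice run the notebook ends up reporting

print(az.summary(idata_nuts))   # the table above
print(az.summary(idata_slice))  # the table quoted in the question (beta[1] mean ~1.82 there)
```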

gotcha.

still though, aren’t the SEs way too low? alpha, e.g., is supposed to be 1, but here it says 1.154 with an SE of .001. I’m not familiar with how the SE is computed here, but surely we shouldn’t be 154 SEs off on a simple simulated dataset?

Standard deviations are computed using the usual sample estimator from posterior samples.
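A minimal check (assuming the `idata` from the NUTS fit above; any tiny discrepancy would come from the exact ddof convention ArviZ uses internally):

```python
import numpy as np

# The "sd" column is just the sample standard deviation of the pooled posterior draws
draws = idata.posterior["alpha"].values.ravel()  # pool all chains and draws
print(np.std(draws, ddof=1))                     # ~0.10, matching the sd column for alpha
```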

It might be instructive to sample from the data-generating process and fit the model many times. Plenty of combinations of these four parameters could have led to the same dataset, so it shouldn’t be surprising that we still have some uncertainty about the true parameters. You should find that the true value is inside the 94% HDI about 94% of the time, but the mean of the estimates will shift around due to sampling error.

You can also experiment with increasing/decreasing the number of data points you fit on to see how that affects the estimates. This example shows how to do this, and would be a good place to start for building intuition.
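Here’s an illustrative coverage check along those lines (not from the tutorial; the replication count, priors, and variable names are just placeholders, and I use an equal-tailed 94% interval as a simple stand-in for the HDI):

```python
import numpy as np
import pymc as pm

rng = np.random.default_rng(0)
alpha_true, sigma_true = 1.0, 1.0
beta_true = np.array([1.0, 2.5])
size, n_reps = 100, 20  # more replications give a better coverage estimate, but refitting is slow

covered = 0
for _ in range(n_reps):
    # Re-simulate the data-generating process
    X1 = rng.standard_normal(size)
    X2 = rng.standard_normal(size) * 0.2
    Y = alpha_true + beta_true[0] * X1 + beta_true[1] * X2 + rng.normal(size=size) * sigma_true

    # Refit the same regression model
    with pm.Model():
        alpha = pm.Normal("alpha", mu=0, sigma=10)
        beta = pm.Normal("beta", mu=0, sigma=10, shape=2)
        sigma = pm.HalfNormal("sigma", sigma=1)
        pm.Normal("Y_obs", mu=alpha + beta[0] * X1 + beta[1] * X2, sigma=sigma, observed=Y)
        idata = pm.sample(progressbar=False)

    # Check whether the true beta[1] falls inside the central 94% interval
    draws = idata.posterior["beta"].values[..., 1].ravel()
    lo, hi = np.quantile(draws, [0.03, 0.97])
    covered += int(lo <= beta_true[1] <= hi)

print(f"true beta[1] covered in {covered}/{n_reps} replications")
```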

For the record, here are the OLS estimates for the same model/dataset:

                 coef    std err          t      P>|t|      [0.025      0.975]
------------------------------------------------------------------------------
const          1.1407      0.100     11.439      0.000       0.943       1.339
x1             0.8841      0.097      9.087      0.000       0.691       1.077
x2             3.3383      0.581      5.746      0.000       2.185       4.491
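For anyone who wants to reproduce that, a sketch with statsmodels (using the X1, X2, Y simulated above):

```python
import numpy as np
import statsmodels.api as sm

# Ordinary least squares on the same simulated data
X = sm.add_constant(np.column_stack([X1, X2]))  # columns: const, x1, x2
ols_fit = sm.OLS(Y, X).fit()
print(ols_fit.summary())
```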

oh, I think I’m seeing where the misunderstanding is.

I thought mcse_mean was the standard error of the mean estimate. Rather,

  • mcse_mean is the Monte Carlo standard error of the estimate of the posterior mean
  • the standard deviation of the posterior is the “sd” column, which is what one would normally call the “standard error” of the estimate

you’ll notice that the OLS “std err” values line up with the posterior “sd” column, not with mcse_mean.


if the above is correct, then it’s all resolved. thanks so much for the detailed responses! was helpful 🙂


Many of these diagnostics are described in this paper. MCSE is defined in section 3.2. “sd” is the standard deviation of the posterior samples.
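As a quick back-of-the-envelope check of how the two relate, using the beta[1] row of the NUTS table above (ArviZ actually uses an ESS computed specifically for the mean, but for a well-behaved posterior like this it is close to ess_bulk):

```python
import math

# MCSE(mean) is roughly sd / sqrt(ESS)
sd, ess_bulk = 0.512, 6189.0
print(sd / math.sqrt(ess_bulk))  # ~0.0065, in line with the reported mcse_mean of 0.007
```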


yeah that’s exactly right. mcse_mean will often be nearly 0 on simple-to-sample posteriors.