Sampling without the observed keyword

Hi all,

I am trying to figure out what the NUTS sampler does when it is not given any observed data. Intuitively I'd expect it to work just like a regular numpy random number generator, but this does not seem to be the case. Here is a quick demo:

import numpy as np
import pymc as pm
import matplotlib.pyplot as plt

# 400-dimensional "b": most components have a tiny scale, the first 10 a large one
var = np.full(400, 1e-6)
var[0:10] = 300

with pm.Model() as joint_model:
    a = pm.Normal("a", mu=0, sigma=100)
    b = pm.Normal("b", mu=0, sigma=var)
    tr = pm.sample(draws=50, n_init=20, tune=20)

plt.hist(np.array(tr.posterior.stack(sample=["chain", "draw"])["a"]), bins=50)
plt.show()

Here, the result for "a" looks nothing like the normal distribution I'd expect; instead, it is peaked at some location far from the mean. However, things look better if we change the sampling parameters to, say,

tr = pm.sample(draws=50, n_init=500, tune=500)

I am wondering why this is happening. Can we use/interpret pm.Normal as an RNG? Does the sampler still impose some likelihood function when it is not given observed data? Thanks a lot in advance!

When you call pm.sample, it uses MCMC to sample the parameter values, whether or not any variable is observed. The specific MCMC algorithm you get by default (NUTS) requires a tuning phase to adapt its step size and mass matrix, so cutting this stage as short as you did hurts convergence.
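
For instance, with the default settings the prior for "a" should be recovered. A minimal sketch to check this, assuming ArviZ is available (it ships with PyMC):

import arviz as az

with joint_model:
    # default tune=1000 gives NUTS enough steps to adapt its step size and mass matrix
    tr = pm.sample(draws=1000)

# r_hat values close to 1.0 indicate the chains converged
print(az.summary(tr, var_names=["a"]))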

If you use pm.sample_prior_predictive instead, it works like a simple numpy random function and gives you the expected results much faster, with no need for tuning.
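
A minimal sketch with the model above (assuming PyMC v4+, where the result is an InferenceData object with a prior group):

with joint_model:
    # direct forward draws from the prior: no MCMC, no tuning
    prior = pm.sample_prior_predictive(500)

# "a" now looks like the Normal(0, 100) you specified
plt.hist(prior.prior["a"].values.ravel(), bins=50)
plt.show()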

Thank you so much! This is exactly what I need!