I’ve been following the lecture course on Statistical Rethinking (which I’m enjoying immensely). Below I’m playing with data associated with the Howell1 dataset.
It’s been a while since I used pymc, and I’ve got a reasonably basic question on what is good practice if I want to “play around with priors”.
In the example below, I’ve passed some random height data to the model, and I then sample the prior predictive. When I plot it with arviz, it seems to think I want one distribution per datapoint. If I uncomment the observed line further down, I get a single distribution, which is what I want.
How should I structure my code to play around with this in a good way?
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(seed=42)
N = 10
H = rng.uniform(low=130, high=170, size=N)
H_norm = H - H.mean()

with pm.Model() as m:
    H_data = pm.Data("height", H_norm, mutable=True)
    a = pm.Normal(name="a", mu=60, sigma=10)
    b = pm.LogNormal(name="b", mu=0, sigma=1)
    mean = a + b * H_data
    sigma = pm.Uniform("sigma", lower=0, upper=10)
    W = pm.Normal(
        name="weight",
        mu=mean,
        sigma=sigma,
        # observed= rng.uniform(low=3000, high=6000, size=N) # uncomment this line
    )
    idata = pm.sample_prior_predictive(random_seed=42)

az.plot_posterior(idata, group="prior");
idata with observed commented out:
idata with observed uncommented:
arviz’ plot, showing one distribution per “weight” (this is undesirable):
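To make concrete what "a single distribution" would mean here, below is a minimal NumPy-only sketch of the same prior predictive simulation, done by hand rather than through pymc. The priors and the centered heights are copied from the model above; `n_draws`, `weight_pooled`, and the choice to pool by flattening over the datapoint axis are assumptions made for this illustration, not something pymc or arviz does for you.

```python
import numpy as np

rng = np.random.default_rng(seed=42)

N = 10         # number of datapoints, as in the model above
n_draws = 500  # number of prior draws (arbitrary choice for this sketch)

# Centered heights, built the same way as in the model
H = rng.uniform(low=130, high=170, size=N)
H_norm = H - H.mean()

# One value per prior draw for each parameter
a = rng.normal(loc=60, scale=10, size=n_draws)
b = rng.lognormal(mean=0, sigma=1, size=n_draws)
sigma = rng.uniform(low=0, high=10, size=n_draws)

# Broadcast to shape (n_draws, N): one simulated weight per draw and datapoint
mean = a[:, None] + b[:, None] * H_norm[None, :]
weight = rng.normal(loc=mean, scale=sigma[:, None])

# Pool over the datapoint axis to get one distribution of simulated weights
weight_pooled = weight.ravel()
print(weight.shape, weight_pooled.shape)
```

The `weight` array keeps one column per datapoint, which corresponds to the per-datapoint panels in the plot, while `weight_pooled` is the flattened version that could be passed to a single histogram or density plot.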