I’ve been following the lecture course on Statistical Rethinking (which I’m enjoying immensely). Below I’m playing with data associated with the Howell1 dataset.
It’s been a while since I used PyMC, and I’ve got a fairly basic question about good practice when I want to “play around with priors”.
In the example below, I pass some random height data to the model and then sample the prior predictive. When I plot it with ArviZ, it seems to think I want one distribution per datapoint. If I uncomment the `observed` line further down, I get a single distribution, which is what I want.
How should I structure my code to play around with this in a good way?
```python
import numpy as np
import pymc as pm
import arviz as az

rng = np.random.default_rng(seed=42)

N = 10
H = rng.uniform(low=130, high=170, size=N)
H_norm = H - H.mean()

with pm.Model() as m:
    H_data = pm.Data("height", H_norm, mutable=True)
    a = pm.Normal(name="a", mu=60, sigma=10)
    b = pm.LogNormal(name="b", mu=0, sigma=1)
    mean = a + b * H_data
    sigma = pm.Uniform("sigma", lower=0, upper=10)
    W = pm.Normal(
        name="weight",
        mu=mean,
        sigma=sigma,
        # observed=rng.uniform(low=3000, high=6000, size=N)  # uncomment this line
    )
    idata = pm.sample_prior_predictive(random_seed=42)

az.plot_posterior(idata, group="prior");
```
With `observed` commented out, the ArviZ plot shows one distribution per entry of “weight” (this is undesirable):

[ArviZ prior plot omitted]
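To check my own understanding of what the pooled prior predictive *should* look like, I also simulated it by hand with plain numpy (same priors as the model above, names like `n_draws` and `pooled` are just mine). Each prior draw of `(a, b, sigma)` generates one simulated weight per datapoint, and flattening across datapoints gives the single pooled distribution I’m after:

```python
import numpy as np

rng = np.random.default_rng(seed=42)

N = 10
H = rng.uniform(low=130, high=170, size=N)
H_norm = H - H.mean()

# Draw prior samples by hand, matching the model's priors
n_draws = 500
a = rng.normal(loc=60, scale=10, size=n_draws)        # a ~ Normal(60, 10)
b = rng.lognormal(mean=0, sigma=1, size=n_draws)      # b ~ LogNormal(0, 1)
sigma = rng.uniform(low=0, high=10, size=n_draws)     # sigma ~ Uniform(0, 10)

# One simulated weight per (draw, datapoint): shape (n_draws, N)
mean = a[:, None] + b[:, None] * H_norm[None, :]
W = rng.normal(loc=mean, scale=sigma[:, None])

# Pool across datapoints to get a single prior predictive distribution
pooled = W.ravel()
print(pooled.shape)  # (5000,)
```

So I can always collapse the per-datapoint dimension myself, but I assume there is a more idiomatic way to structure the PyMC model or the ArviZ call so I don’t have to.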