Sample_prior_predictive bug, gives wrong shape for dependent variable?

kenjioman · July 29, 2019, 8:32pm

Minimal working example:

import pandas as pd
import pymc3 as pm

foxes = pd.read_csv('https://github.com/rmcelreath/rethinking/raw/master/data/foxes.csv', sep=';')

with pm.Model() as mdl:
    a = pm.Normal('a', mu=0, sd=0.25)
    b = pm.Normal('b', mu=0, sd=0.4)
    mu = pm.Deterministic('mu', a + b * foxes.area.values.reshape(-1, 1))
    sigma = pm.Exponential('sigma', lam=1)
    weight = pm.Normal('weight', mu=mu, sd=sigma, observed=foxes.weight)
    
    prior = pm.sample_prior_predictive()

{k: prior[k].shape for k in prior.keys()}

As output I get:

{'a': (500,),
 'weight': (500, 116, 116),
 'mu': (500, 116, 1),
 'sigma': (500,),
 'b': (500,),
 'sigma_log__': (500,)}

For “weight”, I would have expected a shape of (500, 116), but instead I get (500, 116, 116). Am I misunderstanding what sample_prior_predictive is supposed to do?

aseyboldt · July 29, 2019, 8:38pm

Not a bug
The problem is that you reshaped foxes.area. Your observed data has shape (N,), while mu has shape (N, 1). NumPy's broadcasting rules combine those two shapes into (N, N), so you are effectively saying that you have N copies of each observation. To fix it, just remove the reshape:

mu = pm.Deterministic('mu', a + b * foxes.area.values)
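
To see the broadcasting on its own, here is a minimal NumPy-only sketch. It does not touch PyMC3 at all; the random arrays are stand-ins for foxes.area.values and foxes.weight, and the values of a and b are arbitrary placeholders rather than prior draws. N = 116 is taken from the shapes reported above.

```python
import numpy as np

N = 116  # number of rows, taken from the shapes reported in the question
rng = np.random.default_rng(0)

# Stand-ins for the real columns (assumed data, not the foxes dataset).
area = rng.normal(size=N)     # plays the role of foxes.area.values, shape (N,)
weight = rng.normal(size=N)   # plays the role of foxes.weight, shape (N,)
a, b = 0.1, 0.3               # arbitrary placeholder values for the parameters

mu_column = a + b * area.reshape(-1, 1)  # shape (N, 1), as in the original model
mu_flat = a + b * area                   # shape (N,), as in the fix

# np.broadcast reports the shape NumPy would use when combining two arrays.
bad_shape = np.broadcast(mu_column, weight).shape   # (N, 1) with (N,) gives (N, N)
good_shape = np.broadcast(mu_flat, weight).shape    # (N,) with (N,) gives (N,)

print(bad_shape, good_shape)
```

With the 500 prior draws added as a leading dimension, the first case corresponds to the (500, 116, 116) you saw, and the second to the (500, 116) you expected.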

kenjioman · July 29, 2019, 8:40pm

Yeah, that did it – I didn’t realize that would make such a difference! Thanks for your help!!
