While working on exercise 8 of chapter 3 of “Bayesian Analysis with Python” by Osvaldo Martin, I ended up writing the following:
import seaborn as sns
import pymc3 as pm
import arviz as az
import pandas as pd
import numpy as np

tips = sns.load_dataset("tips")
tip = tips['tip'].to_numpy()
# encode the day of the week as integer group indices
idx = pd.Categorical(tips['day'],
                     categories=['Thur', 'Fri', 'Sat', 'Sun']).codes
groups = len(np.unique(idx))

with pm.Model() as comparing_groups_hyper:
    # hyperprior on the group means
    μ_μ = pm.Normal('μ_μ', 0, sd=10)
    # one mean and one standard deviation per day
    μ = pm.Normal('μ', mu=μ_μ, sd=10, shape=groups)
    σ = pm.HalfNormal('σ', sd=3, shape=groups)
    y = pm.Normal('y', mu=μ[idx], sd=σ[idx], observed=tip)
    trace_cg_hyper = pm.sample(1000, return_inferencedata=True)

az.summary(trace_cg_hyper)
This samples without any divergences.
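(For what it's worth, I'm checking for divergences by counting the diverging draws in the sampler stats, along these lines, using the InferenceData returned above:)

# count divergent transitions across all chains; close to zero for this model
n_div = int(trace_cg_hyper.sample_stats["diverging"].sum())
print(f"divergences: {n_div}")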
If I then add an extra variable, σ_μ, but don’t use it anywhere in the model, I suddenly get hundreds of divergences:
import seaborn as sns
import pymc3 as pm
import arviz as az
import pandas as pd
import numpy as np

tips = sns.load_dataset("tips")
tip = tips['tip'].to_numpy()
# encode the day of the week as integer group indices
idx = pd.Categorical(tips['day'],
                     categories=['Thur', 'Fri', 'Sat', 'Sun']).codes
groups = len(np.unique(idx))

with pm.Model() as comparing_groups_hyper:
    μ_μ = pm.Normal('μ_μ', 0, sd=10)
    # extra variable: defined but not used by any other variable below
    σ_μ = pm.HalfNormal('σ_μ', 10)
    μ = pm.Normal('μ', mu=μ_μ, sd=10, shape=groups)
    σ = pm.HalfNormal('σ', sd=3, shape=groups)
    y = pm.Normal('y', mu=μ[idx], sd=σ[idx], observed=tip)
    trace_cg_hyper = pm.sample(1000, return_inferencedata=True)

az.summary(trace_cg_hyper)
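To try to see where the divergences end up, I've also been looking at a pair plot with the divergent draws highlighted, something like this (using the trace from the second model above):

# pair plot of the hyperprior mean against the group means,
# with divergent transitions marked
az.plot_pair(trace_cg_hyper, var_names=['μ_μ', 'μ'], divergences=True)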
I find this quite surprising and confusing. If anyone could help me understand why this happens, it would be much appreciated!