Sorry I took so long to get back to you. I’m brand new to this community and I triggered the spam bot with my first post.
I tried to work with your suggestion, but I seem to have made the problem worse. I think the issue is that I don’t understand how the dims from coords and the index variables county_idx, state_idx, and obs_id work together. In the following code, I’ve managed to remove “state” from dims in the definition of a_county_offset, but now when I run:
pm.model_to_graphviz(hierarchical_model_varying_intercept_county_state)
I can see that my model has built a coefficient for every observation. To make the discussion easier, I’ve included some sample data, which will hopefully make my code easier to follow.
import arviz as az
import numpy as np
import pandas as pd
import pymc3 as pm
sample_data = pd.DataFrame({"state":["CA", "NV", "CA", "NV", "CA", "CA", "NV", "CA", "NV", "CA", "CA", "NV", "CA", "NV", "CA"],
"county":["Los Angeles", "Clark", "San Diego", "Washoe", "Orange", "Los Angeles", "Clark", "San Diego", "Washoe", "Orange", "Los Angeles", "Clark", "San Diego", "Washoe", "Orange"],
"disease_count": [7., 8., 9., 10., 11., 6., 7., 8., 9., 10., 5., 6., 7., 8., 9.]})
county, counties = pd.factorize(sample_data["county"], sort = True)
state, states = pd.factorize(sample_data["state"], sort = True)
coords = {
"state" : states,
"county" : counties,
"obs_id" : np.arange(sample_data.shape[0])
}```
with pm.Model(coords=coords) as hierarchical_model_varying_intercept_county_state:
state_idx = pm.Data("state_idx", state, dims = "obs_id")
county_idx = pm.Data("county_idx", county, dims="obs_id")
# Hyper Priors: State
sigma_a_state = pm.HalfNormal("sigma_a_state", 3.)
mu_a_state = pm.Normal("mu_a_state",mu=9., sigma = 2.5)
# State coefficients
a_state_offset = pm.Normal("a_state_offset", mu = 0., sigma = 1., dims="state")
a_state = pm.Deterministic("a_state", mu_a_state + sigma_a_state*a_state_offset)
# Hyper Priors: County
sigma_a_county = pm.HalfNormal("sigma_a_county", 3.)
mu_a_county = pm.Normal("mu_a_county",mu = 0., sigma = 2.5)
# County coefficients
a_county_offset = pm.Normal("a_county_offset", mu = 0., sigma = 1., dims= "county")
a_county = pm.Deterministic("a_county", a_state[state_idx] + mu_a_county + sigma_a_county*a_county_offset[county_idx])
# Model error
epsilon = pm.HalfCauchy("eps", 5.0)
disease_estimate = (
a_county[county_idx]
)
# Data likelihood
disease_like = pm.TruncatedNormal("disease_likelihood",
mu=disease_estimate,
sigma=epsilon,
observed = sample_data.disease_count.values,
dims="obs_id", lower=0.)