Why are these two hierarchical models different?

I have two pieces of code that I think implement the same hierarchical model, yet the solutions they produce are quite different. Does anyone know why?
Thanks in advance!

Version 1
with pm.Model() as model:

    # Intercept
    a_pop = pm.Normal('a_pop', 100.0, 30.0)
    a_pop_sigma = pm.HalfNormal('a_pop_sigma', 30.0)
    a_local = pm.Normal('a_local', mu=a_pop, sigma=a_pop_sigma, shape=n_countries)

    b_pop = pm.Normal('b_pop', 0.0, 10.0)
    b_pop_sigma = pm.HalfNormal('b_pop_sigma', 10.0)
    b_local = pm.Normal('b_local', mu=b_pop, sigma=b_pop_sigma, shape=n_countries)

    sigma = pm.HalfCauchy('sigma', 50.0, shape=n_countries)

    # Likelihood for each country
    for i, country in enumerate(countries):
        x = pm.Data(country + "x_data", xdata[country])
        y = pm.Data(country + "y_data", ydata[country])

        # Likelihood
        country = pm.NegativeBinomial(country, (a_local[i] * np.exp(b_local[i] * x)), sigma[i], observed=y)

Version 2
with pm.Model() as model:

    # Intercept
    a_pop = pm.Normal('a_pop', 100.0, 30.0)
    a_pop_sigma = pm.HalfNormal('a_pop_sigma', 30.0)
    a_local = pm.Normal('a_local', mu=a_pop, sigma=a_pop_sigma, shape=n_countries)

    b_pop = pm.Normal('b_pop', 0.0, 10.0)
    b_pop_sigma = pm.HalfNormal('b_pop_sigma', 10.0)
    b_local = pm.Normal('b_local', mu=b_pop, sigma=b_pop_sigma, shape=n_countries)

    sigma = pm.HalfCauchy('sigma', 50.0, shape=n_countries)

    # Uses an exponential regression
    exp = a_local[countries_idx] + np.exp(b_local[countries_idx] * df.x.values)

    like = pm.NegativeBinomial('like', exp, sigma[countries_idx], observed=df.y.values)

What is “countries_idx” in the second model?

It’s just a vector of indexes. The model has 5 countries organized in a dataframe with the columns ‘country’, ‘x’, ‘y’, ‘idx’, where ‘idx’ is just an integer that maps to the country name. For instance, every time ‘country’ == ‘France’, ‘idx’ == 0. So countries_idx is basically a vector like [0, 0, 0, 0, …, 1, 1, 1, …, 4, 4, 4, …], which has the same length as the total number of rows in the dataframe.
The second version is actually based on this example: https://pymc3.readthedocs.io/en/latest/notebooks/GLM-hierarchical.html
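In case it helps, here is a minimal sketch of how such an index vector can be built with pandas (the DataFrame contents below are purely illustrative; only a ‘country’ column per observation is assumed):

import pandas as pd

# Illustrative long-format data: one row per observation
df = pd.DataFrame({
    'country': ['France', 'France', 'France', 'Spain', 'Spain', 'Italy'],
    'x': [1, 2, 3, 1, 2, 1],
    'y': [12, 25, 48, 8, 20, 10],
})

# Map each country name to an integer code 0..n_countries-1
countries_idx, countries = pd.factorize(df['country'])
n_countries = len(countries)

print(countries_idx)  # [0 0 0 1 1 2]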

Hi,

  • I’m not sure the first model is a hierarchical one, as you have one likelihood per country. As a result, the data are not pooled, information about each country cannot be shared at the higher level, and parameters cannot be shrunk towards the group mean, which is what hierarchical models are all about. In other words, I think you’re modeling each country independently – i.e. a no-pooling model.

  • Also, in the first model, your regression is a_local * np.exp(b_local * x), while in the second one it’s a_local + np.exp(b_local * df.x.values) – multiplication vs. addition. On a side note, I’m surprised np.exp works here; I thought pm.math.exp would have been in order (see the sketch after this list) :thinking:

  • Finally, I think the stds on your priors are way too large, especially for a hierarchical model. Here are good guidelines for prior choice.
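To illustrate the second point, here is a rough sketch of what the vectorized regression could look like if the multiplicative form from Version 1 is the intended one, using pm.math.exp and reusing the priors and countries_idx from your Version 2 (a sketch under those assumptions, not a definitive fix):

with pm.Model() as model:

    a_pop = pm.Normal('a_pop', 100.0, 30.0)
    a_pop_sigma = pm.HalfNormal('a_pop_sigma', 30.0)
    a_local = pm.Normal('a_local', mu=a_pop, sigma=a_pop_sigma, shape=n_countries)

    b_pop = pm.Normal('b_pop', 0.0, 10.0)
    b_pop_sigma = pm.HalfNormal('b_pop_sigma', 10.0)
    b_local = pm.Normal('b_local', mu=b_pop, sigma=b_pop_sigma, shape=n_countries)

    sigma = pm.HalfCauchy('sigma', 50.0, shape=n_countries)

    # Multiplicative exponential regression (note the *), kept on the model graph via pm.math.exp
    mu = a_local[countries_idx] * pm.math.exp(b_local[countries_idx] * df.x.values)

    like = pm.NegativeBinomial('like', mu, sigma[countries_idx], observed=df.y.values)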

Hope this helps :vulcan_salute:


Hi, thanks for the extensive reply.
First, I would tend to agree with point one, but I am defining the local priors from the population-level priors, and that is what should matter in the hierarchical model. I also noticed my typo (although I spent hours looking for typos before posting this here!!! I only spotted it a few hours ago). When it is corrected, the two models give identical results. I have therefore asked the moderators to delete the post, since the question no longer makes sense with the correction.
Thanks for the advice on the priors! Will check that out!