# Sampling a specific country from the hierarchical model example

In the GLM: Hierarchical Linear Regression article a hierarchical regression model is used to predict radon levels in different counties.

By using pm.sample the parameter’s distributions for each country can be fitted. Now that I know the parameters, how can I sample the model for a specific county only?

Full details

Full details. Code from article:

import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
import pandas as pd
import theano

county_names = data.county.unique()
county_idx = data.county_code.values

n_counties = len(data.county.unique())

with pm.Model() as hierarchical_model:
# Hyperpriors for group nodes
mu_a = pm.Normal('mu_a', mu=0., sd=100**2)
sigma_a = pm.HalfCauchy('sigma_a', 5)
mu_b = pm.Normal('mu_b', mu=0., sd=100**2)
sigma_b = pm.HalfCauchy('sigma_b', 5)

# Intercept for each county, distributed around group mean mu_a
# Above we just set mu and sd to a fixed value while here we
# plug in a common group distribution for all a and b (which are
# vectors of length n_counties).
a = pm.Normal('a', mu=mu_a, sd=sigma_a, shape=n_counties)
# Intercept for each county, distributed around group mean mu_a
b = pm.Normal('b', mu=mu_b, sd=sigma_b, shape=n_counties)

# Model error
eps = pm.HalfCauchy('eps', 5)

radon_est = a[county_idx] + b[county_idx] * data.floor.values

# Data likelihood

Then to fit the model and determine parameter distributions:

with hierarchical_model:
hierarchical_trace = pm.sample(draws=2000, n_init=1000)

How can I now sample the model for a specific county only, for example AITKIN?

I’m assuming you mean ‘get the posterior distribution for a specific county only, after having performed inference using all counties.’ (The alternative would be to perform inference using just the observed data from a specific county, in which case the hierarchical model wouldn’t make sense).

a and b are 1-D arrays n_counties long. So you just want to get the samples for all counties, and choose only those for the county you are interested in:

a_samples = hierarchical_trace.get_values('a', thin=5)
aitkin_idx = 0  # this is a guess; the point is, each county has a unique integer index
aitkin_a_samples = a_samples[:, aitkin_idx]

That would give me estimates for the parameters a and b but not radon_like (log_radon) for a specific county