Hierarchical Bayesian model with geographic and temporal dimensions

I'm trying to build a Bayesian model that has multiple features, with observations by state and week. I was able to build it either for multiple features and weeks, or for multiple DMAs and weeks, but not for all dimensions together.

This is the code for only one feature:

data_wide has the columns state, week, total_sales, and one column per feature (plus a few others), with rows = #states * #weeks.

import numpy as np
import pymc as pm

states = data_wide['state'].values
state_mapping = {state: idx for idx, state in enumerate(np.unique(states))}
indexed_states = np.array([state_mapping[state] for state in states])

# Number of unique states (groups)
n_groups = len(np.unique(indexed_states))
y = data_wide['total_sales'].values
x = data_wide['feature1'].values

with pm.Model() as hierarchical_model:
    # Hyperpriors
    mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
    sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=10)
    mu_beta = pm.Normal('mu_beta', mu=0, sigma=10)
    sigma_beta = pm.HalfNormal('sigma_beta', sigma=10)

    # Priors
    alpha = pm.Normal('alpha', mu=mu_alpha, sigma=sigma_alpha, shape=n_groups)  # group-specific intercepts
    beta = pm.Normal('beta', mu=mu_beta, sigma=sigma_beta, shape=n_groups)  # group-specific slopes
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value
    mu = alpha[indexed_states] + beta[indexed_states] * x

    # Likelihood
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)

    # Sampling
    trace = pm.sample(2000, tune=1000)

When I add multiple features I believe I'm running into dimension issues, but I don't see how to get the dimensions right. Is there a way to do this? The idea is to get coefficients by state/feature. Would a loop work, and would it change the dynamics of the model if I did that? This is what I tried:

states = data_wide['state'].values
state_mapping = {state: idx for idx, state in enumerate(np.unique(states))}
indexed_states = np.array([state_mapping[state] for state in states])

y = data_wide['incremental_sales'].values
x = data_wide.iloc[:, 2:-5].values  # assumes the feature columns are everything between the first two and the last five columns
n_features = x.shape[1]  # equivalent to len(data_wide.columns) - 7
n_groups = len(np.unique(indexed_states))

with pm.Model() as hierarchical_model:
    
    # Hyperpriors
    mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
    sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=10)
    mu_beta = pm.Normal('mu_beta', mu=0, sigma=10, shape=(n_groups, n_features))
    sigma_beta = pm.HalfNormal('sigma_beta', sigma=10, shape=(n_groups, n_features))

    # Priors
    alpha = pm.Normal('alpha', mu=mu_alpha, sigma=sigma_alpha, shape=n_groups)
    beta = pm.Normal('beta', mu=mu_beta, sigma=sigma_beta, shape=(n_groups, n_features))
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value
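    # NOTE: beta is (n_groups, n_features) and x.T is (n_features, n_obs), so
    # pm.math.dot(beta, x.T) is (n_groups, n_obs) -- it cannot broadcast
    # against y, which has shape (n_obs,)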
    mu = alpha[indexed_states] + pm.math.dot(beta, x.T)

    # Likelihood
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)

    # Sampling
    trace = pm.sample(2000, tune=1000)

I would try subsetting the beta “matrix” by group (row), i.e. (beta[indexed_states] * x).sum(axis=1): beta[indexed_states] has shape (n_obs, n_features), so the elementwise product with x followed by a sum over the feature axis yields the (n_obs,) vector of linear predictors that mu needs. (A plain x.dot(beta[indexed_states]) won't line up, since both operands are (n_obs, n_features).)
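Putting that together, here is a minimal sketch of the multi-feature model with that fix applied (assuming the same data_wide layout, column names, and column slice as in the question; I've also given the beta hyperpriors shape n_features rather than (n_groups, n_features), so they are shared across states and actually pool the state-level slopes):

import numpy as np
import pymc as pm

states = data_wide['state'].values
state_mapping = {state: idx for idx, state in enumerate(np.unique(states))}
indexed_states = np.array([state_mapping[state] for state in states])

y = data_wide['incremental_sales'].values
x = data_wide.iloc[:, 2:-5].values  # (n_obs, n_features)
n_features = x.shape[1]
n_groups = len(np.unique(indexed_states))

with pm.Model() as hierarchical_model:
    # Hyperpriors: one mean/scale per feature, shared across states
    mu_alpha = pm.Normal('mu_alpha', mu=0, sigma=10)
    sigma_alpha = pm.HalfNormal('sigma_alpha', sigma=10)
    mu_beta = pm.Normal('mu_beta', mu=0, sigma=10, shape=n_features)
    sigma_beta = pm.HalfNormal('sigma_beta', sigma=10, shape=n_features)

    # Priors: one intercept per state, one slope per state/feature pair;
    # the (n_features,) hyperpriors broadcast over the n_groups rows
    alpha = pm.Normal('alpha', mu=mu_alpha, sigma=sigma_alpha, shape=n_groups)
    beta = pm.Normal('beta', mu=mu_beta, sigma=sigma_beta, shape=(n_groups, n_features))
    sigma = pm.HalfNormal('sigma', sigma=1)

    # Expected value: beta[indexed_states] is (n_obs, n_features), so the
    # row-wise product-and-sum gives one linear predictor per observation
    mu = alpha[indexed_states] + (beta[indexed_states] * x).sum(axis=1)

    # Likelihood
    y_obs = pm.Normal('y_obs', mu=mu, sigma=sigma, observed=y)

    # Sampling
    trace = pm.sample(2000, tune=1000)

On the loop question: defining one beta per feature in a Python loop specifies the same likelihood, but the vectorized indexing above compiles and samples faster and keeps all the coefficients in a single (n_groups, n_features) variable.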

I was also struggling with this a while ago; cf. Unexpected difference in posterior estimates from two equivalent linear hierarchical models, FYI.

Unrelated: I found that PyMC can work directly with pandas DataFrame and Series objects, so there's no need to call .values (if you do want to convert, use .to_numpy()).
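For instance, a minimal sketch of the single-feature model with pandas objects passed straight in (assuming the data_wide from the question):

y = data_wide['total_sales']  # pandas Series; no .values needed
x = data_wide['feature1']

with pm.Model():
    beta = pm.Normal('beta', mu=0, sigma=10)
    sigma = pm.HalfNormal('sigma', sigma=1)
    pm.Normal('y_obs', mu=beta * x, sigma=sigma, observed=y)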
