Multinomial hierarchical regression with multiple observations per group ("Bad energy issue")

junpenglao · March 2, 2019, 9:07am

Overall, your model is over-specified.

So my approach to these kind of problem is usually the following:

Understand the structure of the input. I usually do a pairplot of the predictors. In this case, it is clear that you have predictors that are common among departments, and you have N departments (usually treated as random effect)
Start with a simple model that capture the structure of the data.
In your case, since it is a multinomial regression, you need to be careful re normalization in the softmax. I find that the most stable way is to model k-1 categories and stack a columns of 0s at the end:

# make sure there is no missing data here
groupX = X_full.loc[idx['seinemaritime',:], :].iloc[:, :n_regressors-1].values

Ndpt = len(np.unique(dptmts_idx))
date_idx, date_names = X_full.index.get_level_values(1).factorize(sort=True)

with pm.Model() as hier_model:
    dpt_interp = pm.Normal('dpt_interp', 0., 1., shape=(Ndpt, n_parties-1))
    fixed_effect = pm.Normal('fixed', 0., 1., shape=(n_regressors-1, n_parties-1))
    random_effect = pm.Normal('random', 0., 1., shape=n_parties-1)
    
    results_est = dpt_interp[dptmts_idx] +\
                tt.dot(groupX, fixed_effect)[date_idx]+\
                X_full['chomage'].values[:, None]*random_effect
    results_est = tt.concatenate(tensor_list=[results_est, 
                                              tt.zeros((X_full.shape[0], 1))],
                                 axis=1)
    
    probs = pm.Deterministic('probs', tt.nnet.softmax(results_est))
    likelihood = pm.Multinomial('likelihood',
                                n=y_full.sum(axis=1),
                                p=probs,
                                observed=y_full.values)
    # might need to increase tunning
    trace = pm.sample(1000, tune=1000, cores=4, random_seed=RANDOM_SEED)

add hyperpriors. I dont think it is too useful to put hyperprior on the mus of the coefficients (again, given the softmax), but it might help to put prior on the sigmas (even hierarchical ones)

Topic		Replies	Views
Help with Hierarchical regression for discrete data v5	11	732	August 24, 2022
Improve Sampling of Multilevel Regressions Model Questions	13	1218	March 9, 2020
Hierarchical logistic regression giving non-sensible results? Do I formulate it correctly? Questions	15	843	August 4, 2020
Hierarchical linear model with two parent distributions Questions	16	1300	September 15, 2018
Bad initial energy: inf. The model might be misspecified Questions	5	3565	June 29, 2018

Multinomial hierarchical regression with multiple observations per group ("Bad energy issue")

Related topics