# Hierarchical Coefficients Calculation

Hello,
I’m building a multi-level hierarchical model based on the project by Jonathan Sedar of Applied AI Ltd.
The dataset has the following features:

• Features:
  • `metric_combined` - a score for fuel efficiency in combined driving
  • `metric_extra_urban` - a score for fuel efficiency in extra-urban driving
  • `metric_urban_cold` - a score for fuel efficiency in an urban setting with a cold start
  • `emissions_co_mgkm` - CO particulates emitted, in mg/km
  • `trans` - the car's transmission
  • `fuel_type` - the car's power supply
• Hierarchical information:
  • `parent` - the parent company of the car manufacturer (20 values)
  • `mfr` - the car manufacturer (38 values)
• Target:
  • `emissions_nox_mgkm` - NOx particulates emitted, in mg/km

For the hierarchical structure, both the `parent` and `mfr` columns were label-encoded; based on that, the following code was implemented:

```python
import pandas as pd
import pymc3 as pm

# Parent-child map based on the encoded columns ['mfr_enc', 'parent_enc']:
# one row per (mfr, parent) pair, sorted by mfr_enc, so position i holds
# the parent index of manufacturer i
mfr_parent_map = (dfs.groupby(['mfr_enc', 'parent_enc']).size()
                     .reset_index()['parent_enc'].values)
mfr_parent_map

# Hierarchical model
with pm.Model() as mdl_hier_pymc:

    # hyperpriors for the intercept across parents              # 1x
    b0_parent_mn = pm.Normal('b0_parent_mn', mu=0, sd=10)
    b0_parent_sd = pm.HalfCauchy('b0_parent_sd', beta=10)

    # per-parent intercepts drawn from the hyperpriors          # 20x
    b0_parent = pm.Normal('b0_parent', mu=b0_parent_mn,
                          sd=b0_parent_sd, shape=n_parent)
    b0_mfr_sd = pm.HalfCauchy('b0_mfr_sd', beta=10)

    # per-manufacturer intercepts, centred on their parent      # 38x
    b0 = pm.Normal('b0_mfr', mu=b0_parent[mfr_parent_map],
                   sd=b0_mfr_sd, shape=n_mfr)

    # pooled (non-hierarchical) feature coefficients
    b1 = pm.Normal('b1_fuel_type[T.petrol]', mu=0, sd=10)
    b2a = pm.Normal('b2a_trans[T.manual]', mu=0, sd=10)
    b2b = pm.Normal('b2b_trans[T.semiauto]', mu=0, sd=10)
    b3 = pm.Normal('b3_is_tdi[T.True]', mu=0, sd=10)
    b4 = pm.Normal('b4_engine_capacity', mu=0, sd=10)
    b5 = pm.Normal('b5_metric_combined', mu=0, sd=10)
    b6 = pm.Normal('b6_emissions_co_mgkm', mu=0, sd=10)

    # hierarchical linear model
    yest = (b0[dfs['mfr_enc']] +
            b1 * mx_ex['fuel_type[T.petrol]'] +
            b2a * mx_ex['trans[T.manual]'] +
            b2b * mx_ex['trans[T.semiauto]'] +
            b3 * mx_ex['is_tdi[T.True]'] +
            b4 * mx_ex['engine_capacity'] +
            b5 * mx_ex['metric_combined'] +
            b6 * mx_ex['emissions_co_mgkm'])

    # StudentT likelihood with fixed degrees of freedom nu
    epsilon = pm.HalfCauchy('epsilon', beta=10)
    likelihood = pm.StudentT('likelihood', nu=1, mu=yest,
                             sd=epsilon, observed=dfs[ft_endog])

    # sample
    trc_hier_pymc = pm.sample(2000, chains=1, step=pm.NUTS(),
                              # start=start_map,
                              trace=pm.backends.ndarray.NDArray('traces/trc_hier_pymc'))
```
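To sanity-check the parent-child map construction, here is the same `groupby` trick on a tiny toy frame (hypothetical encodings, not the real dataset):

```python
import pandas as pd

# Toy data: three manufacturers (mfr_enc 0..2) under two parents (parent_enc 0..1)
dfs_toy = pd.DataFrame({
    'mfr_enc':    [0, 0, 1, 2, 2, 2],
    'parent_enc': [1, 1, 0, 1, 1, 1],
})

# One row per (mfr, parent) pair, sorted by mfr_enc, so position i of the
# resulting array holds the parent index of manufacturer i
mfr_parent_map = (dfs_toy.groupby(['mfr_enc', 'parent_enc']).size()
                         .reset_index()['parent_enc'].values)

print(mfr_parent_map)  # [1 0 1]
```

This is what makes `b0_parent[mfr_parent_map]` in the model line up each manufacturer's intercept with its parent's intercept.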

The traces/results look like this:

Now I'm wondering: based on these results, how can I calculate the coefficients for a combination like (parent 1 - mfr 3 - b4_engine_capacity)?
I want to answer questions such as: how does engine capacity (feature) for mini (mfr) under bmw (parent) affect NOx emissions (target), and how does that compare to the engine capacity coefficient for audi (mfr) in the volkswagen (parent) group?
Should I sum up the feature + mfr + parent coefficients?
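To make the question concrete, here is a numpy-only sketch of how I currently read the model, using fake posterior draws in place of the real trace (hypothetical encodings and values): the `b4_engine_capacity` slope is pooled across manufacturers, so mini and audi would differ only through their varying intercepts `b0_mfr`.

```python
import numpy as np

rng = np.random.default_rng(42)

# Fake posterior draws standing in for trc_hier_pymc (hypothetical values;
# shapes match the model: 38 per-manufacturer intercepts, one shared slope)
n_draws, n_mfr = 1000, 38
trace = {
    'b0_mfr': rng.normal(0.0, 1.0, size=(n_draws, n_mfr)),
    'b4_engine_capacity': rng.normal(0.5, 0.1, size=n_draws),
}

mini_enc, audi_enc = 3, 0          # hypothetical label encodings

# Posterior-mean intercept per manufacturer; the slope is shared by both
b0_mini = trace['b0_mfr'][:, mini_enc].mean()
b0_audi = trace['b0_mfr'][:, audi_enc].mean()
b4 = trace['b4_engine_capacity'].mean()

# Predicted NOx contribution at a given engine capacity x: the two
# manufacturers differ only through their intercepts, so the slope cancels
x = 2.0
pred_mini = b0_mini + b4 * x
pred_audi = b0_audi + b4 * x
print(np.isclose(pred_mini - pred_audi, b0_mini - b0_audi))
```

If that reading is right, a per-manufacturer engine-capacity effect would require making `b4` itself hierarchical (e.g. `shape=n_mfr`), which is part of what I'm unsure about.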