# Exclude group inference for single parameter

Hey everybody,

I’m trying to model the following data (synthetically created) with a simple hierarchical regression model.

Here `p=0` means that no value of `p` is given for that particular `y` (the value 0 is my own placeholder and could be anything else). However, the observations where `p=0` do carry information, as they show a significantly lower value of the dependent variable.

The model I came up with so far is as follows:

```python
import pymc3 as pm

with pm.Model() as hierarchical_model:
    # Priors on parameters
    alpha = pm.Normal('alpha', mu=0.1, sd=2, shape=2)
    beta = pm.Normal('beta', mu=0, sd=2, shape=2)

    # Estimate of y
    clr_est = alpha[data_all.group] + beta[data_all.group] * data_all.p

    # Model error
    eps = pm.HalfCauchy('eps', 5)

    # Data likelihood
    like = pm.Normal('like', mu=clr_est, sd=eps, observed=data_all.clr)
```

The inference I get looks quite normal, except that the slope parameter `beta` is also estimated in the group where `p=0`. As a side effect, this seems to slow the sampler’s convergence.

## Problem description

In the group of observations where `p=0`, the slope parameter carries no useful information, and I’d rather omit it from the model. However, I’m stuck writing down a proper model for this case, since the parameter still needs to be estimated in the other group.

I would really appreciate any advice!

Best
Thies_DS

Is it possible to get rid of the data with `p = 0` before letting PyMC fit the data?
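That would be a plain pandas filter (a minimal sketch with made-up numbers; `p` and `clr` are the column names from the question). Note that this throws away the fact that `p` is missing, which the question says is itself informative:

```python
import pandas as pd

# Hypothetical data frame shaped like the one in the question:
# a predictor p (0 = missing placeholder) and a response clr.
data_all = pd.DataFrame({
    'p':   [0.0, 1.2, 0.0, 2.5, 3.1],
    'clr': [-1.0, 0.4, -0.9, 1.1, 1.5],
})

# Keep only the rows where p is actually observed.
data_obs = data_all[data_all.p != 0]
print(len(data_obs))  # 3 rows remain
```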

Why not just drop the beta term for that particular group?

```python
import numpy as np
import pymc3 as pm

with pm.Model() as mod:
    offset = pm.Normal('offset', 0., 1.)
    offset_missing = pm.Normal('offset_mis', 0., 1.)
    beta = pm.Normal('beta', 0., 0.5)
    err = pm.HalfNormal('error', 0.5)
    err_missing = pm.HalfNormal('error_uk', 0.5)
    # Mean for the rows where p is observed
    y_obs = pm.Deterministic('y_obs_', offset + beta * data.p.values[np.isfinite(data.p.values)])
    # Likelihood for the observed-p rows
    y_lik = pm.Normal('y_lik_', y_obs, err, observed=data.y.values[np.isfinite(data.p.values)])
    # Missing-p rows get their own offset and error, with no slope term
    mis_lik = pm.Normal('mis_lik_', offset_missing, err_missing, observed=data.y.values[np.isnan(data.p.values)])
    tr = pm.sample(1000, tune=750, chains=6, cores=2)
```
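The indexing trick above just splits the rows by whether `p` is finite; in plain NumPy (with made-up numbers) it looks like:

```python
import numpy as np

p = np.array([1.2, np.nan, 2.5, np.nan, 3.1])
y = np.array([0.4, -1.0, 1.1, -0.9, 1.5])

obs_mask = np.isfinite(p)

y_with_p = y[obs_mask]    # rows whose mean is offset + beta * p
y_missing = y[~obs_mask]  # rows modeled by offset_missing alone

print(y_with_p)   # [0.4 1.1 1.5]
print(y_missing)  # [-1.  -0.9]
```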

Hey @chartl,

thanks for the suggestion. Can you elaborate a little bit more on your approach? I cannot reproduce your results with my data.

Sorry for the delay; I should have added it to my first post:
synthetic_dataset1.csv (24.2 KB)

Switch on `group` rather than missing:

```python
import pymc3 as pm

with pm.Model() as mod:
    o_scale = pm.HalfNormal('offset_scale', 0.1)
    offset = pm.Normal('offset', 0, o_scale)
    offset_missing = pm.Normal('offset_mis', 0, o_scale)
    be_scale = pm.HalfNormal('be_scale', 0.1)
    beta = pm.Normal('beta', 0, be_scale)
    err = pm.HalfNormal('error', 0.25)
    err_missing = pm.HalfNormal('error_uk', 0.25)
    y_obs = pm.Deterministic('y_obs_', offset + beta * data.p.values[data.group.values == 0])
    y_lik = pm.Normal('y_lik_', y_obs, err, observed=data.clr.values[data.group.values == 0])
    mis_lik = pm.Normal('mis_lik_', offset_missing, err_missing, observed=data.clr.values[data.group.values == 1])
    tr = pm.sample(1000, tune=750, chains=6, cores=2)  # sampling call as in the earlier snippet

def mytrace(trace):
    # Skip bookkeeping variables whose names end in '_'
    vn = [x for x in trace.varnames if x[-1] != '_']
    pm.traceplot(trace, vn)

mytrace(tr)
```


That basically solves my problem, many thanks!

Two last questions:

1. Why does `y_obs` show up as a parameter even though it is wrapped in `pm.Deterministic`?
2. Is this inference model scalable to large datasets?

Best

> 1. Why is `y_obs` modeled as a parameter although you wrote it in `pm.Deterministic`?

`pm.Deterministic` doesn’t create a new free parameter; it just records a transformation of existing variables so that it appears in the trace alongside the sampled parameters. It probably would be better called `y_pred` or `mu_y`.

> 2. Is this inference model scalable to large datasets?

Sure, but it would be clunky to implement in this way if you had several informatively-missing variables. For the more general case, I might write a helper function to generate a coefficient vector like so:

```python
import theano.tensor as tt
import pymc3 as pm

def make_coef(dat, missing_idx):
    # Start from a vector of ones the same shape as the data...
    coef = tt.ones_like(dat)
    # ...and zero out the entries where the variable is missing,
    # so the slope drops out of the regression for those rows.
    coef = tt.set_subtensor(coef[missing_idx], 0)
    beta = pm.Normal('beta', 0., 1.)
    return beta * coef
```

And then stack these.
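The same masking-and-stacking idea in plain NumPy, with hypothetical data and missingness patterns (the Theano version above does this symbolically inside the model):

```python
import numpy as np

def make_coef_np(dat, missing_idx, beta):
    # One coefficient per observation: beta where the variable is
    # observed, 0 where it is missing.
    coef = np.ones_like(dat)
    coef[missing_idx] = 0.0
    return beta * coef

p1 = np.array([1.0, 2.0, 3.0, 4.0])
p2 = np.array([0.5, 1.5, 2.5, 3.5])

# Per-variable coefficient vectors, zeroed where each variable is missing
c1 = make_coef_np(p1, np.array([1]), beta=2.0)      # row 1 is missing p1
c2 = make_coef_np(p2, np.array([0, 3]), beta=-1.0)  # rows 0 and 3 are missing p2

# "Stack" the contributions into a single linear predictor
mu = c1 * p1 + c2 * p2
print(mu)  # [ 2.  -1.5  3.5  8. ]
```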