It was mentioned in the post below that I am inefficiently building the linear-equation part of the model: Access probability p parameter from pm.BetaBinomial distribution during sample_ppc

The post above contains the code used to construct the linear-equation part, but I'll also provide it below:

```
import pymc3 as pm

with pm.Model() as model:
    intercept = pm.Normal('intercept', mu=0.0, sd=5, shape=1)
    link_argument = intercept
    for covariate in covariates:
        # Categorical covariates get one coefficient per level;
        # metric/binary covariates get a single scalar coefficient.
        if covariates[covariate]['type'] == 'categorical':
            shape = covariates[covariate]['encoder'].classes_.size
        elif covariates[covariate]['type'] in ['metric', 'binary']:
            shape = 1
        # Non-centered parameterization of the coefficient prior
        sigma = pm.HalfCauchy(f'{covariate}_coeff_sigma', beta=5)
        offset = pm.Normal(f'{covariate}_coeff_offset', mu=0, sd=1, shape=shape)
        coeff = pm.Deterministic(f'{covariate}_coeff', 0.0 + offset * sigma)
        if shape > 1:
            # Categorical: index into the coefficient vector by level code
            link_argument += coeff[model_variables[covariate]]
        else:
            link_argument += coeff * model_variables[covariate]
    omega = pm.Deterministic('omega', pm.invlogit(link_argument))
    kappa = pm.Exponential('kappa', lam=1e-4)
    alpha = pm.Deterministic('alpha', omega * kappa + 1)
    beta = pm.Deterministic('beta', (1 - omega) * kappa + 1)
    likelihood = pm.BetaBinomial(
        'likelihood', alpha=alpha, beta=beta,
        n=model_variables['n'], observed=model_variables['y_obs']
    )
```

I have two questions as I start to productionize this model:

1.) What is a more efficient way to build this model, instead of the for loop, that still accounts for the categorical variables?

2.) Does the inefficient construction of the model affect the Theano computations that take place when sampling the posterior distribution?

If the inefficiencies only affect the building of the model itself, that is less of a concern to me.
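For context on question 1, here is a minimal NumPy-only sketch (hypothetical variable names, not my actual data) of the kind of design-matrix approach I suspect is the alternative: one-hot encode the categorical covariates, stack everything into a single matrix `X`, and replace the loop with one coefficient vector and one matrix multiply.

```python
import numpy as np

# Hypothetical data: one categorical covariate with 3 levels, one metric covariate.
rng = np.random.default_rng(0)
n_obs = 5
cat_codes = rng.integers(0, 3, size=n_obs)   # integer-encoded categorical levels
metric = rng.normal(size=n_obs)              # metric covariate

# One-hot encode the categorical covariate, then build a single design matrix:
# [intercept column | one-hot level columns | metric column].
one_hot = np.eye(3)[cat_codes]               # shape (n_obs, 3)
X = np.column_stack([np.ones(n_obs), one_hot, metric])

# A single coefficient vector replaces the per-covariate loop; in the PyMC3
# model this would presumably be one pm.Normal(..., shape=X.shape[1]) and the
# whole linear predictor becomes a single dot product.
coeff = rng.normal(size=X.shape[1])
link_argument = X @ coeff                    # shape (n_obs,)
```

The indexing trick `coeff[model_variables[covariate]]` in my current code and the one-hot matrix multiply should give the same linear predictor; the design-matrix version just collapses everything into one vectorized operation.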

Thanks for the help!