# Combine two distributions into a new one & scale parameter modifying probability depending on category

Hi,
I am kind of a newbie to PyMC and previously worked with Stan and Turing. My goal is to implement a model in PyMC that depends on many stimulus properties, where two distributions are combined into a new one, which then serves as the probability estimate for the number of responses. Depending on the category membership X (3 levels: -1, 0, 1), the response is ‘biased’ to be more or less probable. I’ve tried to implement this, so far unsuccessfully.

1. How can I access mu + sd of a distribution to create a combined distribution from them, like
`combined = pm.Normal('combined_distr', mu = distr1.mu+distr2.mu, sigma= distr1.sd+distr2.sd)`
I tried different variations, and also setting up the distributions to be combined with `pm.Normal.dist(mu, sd)`. Is there any possibility or built-in math function to manipulate two distributions (addition, multiplication, …)?

2. On this combined distribution I need a free level-scale parameter that modifies the point at which the log-probability is evaluated on the CDF, depending on category membership:
`pred = cdf(combined, level_scale * df.level + df.stimulus, observed = df.response)`
I tried to write a custom distribution function, but since I want to sample the scale parameter, I would need to implement some gradient search within this log-likelihood function as well, right?
I want to estimate only one scale parameter for the levels, not one for each category.

I would appreciate some ideas.

Are those individual distributions `distr1` and `distr2` sampled or just their means and stds?

Can you post the STAN model?

In the meantime you can have a look at this book port to pymc3, which includes examples of many cognitive models: pymc-resources/BCM at main · pymc-devs/pymc-resources · GitHub

Hi,
thank you for the answer and the links. I already had a look at many of them.
Currently I only have it implemented in Julia with the Turing library.

```julia
@model function full_model(x, y, bias_x, bias_y, levels,
                           n_suc, n_trials, variance)

    # `intercept` and `slope` are fixed estimates defined outside the model
    σ²_ff ~ truncated(Normal(0, 100), 0, Inf)
    pred_err_scale ~ Normal(0, 2)
    log_level_scale ~ Normal(0, 3)

    for i in eachindex(x)
        pred_bias = intercept + slope * bias_x[i]
        pred_stim = intercept + slope * x[i]
        prediction_error = bias_y[i] - pred_bias
        biased_prediction = pred_stim + pred_err_scale * prediction_error
        bias_distr = Normal(pred_bias, sqrt(variance))
        stim_distr = Normal(y[i], sqrt(σ²_ff))
        combined = combine_distributions(bias_distr, stim_distr)  # user-defined helper
        p_o = cdf(combined, exp(log_level_scale) * levels[i] + y[i])
        n_suc[i] ~ Binomial(n_trials[i], p_o)
    end
end
```

In PyMC I have something along the lines of

```python
with pm.Model() as full_model:

    x_bias = pm.Data('x_bias', df_.x_bias)
    y_bias = pm.Data('y_bias', df_.y_bias)
    y = pm.Data('y', df_.y)
    x = pm.Data('x', df_.x)
    level = pm.Data('level', df_.level)
    nr_suc = pm.Data('nr_suc', df_.nr_suc)
    nr = pm.Data('nr', df_.nr)

    intercept = pm.Data('intercept', hb_intercept)
    slope = pm.Data('slope', hb_slope)
    var = pm.Data('var', hb_var)

    sigma = pm.TruncatedNormal('sigma', mu=0, sigma=100, lower=0)
    pred_err_scale = pm.Normal('pred_err_scale', mu=0, sigma=2)
    log_level_scale = pm.Normal('log_level_scale', mu=0, sigma=3)

    pred_bias = intercept + slope * x_bias
    pred_stim = intercept + slope * x

    prediction_error = y_bias - pred_bias

    biased_prediction = pred_stim + pred_err_scale * prediction_error

    bias_distr = pm.Normal('bias_distr', mu=pred_bias, sigma=np.sqrt(var))
    stim_distr = pm.Normal('stim_distr', mu=y, sigma=sigma)
    combined = pm.Normal('combined', bias_distr + stim_distr)

    value = np.exp(log_level_scale) * level + y
    p = pm.logcdf(combined, value)

    n_os = pm.Binomial('respone', n=nr, p=p, observed=nr_suc)

    trace = pm.sample(draws=4000, chains=4, tune=1000, return_inferencedata=True)
```

but the model fails with an “Initial evaluation of model at starting point failed!” error because the log-probability of the responses evaluates to -inf. My assumption was that the new distribution is not correctly implemented, since this should not happen if p takes a value between 0 and 1 (which it should be limited to, due to the CDF, right?).

```
Initial evaluation results:
{'sigma': -4.42, 'pred_err_scale': -1.71, 'log_level_scale': -2.06, 'bias_distr': -59.35, 'stim_distr': -77.95, 'combined': -61.15, 'respone': -inf}
```

I have also tried it with `pm.NormalMixture`, and with wrapping a loop around it as in the initial example, but it gives me the error that element-wise calculation of logcdf is not possible, or that the parameters already exist.

This is not going to work, as PyMC does not yet provide functionality to combine distributions. However, since you are combining Normal distributions, it is somewhat easier:

```python
# bias_distr = pm.Normal('bias_distr', mu=pred_bias, sigma=np.sqrt(var))
# stim_distr = pm.Normal('stim_distr', mu=y, sigma=sigma)
# combined = pm.Normal('combined', bias_distr + stim_distr)
combined = pm.Normal('combined', mu=pred_bias + y, sigma=np.sqrt(var + sigma ** 2))
```
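As a quick sanity check (with illustrative numbers, not values from the model above): for independent normals the means add and the *variances* add, so the standard deviation of the sum is the square root of the summed variances, which is what the `np.sqrt(var + sigma ** 2)` term encodes:

```python
import numpy as np

rng = np.random.default_rng(42)
a = rng.normal(1.0, 2.0, size=200_000)   # X ~ N(1, 2**2)
b = rng.normal(3.0, 1.5, size=200_000)   # Y ~ N(3, 1.5**2)
s = a + b

# means add: 1 + 3 = 4; variances add: 4 + 2.25 = 6.25, so sd = 2.5
print(s.mean(), s.std())
```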

Alright, thank you! I tried it, but I am still stuck with the same error message. What I don’t understand is that in the initial evaluation results sigma is negative, even though I used the TruncatedNormal with 0 as the lower bound. How can this be?
Any ideas what the problem with ‘respone’: -inf could be?

I recommend you change sigma to a HalfNormal instead (since you truncate at 0 with mu=0, the two are equivalent)
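The equivalence is easy to verify numerically (using SciPy here just for illustration; note that `truncnorm` takes its bounds on the standardized scale):

```python
import numpy as np
from scipy.stats import halfnorm, truncnorm

s = 100.0  # the sigma used in the model
x = 5.0
# Normal(0, s) truncated below at 0: standardized lower bound a = (0 - 0) / s = 0
lp_truncated = truncnorm.logpdf(x, a=0.0, b=np.inf, loc=0.0, scale=s)
lp_half = halfnorm.logpdf(x, loc=0.0, scale=s)
# both densities equal 2 * phi(x / s) / s for x >= 0
print(lp_truncated, lp_half)
```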

Yes, true, that makes sense, thanks. Yet the problem remains:

```
Initial evaluation results:
{'sigma': -1.07, 'pred_err_scale': -1.66, 'log_level_scale': -2.02, 'combined': -263.89, 'respone': -inf}
```

Could it have something to do with the line `p = pm.logcdf(combined, value)`? That internally there is a problem with the estimation?

Did you perhaps intend:
`p = pm.math.exp(pm.logcdf(combined, value))`?

`logcdf` gives you a value in the range (-inf, 0], whereas p should be in the range [0, 1].


Sure, silly me. Thank you for the help! It now runs without technical problems, only some divergences. Another question:
Is there a possibility to reweight the combined distribution by the standard deviations? Something like

``combined = pm.Normal('combined', mu=((pred_bias*sqrt(var) + y*sigma) / (pred_bias+y)), sigma=np.sqrt(1 / ((1/var) + (1/sigma**2))))``

Yes. But you should use `pm.math.sqrt`.

`np.sqrt` and other NumPy functions sometimes work, but they are not as reliable when applied to PyMC objects.
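For comparison, the standard inverse-variance (precision) weighting of two normal estimates puts the precisions, not the standard deviations, on the means, and normalizes by the total precision; the `sigma` term in the snippet above already follows that convention. A numeric sketch with made-up values (`mu1`/`var1` standing in for `pred_bias`/`var`, and `mu2`/`var2` for `y`/`sigma**2`):

```python
import numpy as np

mu1, var1 = 2.0, 4.0   # e.g. the bias estimate and its variance
mu2, var2 = 6.0, 1.0   # e.g. the stimulus estimate and its variance
prec1, prec2 = 1.0 / var1, 1.0 / var2

# precision-weighted mean and sd of the combined normal
mu_comb = (mu1 * prec1 + mu2 * prec2) / (prec1 + prec2)
sd_comb = np.sqrt(1.0 / (prec1 + prec2))
print(mu_comb, sd_comb)  # the lower-variance estimate pulls the mean toward itself
```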
