# Switching between class distributions

Hello,

I am new to probabilistic modeling/programming so please forgive me if I am using the wrong (or weird) terms.
For my research I am trying to model IR spectra by considering them as a summation of peaks, a baseline model and residual noise. The peak shapes and baseline model will evolve into a more complex model eventually, but I want to start simple and build up complexity as I go along. For now I have Gaussian peaks, a constant (y)-offset and normally distributed noise. Depending on the class to which the spectrum belongs the class y-offset is either from distribution A or B. Currently I am using generated data which looks like this:

The blue lines are from class A and red lines from class B.

The model looks something like:

``````npeaks = 3
nclasses = 2
with pm.Model() as model:
# priors
amp = pm.Uniform('amp', 0, x_s.max(), shape=(nclasses,npeaks))
mu = pm.Normal('mu', mu=np.linspace(x_val.min(), x_val.max(), npeaks), sd=50,
shape=(nclasses,npeaks), transform=pm.distributions.transforms.ordered)
sigma = pm.HalfNormal('sigma', sd=100, shape=(nclasses,npeaks))
a_ = [pm.Normal('a_%d' % i, mu=0, sd=10, shape=(len(x_s),1)) for i in range(nclasses)]

# spectral lines (per class)
y_ = [pm.Deterministic('y_%d' % i, (amp[i] * np.exp(-(x_val - mu[i])**2/(2 * sigma[i]**2))).sum(axis=1)
+ a_[i]) for i in range(nclasses)]

# labels (observed from the data, 0 or 1)
theta = pm.Beta('theta', 1., 1.)
label = pm.Bernoulli('label', p=theta, observed=Y)

# switch to spectrum class A/B depending on label
y_obs = pm.math.switch(pm.math.eq(label, 1), y_[0], y_[1])

sigma_e = pm.Gamma('sigma_e', alpha=1., beta=1.)
epsilon = pm.HalfNormal('epsilon', sd=sigma_e)

y_pred = pm.Normal('y_pred', mu=y_obs, sd=epsilon, observed=x_s)
``````

So, depending on the label in the training data (Y), the model switches to variable y_0 or y_1.

The problem I have with this basic model is this:

• I am now switching between 2 classes but also would like to switch between 4 classes, or N-classes for that matter. Normally I would use a switch case statement, but that is not available. How can I achieve this in the current context?

Any help is greatly appreciated!

Hello. This is definitely possible, and I think you need to utilize a Dirichlet prior to make this work. Look through these two resources for a primer, link1 and link2.

1 Like

I have looked at the two links, but I think the problems solved are different from the one I am trying to solve.

1. The number of classes is known and fixed (2 or 4 in my case)
2. Per spectrum I know the label Y (to which class the spectrum belongs)

In the end, the goal is to introduce shared variables for Y and the spectral observations (x_s) so that (after inference) I can:

1. set the variable Y=z and sample from the distribution z
2. set a (unseen) spectrum x_s[a] and sample from the posterior distribution label, so I can calculate to which class the spectrum belongs.

An ideal solution would be something like this:

``````# labels (observed from the data, 0 or 1)
theta = pm.Beta('theta', 1., 1.)
label = pm.Bernoulli('label', p=theta, observed=Y)

sigma_e = pm.Gamma('sigma_e', alpha=1., beta=1.)
epsilon = pm.HalfNormal('epsilon', sd=sigma_e)

y_pred = pm.Normal('y_pred', mu=y_[label], sd=epsilon, observed=x_s)
``````

In this way, whatever the label code is ([0|1] in this case, but [0|1|2|3] for 4 classes), but that gives the error:

```TypeError: list indices must be integers or slices, not ObservedRV```
1 Like