I have looked at the two links, but I think the problems they solve are different from the one I am trying to solve:
- The number of classes is known and fixed (2 or 4 in my case)
- Per spectrum I know the label Y (to which class the spectrum belongs)
In the end, the goal is to introduce shared variables for Y and the spectral observations (x_s) so that (after inference) I can:
- set the variable Y=z and sample spectra from the distribution of class z
- set an unseen spectrum x_s[a] and sample from the posterior distribution of the label, so I can determine which class the spectrum belongs to.
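To make the two goals concrete, here is a rough sketch in plain Python, without PyMC3. It assumes the per-class mean spectra and the noise level are already known, which is not my real situation: `CLASS_MEANS`, `NOISE_SD` and `PRIOR` are made-up placeholder values, and the function names are only for illustration.

```python
import math
import random

# Hypothetical values: 2 classes, 3 wavelengths per spectrum.
CLASS_MEANS = [[1.0, 2.0, 3.0], [3.0, 2.0, 1.0]]
NOISE_SD = 0.5          # assumed common noise level for all classes
PRIOR = [0.5, 0.5]      # assumed prior P(Y = z)


def sample_spectrum(z, rng):
    """Goal 1: fix Y = z and draw a spectrum from that class."""
    return [rng.gauss(m, NOISE_SD) for m in CLASS_MEANS[z]]


def label_posterior(spectrum):
    """Goal 2: P(Y = z | spectrum) via Bayes' rule with Gaussian likelihoods."""
    log_post = []
    for z, means in enumerate(CLASS_MEANS):
        # The Gaussian normalising constant is the same for every class
        # (shared NOISE_SD), so it cancels and is left out here.
        log_lik = sum(-0.5 * ((x - m) / NOISE_SD) ** 2
                      for x, m in zip(spectrum, means))
        log_post.append(math.log(PRIOR[z]) + log_lik)
    # Normalise in log space for numerical stability.
    top = max(log_post)
    weights = [math.exp(lp - top) for lp in log_post]
    total = sum(weights)
    return [w / total for w in weights]


rng = random.Random(0)
x_new = sample_spectrum(1, rng)   # a spectrum drawn with Y = 1
probs = label_posterior(x_new)    # class probabilities for that spectrum
```

In the real model the means and the noise level would be inferred rather than fixed, so both steps would be draws from the posterior instead of these closed-form calculations.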
An ideal solution would be something like this:
# labels (observed from the data, 0 or 1)
theta = pm.Beta('theta', 1., 1.)
label = pm.Bernoulli('label', p=theta, observed=Y)
sigma_e = pm.Gamma('sigma_e', alpha=1., beta=1.)
epsilon = pm.HalfNormal('epsilon', sd=sigma_e)
y_pred = pm.Normal('y_pred', mu=y_[label], sd=epsilon, observed=x_s)
In this way, whatever the label code is (0 or 1 in this case, 0 to 3 for 4 classes), the label would select the corresponding entry of y_. However, that gives the error:
`TypeError: list indices must be integers or slices, not ObservedRV`
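As far as I can tell, the error is raised by Python's own list indexing rather than by the sampler: `y_` is a plain list, and `label` is a symbolic variable, not an integer. A minimal sketch that reproduces the same message without PyMC3 (the `ObservedRV` class below is a hypothetical stand-in, not the real PyMC3 class, and the values in `y_` are placeholders):

```python
class ObservedRV:
    """Hypothetical stand-in for a symbolic variable; not the PyMC3 class."""


y_ = [0.0, 1.0]        # plain Python list with placeholder values
label = ObservedRV()   # an object that is neither an int nor a slice

try:
    y_[label]          # same kind of indexing as mu=y_[label] in the model
    message = None
except TypeError as err:
    message = str(err)

print(message)
```

So presumably `y_` would have to be a tensor that supports symbolic indexing (for example one created with `theano.tensor.as_tensor_variable`) instead of a list, but I have not verified that this gives the model I am after.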