Hi Boris!
Yeah, that’s the expected behavior: you need to give a vector of 8 probabilities (your p_conditional vector) to the likelihood, since you have 8 observations – that matches each data point with its corresponding category probability.
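In case it helps to see that mapping concretely, here’s a minimal NumPy sketch with made-up numbers (3 categories, 8 observations) – the names mirror yours, but the values are just placeholders:

import numpy as np

p_cat = np.array([0.2, 0.5, 0.9])                        # one probability per category (3 categories)
category_numerical = np.array([0, 2, 1, 0, 2, 2, 1, 0])  # category index of each of the 8 observations
p_conditional = p_cat[category_numerical]                # indexing expands this to a length-8 vector, one entry per observation
print(p_conditional)                                     # [0.2 0.9 0.5 0.2 0.9 0.9 0.5 0.2]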
But what you’re actually interested in is the underlying probability of each category (your p vector). So I think you’re almost there; you just need to tell PyMC to keep p in the trace (and you probably don’t need p_conditional in the trace at all):
with pm.Model() as model:
    # One success probability per category; as a free random variable, p_cat is stored in the trace
    p = pm.Beta('p_cat', alpha=1, beta=1, shape=n_categories)
    # Index p by each observation's category to get a per-observation probability for the likelihood
    label = pm.Bernoulli('label', p=p[category_numerical], observed=outcome)
This should give you what you’re looking for.
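And if it’s useful, here’s how I’d check that p_cat does show up in the trace – a self-contained sketch assuming PyMC v4+ and ArviZ, with made-up data standing in for your n_categories, category_numerical, and outcome:

import arviz as az
import numpy as np
import pymc as pm

# Made-up stand-ins for your data: 3 categories, 8 binary outcomes
n_categories = 3
category_numerical = np.array([0, 2, 1, 0, 2, 2, 1, 0])
outcome = np.array([1, 1, 0, 0, 1, 1, 0, 1])

with pm.Model() as model:
    p = pm.Beta('p_cat', alpha=1, beta=1, shape=n_categories)
    label = pm.Bernoulli('label', p=p[category_numerical], observed=outcome)
    idata = pm.sample()

# p_cat is in the posterior group of the returned InferenceData
print(az.summary(idata, var_names=['p_cat']))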
Just as an aside, note that the three category probabilities won’t sum to 1 here – each one is the probability of a positive outcome within its own category, with its own independent Beta prior, so nothing constrains them to a simplex. I know it’s not your question, just pointing that out.
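If you want to see that concretely, you can just draw independent Beta(1, 1) values with NumPy (no PyMC needed; the numbers are arbitrary):

import numpy as np

rng = np.random.default_rng(42)
draws = rng.beta(1, 1, size=(5, 3))  # 5 independent draws of 3 per-category probabilities
print(draws.sum(axis=1))             # row sums vary freely instead of being pinned to 1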
Hope that helps!