Hi everyone,
I apologize if this sounds like a beginner question, but I could not find the answer by Googling or searching this forum.
Let us assume that I have a model whose posterior I have inferred using some training data, how do I infer the probability of observing a particular live data point.
For example, consider the following simple probabilistic model:
bernoulli_model = pm.Model()
with bernoulli_model:
p = pm.Normal('p', mu=0.5)
obs = pm.Bernoulli('obs', p=p, observed=[0,1,1,1,1])
How do I calculate the probability of observing a 0 using this model?
Hmmm, I dont think there is a very easy way to do it. What I would do is to build a new observed RV and evaluate on the points from the trace:
with bernoulli_model:
obs2 = pm.Bernoulli('obs2', p=p, observed=0) # input the new value
# I would like to know the posterior prob
prob = np.exp(np.asarray([[obs2.logp(point) for point in straces]
for _, straces in trace._straces.items()]))
Thank you for the response.
As far as I understand this provides me an array with the possible values.
Do you know if it is possible to summarize this into a single value?
Computing the mean and you have an estimation of the expected probability.
1 Like
After trying out the idea, it seems to work out fine.
However, unfortunately it seems to act very slowly (15 seconds) even when evaluating against 100 points for a simple linear regression model. Do you have an idea why this might be the case, and whether there is possibly a way to speed up?
Thanks again!
I think the problem is that evaluating obs2.logp
is quite slow. Maybe it is better to rewrite it as a numpy function:
from scipy.stats import bernoulli
obs2 = 0
prob = np.asarray([[bernoulli.pmf(obs2, point['p']) for point in straces]
for _, straces in trace._straces.items()])
1 Like
It seems that using numpy here works almost instantly .
1 Like