# Estimating probability of data point using inferred posterior

Hi everyone,

I apologize if this sounds like a beginner question, but I could not find the answer by Googling or searching this forum.

Let us assume that I have a model whose posterior I have inferred using some training data, how do I infer the probability of observing a particular live data point.

For example, consider the following simple probabilistic model:

``````bernoulli_model = pm.Model()
with bernoulli_model:
p = pm.Normal('p', mu=0.5)
obs = pm.Bernoulli('obs', p=p, observed=[0,1,1,1,1])
``````

How do I calculate the probability of observing a 0 using this model?

Hmmm, I dont think there is a very easy way to do it. What I would do is to build a new observed RV and evaluate on the points from the trace:

``````with bernoulli_model:
obs2 = pm.Bernoulli('obs2', p=p, observed=0) # input the new value
# I would like to know the posterior prob

prob = np.exp(np.asarray([[obs2.logp(point) for point in straces]
for _, straces in trace._straces.items()]))
``````

Thank you for the response.

As far as I understand this provides me an array with the possible values.
Do you know if it is possible to summarize this into a single value?

Computing the mean and you have an estimation of the expected probability.

1 Like

After trying out the idea, it seems to work out fine.

However, unfortunately it seems to act very slowly (15 seconds) even when evaluating against 100 points for a simple linear regression model. Do you have an idea why this might be the case, and whether there is possibly a way to speed up?

Thanks again!

I think the problem is that evaluating `obs2.logp` is quite slow. Maybe it is better to rewrite it as a numpy function:

``````from scipy.stats import bernoulli
obs2 = 0
prob = np.asarray([[bernoulli.pmf(obs2, point['p']) for point in straces]
for _, straces in trace._straces.items()])
``````
1 Like

I will try that, thanks! It seems that using numpy here works almost instantly .

1 Like