Estimating probability of data point using inferred posterior

ahmadsalim · March 27, 2018, 5:11pm

Hi everyone,

I apologize if this sounds like a beginner question, but I could not find the answer by Googling or searching this forum.

Suppose I have a model whose posterior I have inferred from some training data. How do I compute the probability of observing a particular new (live) data point?

For example, consider the following simple probabilistic model:

import pymc3 as pm

bernoulli_model = pm.Model()
with bernoulli_model:
  p = pm.Normal('p', mu=0.5)
  obs = pm.Bernoulli('obs', p=p, observed=[0,1,1,1,1])

How do I calculate the probability of observing a 0 using this model?

junpenglao · March 27, 2018, 8:06pm

Hmmm, I don't think there is a very easy way to do it. What I would do is build a new observed RV and evaluate its logp at the points from the trace:

import numpy as np

with bernoulli_model:
    # New observed RV fixed at the value whose posterior probability we want
    obs2 = pm.Bernoulli('obs2', p=p, observed=0)

# exp(logp) at every sampled point, for every chain
prob = np.exp(np.asarray([[obs2.logp(point) for point in straces]
                          for _, straces in trace._straces.items()]))

ahmadsalim · March 28, 2018, 7:42am

Thank you for the response.

As far as I understand, this gives me an array of the possible values.
Do you know if it is possible to summarize this into a single value?

junpenglao · March 28, 2018, 8:11am

Compute the mean, and you have an estimate of the expected probability.
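To make the averaging step concrete, here is a minimal sketch in plain numpy. It does not use the model from the question: `p_samples` is a hypothetical stand-in for the posterior draws of `p`, generated under the assumption of a flat Beta(1, 1) prior (not the Normal prior above), for which the data [0, 1, 1, 1, 1] give a Beta(5, 2) posterior. The standard error shown also assumes independent draws, which MCMC samples generally are not, so treat it as optimistic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the posterior draws of p from a trace.
# Assumption: flat Beta(1, 1) prior, so data [0, 1, 1, 1, 1] give Beta(5, 2).
p_samples = rng.beta(5, 2, size=20_000)

# Probability of observing a 0 under each individual posterior draw
prob_zero_per_draw = 1.0 - p_samples

# Averaging over draws estimates the posterior predictive probability of a 0
prob_zero = prob_zero_per_draw.mean()

# Naive Monte Carlo standard error (assumes independent draws)
mcse = prob_zero_per_draw.std(ddof=1) / np.sqrt(p_samples.size)

print(f"P(obs = 0 | data) ~ {prob_zero:.3f} +/- {mcse:.3f}")
```

Under these assumptions the exact answer is 2/7, so the estimate should land close to that value.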

ahmadsalim · April 25, 2018, 8:32am

After trying out the idea, it seems to work out fine.

However, it unfortunately runs very slowly (15 seconds), even when evaluating against only 100 points for a simple linear regression model. Do you have any idea why this might be the case, and whether there is a way to speed it up?

Thanks again!

junpenglao · April 25, 2018, 8:41am

I think the problem is that evaluating obs2.logp is quite slow. Maybe it is better to rewrite it as a numpy function:

import numpy as np
from scipy.stats import bernoulli

obs2 = 0  # the new value to evaluate
prob = np.asarray([[bernoulli.pmf(obs2, point['p']) for point in straces]
                   for _, straces in trace._straces.items()])
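If the draws are already available as an array, the Python loop can probably be dropped altogether. The sketch below is only an illustration: `bernoulli_pmf` is a hypothetical helper, and `p_draws` is synthetic data standing in for the sampled values of `p` arranged as chains by draws (with a real trace, something like `trace['p']` would play that role, which is an assumption about your setup).

```python
import numpy as np


def bernoulli_pmf(x, p):
    """Hypothetical helper: Bernoulli mass, p when x == 1 and 1 - p when x == 0."""
    p = np.asarray(p, dtype=float)
    return np.where(x == 1, p, 1.0 - p)


rng = np.random.default_rng(1)

# Synthetic stand-in for the posterior draws of p: 4 chains x 1000 draws
p_draws = rng.beta(5, 2, size=(4, 1000))

new_obs = 0  # the new value to evaluate

# Loop version, mirroring the per-point evaluation above
loop_probs = np.asarray([[float(bernoulli_pmf(new_obs, p)) for p in chain]
                         for chain in p_draws])

# Vectorized version: one call over the whole array
vec_probs = bernoulli_pmf(new_obs, p_draws)

per_chain = vec_probs.mean(axis=1)  # one estimate per chain
overall = vec_probs.mean()          # single summary value

print(per_chain, overall)
```

Both versions give the same array; the vectorized one simply avoids calling a function once per sample.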

ahmadsalim · April 25, 2018, 12:19pm

I will try that, thanks!

ahmadsalim · April 30, 2018, 6:55am

It seems that using numpy here works almost instantly.
