Is there a convenient / idiomatic way to evaluate logposterior (bonus: logprior/likelihood) at sample points?
(I also asked the same here)
There is an internal function for evaluating logp:
https://github.com/pymc-devs/pymc3/blob/master/pymc3/stats.py#L129
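For example (a minimal sketch; this is an internal function, so the exact signature may change between releases):

import pymc3 as pm

# Pointwise log-likelihood of the observations for each sample in the
# trace; returns an array of shape (n_samples, n_observations).
lnp = pm.stats._log_post_trace(trace, model)

where trace and model stand for your MultiTrace and Model objects.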
Regarding the slowness of logp evaluation, there is a discussion here (with the solution and explanation):
Thanks for the reply.
I ended up using a construct like this:
lnp = np.array([mvg_model.logp(trace.point(i, chain=c)) for c in trace.chains for i in range(len(trace))])
For a trace of two chains, 2000 iterations each:
>>> lnp.shape
(4000,)
It is quite fast; the slowness was caused by the fact that in the previous variant I was calling the logp method of the observed RV instead of the model’s.
You might want to know that the function you suggested returns something, but of weird shape and values:
>>> lnp_strange = pm.stats._log_post_trace(trace, mvg_model)
>>> lnp_strange.shape
(2000, 2222)
where 2000 is the length of a single chain and 2222 is the number of observations.
Here’s a gist: https://gist.github.com/madanh/8bb0cff8e83c1be99ebd4147c866e218
Interestingly enough, the sum of those 2222 values is offset from the value returned by logp() (for the highest chain number) by an approximately constant amount. Hmmm
>>> lnp_strange.sum(axis=1) - lnp[2000:]
array([ 2.77298357,  2.7728608 ,  2.77341143, ...,  2.77374797,
        2.77308282,  2.77292939])
Oh yes, _log_post_trace returns the logp for each observation.
FYI you can get much faster logp evaluation by doing:
# Bind the compiled logp function once, outside the loop:
logp = mvg_model.logp
lnp = np.array([logp(trace.point(i, chain=c)) for c in trace.chains for i in range(len(trace))])
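(If I remember the source right, Model.logp is a property that compiles a new function each time you access it, so binding it once outside the comprehension avoids recompiling at every iteration.)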
Sorry for so many questions, but do you have an idea why they don’t add up to the value returned by logp? I have been looking into the code for the last half an hour but haven’t figured this out yet.
_log_post_trace returns the logp of the observations, given the values in the trace. logp computes the posterior density (up to a constant, the evidence) of all variables, observed or not. So if you have any free variables, their posterior density will be in model.logp, but not in _log_post_trace.
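If you want to check numerically, something like this should do it (an untested sketch; I’m assuming the free RVs expose a compiled logp just like the model does, via model.free_RVs):

point = trace.point(0, chain=trace.chains[-1])

# Joint log-density (up to the evidence) of all variables at this point:
total = mvg_model.logp(point)

# Summed per-observation log-likelihood terms for the same sample:
obs = pm.stats._log_post_trace(trace, mvg_model)[0].sum()

# The leftover should be the free variables' own log-density terms:
free = sum(var.logp(point) for var in mvg_model.free_RVs)
print(total - obs, free)  # these two should agree

That leftover should be your roughly constant ~2.773 offset.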
Uhmmm, I could not wrap my head around that answer. So the posterior of the unobserved parameters, let’s call them a, is
P(a|X) = P(X|a) * P(a) / P(X)
where X is the set of observations. If the observations are independent, then we can write
P(a|X) = \prod_i P(x_i|a) * P(a) / P(X)
So is it correct that _log_post_trace returns the P(x_i|a), i.e. the multiplicands of the likelihood, while logp returns the entire numerator? Or is something else the case?
I hope this helps:
http://mathb.in/151684
I wrote it in mathb.in to get LaTeX support…
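In short, in log form with independent observations:
\log P(a|X) = \sum_i \log P(x_i|a) + \log P(a) - \log P(X)
logp returns the log of your entire numerator (the first two terms), _log_post_trace returns the individual \log P(x_i|a) terms, and the constant \log P(X) appears in neither.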
It did help. I see that I was sloppy in my previous post, talking about probabilities (without the log). Thanks.