Is there a convenient / idiomatic way to evaluate logposterior (bonus: logprior/likelihood) at sample points?

(I also asked the same here)

There is an internal function for evaluating logp:

Regarding slowness of logp evaluation, there is a discussion here (with the solution and explanation):

https://github.com/pymc-devs/pymc3/issues/2480

Computing out-of-sample deviance

Thanks for the reply.

I ended up using this construct:

`lnp = np.array([mvg_model.logp(trace.point(i, chain=c)) for c in trace.chains for i in range(len(trace))])`

For a trace of two chains, 2000 iterations each:

```
>>> lnp.shape
(4000,)
```

It is quite fast; the slowness was caused by the fact that in the previous variant I was calling the logp method of the observed RV instead of the model's.

You might want to know that the function you suggested returns something, but of weird shape and values:

`lnp_strange = pm.stats._log_post_trace(trace, mvg_model)`

```
>>> lnp_strange.shape
(2000, 2222)
```

Here 2000 is the length of a single chain, and 2222 is the number of observations.
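Incidentally, that `(draws, observations)` layout is exactly what WAIC/LOO-style computations consume. A minimal NumPy sketch of collapsing it to one value per draw (random numbers stand in for real log-likelihoods; nothing here is PyMC3 API):

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical pointwise log-likelihood matrix: one row per posterior draw,
# one column per observation (2000 draws x 2222 observations, as above).
n_draws, n_obs = 2000, 2222
pointwise_logp = rng.normal(loc=-1.5, scale=0.1, size=(n_draws, n_obs))

# Summing over observations collapses it to one total observed-data
# logp per draw.
per_draw_logp = pointwise_logp.sum(axis=1)
```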

Here’s a gist https://gist.github.com/madanh/8bb0cff8e83c1be99ebd4147c866e218

Interestingly enough, the sum of those 2222 values is offset from the value returned by `logp()` (for the highest chain number) by an approximately constant amount. Hmmm

```
>>> lnp_strange.sum(axis=1) - lnp[2000:]
array([ 2.77298357,  2.7728608 ,  2.77341143, ...,  2.77374797,
        2.77308282,  2.77292939])
```

oh yes `_log_post_trace` returns the logp for each observation

FYI you can get much faster logp evaluation by doing:

```
logp = mvg_model.logp
lnp = np.array([logp(trace.point(i,chain=c)) for c in trace.chains for i in range(len(trace))])
```

Sorry for too many questions, but do you have an idea why they don't add up to the value returned by `logp`? I have been looking into the code for the last half an hour but haven't figured this out yet.

`_log_post_trace` returns the logp of the observations, given the values in the trace. `logp` computes the posterior density (up to a constant, the evidence) of *all* variables, observed or not. So if you have any free variables, their posterior density will be in `model.logp`, but not in `_log_post_trace`.
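A toy NumPy/SciPy sketch of that decomposition (a single Normal mean with a Normal prior; all names are illustrative, nothing here is PyMC3 API): the per-observation terms play the role of one row of `_log_post_trace`, and adding the free variable's log-prior recovers the model logp, up to the evidence (and, in PyMC3, up to any transform Jacobians):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(42)
data = rng.normal(loc=1.0, scale=1.0, size=50)  # observed X
mu = 0.8                                        # one draw of the free variable

# Per-observation log-likelihoods: the analogue of one row of _log_post_trace.
per_obs = stats.norm(loc=mu, scale=1.0).logpdf(data)

# Log-prior of the free variable: present in model.logp,
# absent from _log_post_trace.
log_prior = stats.norm(loc=0.0, scale=10.0).logpdf(mu)

# Unnormalized log-posterior, as model.logp would report (up to log P(X)).
total_logp = per_obs.sum() + log_prior
```

So the gap between `total_logp` and `per_obs.sum()` is exactly the free variable's log-prior term, which is why the offset in the thread above looks roughly constant across draws.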

Uhmmm, could not wrap my head around that answer. So the posterior of the unobserved parameters, let's call them `a`, is

P(a|X) = P(X|a) * P(a) / P(X)

where X is the set of observations. If the observations are independent, then we can write

P(a|X) = \prod_i P(x_i|a) * P(a) / P(X)

So is it correct that `_log_post_trace` returns P(x_i|a), i.e. the factors of the likelihood, while `logp` returns the entire numerator? Or is something else the case?
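That factorization can be checked numerically on a discretized toy model (all names illustrative; a grid prior over `a` and independent Normal observations):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
x = rng.normal(loc=0.5, scale=1.0, size=20)   # observations X
a_grid = np.linspace(-3, 3, 601)              # discretized parameter a
da = a_grid[1] - a_grid[0]

prior = stats.norm(0, 1).pdf(a_grid)          # P(a)

# P(X|a) = prod_i P(x_i|a), computed in log space for numerical stability.
log_lik = stats.norm(a_grid[:, None], 1.0).logpdf(x).sum(axis=1)
lik = np.exp(log_lik)

evidence = np.sum(lik * prior) * da           # P(X) by quadrature
posterior = lik * prior / evidence            # P(a|X)
```

With the evidence computed by the same quadrature, the resulting posterior integrates to one, confirming that the product of the per-observation likelihood terms times the prior is the full (unnormalized) posterior.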

It did help. I see that I was sloppy in my previous post talking about probabilities (without log). Thanks.
