Custom log likelihood and LOO

Hi there,

If I have a model with a custom likelihood function defined with a pm.Potential combining some distributions with logp values, the returned trace seems to miss the log_likelihood attribute and az.loo complains. Can the solution be to make the likelihood values available as a pm.Deterministic and make it an attribute?

Cheers,
MV


Hi,

It is possible to use any log likelihood array to compute loo; it doesn’t need to be retrieved automatically when converting to InferenceData. In some cases getting pointwise log likelihood values automatically is not possible. In others it is possible, but it is then not clear how to compute loo from them.

Luckily, az.loo takes its data from the log_likelihood group of the input InferenceData and uses it to estimate elpd_loo. You should therefore be able to add a deterministic (which will be stored in the posterior group) and then “move” it to the log_likelihood group. Another option is to compute the values by hand afterwards with numpy and xarray (not too different from computing them by hand in a deterministic). There is an example using hand-generated log likelihood data in “Refitting PyMC3 models with ArviZ (and xarray)” in the ArviZ dev documentation.
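As a minimal sketch of the “compute it by hand” route, assume a toy model in which each observation is normal with mean mu and a known sigma. The model, the fake posterior draws and every name below are made up for illustration and are not from the original question. The point is only the layout of the result: one log likelihood value per chain, draw and observation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data and stand-in posterior draws (a real trace would supply these).
y = rng.normal(loc=1.0, scale=2.0, size=20)                # observations, shape (20,)
mu_draws = rng.normal(loc=1.0, scale=0.3, size=(4, 500))   # shape (chain, draw)
sigma = 2.0                                                # assumed known


def normal_logpdf(x, mu, sigma):
    """Elementwise log density of Normal(mu, sigma) evaluated at x."""
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mu) / sigma) ** 2


# Broadcast draws against observations: one value per (chain, draw, observation).
log_lik = normal_logpdf(y[None, None, :], mu_draws[:, :, None], sigma)

print(log_lik.shape)  # (4, 500, 20)
```

An array with this layout is what belongs in the log_likelihood group. In the ArviZ releases I have used, something like az.from_dict(posterior={"mu": mu_draws}, log_likelihood={"y": log_lik}) builds such an InferenceData, but check the signature for your installed version.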

I would like to note, however, that defining or choosing your pointwise log likelihood values may not be obvious. There can even be multiple correct ways of doing so, differing only in the question they answer. Take a look at this notebook for example.


Thanks for your reply and for the links, that’s exactly what I needed to go forward!

Sorry to bring an old topic back up, but my question is directly related to this discussion. When I combine my likelihoods into a single potential and run az.loo, I get a warning:

UserWarning: The point-wise LOO is the same with the sum LOO, please double check the Observed RV in your model to make sure it returns element-wise logp

I’m pretty confident I combine my likelihoods well and that the combination is well adapted to the problem I’m trying to solve. My question is whether there is any intrinsic difference between running the LOO diagnostic with the observed data as an array and running it after combining all the likelihoods into one. In other words, is there any fundamental issue with using a single likelihood for the LOO diagnostic?

Cheers,
Vian

Yes, when it comes to loo.

az.loo approximates the ELPD (expected log pointwise predictive density). The ELPD is a measure of how well the model will predict unseen data, which can then be used for model comparison. To estimate the predictive accuracy on unseen data, it uses leave-one-out cross-validation (loo-cv): exclude an observation from the fit and check how well the model predicts the excluded observation. That is, we calculate p(y_{i}|y_{-i}), the probability of the excluded observation y_{i} conditioned on all other observations y_{-i}. az.loo uses Pareto smoothed importance sampling (PSIS) to approximate p(y_{i}|y_{-i}) without the need for any refit. Therefore, it can’t work without having the pointwise log likelihood available.
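To make the role of the pointwise values concrete, here is a rough sketch using plain importance sampling, not the Pareto smoothed version that az.loo actually uses. The toy normal model, the stand-in posterior draws and the helper names are all assumptions of mine. Each observation gets weights 1 / p(y_{i}|theta_s), so the estimate of p(y_{i}|y_{-i}) is the harmonic mean over draws of p(y_{i}|theta_s).

```python
import numpy as np


def logsumexp(a, axis):
    """Numerically stable log(sum(exp(a))) along one axis."""
    a_max = np.max(a, axis=axis, keepdims=True)
    return np.squeeze(a_max, axis=axis) + np.log(np.sum(np.exp(a - a_max), axis=axis))


def is_loo_pointwise(log_lik):
    """Plain importance-sampling LOO estimate, one value per column.

    log_lik has shape (n_samples, n_obs). Returns log of the harmonic mean
    of the likelihood over samples, computed on the log scale.
    """
    n_samples = log_lik.shape[0]
    return np.log(n_samples) - logsumexp(-log_lik, axis=0)


rng = np.random.default_rng(1)
y = rng.normal(0.0, 1.0, size=10)        # hypothetical observations
mu = rng.normal(0.0, 0.3, size=2000)     # stand-in posterior draws of the mean
log_lik = -0.5 * np.log(2 * np.pi) - 0.5 * (y[None, :] - mu[:, None]) ** 2

# Pointwise input: one LOO term per observation.
elpd_i = is_loo_pointwise(log_lik)
# Summed input: a single column, so there is only one "observation" to leave out.
elpd_summed = is_loo_pointwise(log_lik.sum(axis=1, keepdims=True))

print(elpd_i.shape, elpd_summed.shape)  # (10,) (1,)
```

With a single summed column, the only thing that can be left out is the whole dataset, so the pointwise result and its sum are trivially the same number. That is the situation the UserWarning above describes.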

No, when it comes to fitting.

Sampling uses only the total unnormalized log probability of the model and its gradient. It doesn’t even distinguish between the contributions of the prior and of the likelihood. From what you say, I guess you have a complicated log likelihood term that you compute somehow (it might not be a pointwise log likelihood at any point), and you then use pm.Potential to add it to the model logp as well as storing it for later use with az.loo. As I said, the model only uses the total logp, so your posterior will be the same: only the scalar added to the model logp is taken into account, and the storage of deterministics and arbitrary variables has no effect whatsoever on the fit. But even if you have the right posterior, you can’t use az.loo without also having the right pointwise log likelihood values.
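A small check of that claim, again on a made-up normal model with unit variance (all names here are hypothetical): the likelihood written as a vector of pointwise terms and the same likelihood written directly as one scalar contribute exactly the same amount to the total logp, which is all the sampler ever sees.

```python
import numpy as np

rng = np.random.default_rng(2)
y = rng.normal(0.5, 1.0, size=50)  # hypothetical observations


def pointwise_loglik(mu):
    """One log likelihood term per observation, shape (50,)."""
    return -0.5 * np.log(2 * np.pi) - 0.5 * (y - mu) ** 2


def total_loglik(mu):
    """The same likelihood written directly as a single scalar."""
    n = y.size
    return -0.5 * n * np.log(2 * np.pi) - 0.5 * np.sum((y - mu) ** 2)


mu = 0.3
print(np.isclose(pointwise_loglik(mu).sum(), total_loglik(mu)))  # True
```

The two forms are interchangeable for fitting, but only the first keeps the per-observation terms that loo needs.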

I have no reason to doubt that your model implementation is right; it probably is. What is not right is the pointwise log likelihood computation; otherwise you would not get the UserWarning you shared.

That’s very clear, thanks!
