Hello,
I am using PyMC version 5.10.3. I don’t see the pointwise log_likelihood stored in the output InferenceData. According to Model comparison — PyMC 5.10.4 documentation, this can be added to the trace by setting idata_kwargs={"log_likelihood": True}
in pm.sample, or by calling pm.compute_log_likelihood on the trace. But either way I do not see the log_likelihood group.
Inference data with groups:
> posterior
> sample_stats
Warmup iterations saved (warmup_*).
I have a different log-likelihood computation. Since my observed values are computed in the model, I am using pm.Potential (based on TypeError: Invalid Use of Observed Data Variable) as:
tot_likelihood = pm.Potential("tot_likelihood", pm.logp(pm.Normal.dist(mu=coordinates_array, sigma=coordinates_sigmas), value=image_coordinates))
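Here is a stripped-down sketch of the setup (the coordinate computations below are placeholders standing in for my actual model):

import numpy as np
import pymc as pm

# Placeholder data; in my real model these come from an image
image_data = np.random.default_rng(0).normal(size=10)
coordinates_sigmas = 0.5

with pm.Model() as model:
    theta = pm.Normal("theta", 0.0, 1.0)

    # Both quantities are computed inside the model (placeholder computations here)
    coordinates_array = theta * np.arange(10)
    image_coordinates = image_data + 0.1 * theta

    # Likelihood added as a Potential, since image_coordinates depends on model nodes
    tot_likelihood = pm.Potential(
        "tot_likelihood",
        pm.logp(pm.Normal.dist(mu=coordinates_array, sigma=coordinates_sigmas), value=image_coordinates),
    )

    # Neither of these adds a log_likelihood group to the InferenceData
    idata = pm.sample(idata_kwargs={"log_likelihood": True})
    # pm.compute_log_likelihood(idata)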
How can I get the log_likelihood values? Any assistance in resolving this would be greatly appreciated.
A Potential is ambiguous, because it could be a prior or a likelihood term (or a combination of both), so by default PyMC does not compute the log-likelihood for models with Potentials. If your Potential behaves like a distribution, you could wrap it in an observed CustomDist via the logp kwarg.
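For example, when the observed values are a plain data array (not something computed in the model), the pattern would look roughly like this (a minimal sketch; the model and variable names are placeholders):

import numpy as np
import pymc as pm

# Placeholder observed data: a fixed array, not derived from other model nodes
observed_data = np.random.default_rng(0).normal(loc=1.0, scale=0.5, size=20)

def normal_logp(value, mu, sigma):
    # Reuse the Normal logp as the density of the CustomDist
    return pm.logp(pm.Normal.dist(mu=mu, sigma=sigma), value)

with pm.Model():
    mu = pm.Normal("mu", 0.0, 1.0)
    sigma = pm.HalfNormal("sigma", 1.0)

    # Observed CustomDist defined only through its logp
    pm.CustomDist("likelihood", mu, sigma, logp=normal_logp, observed=observed_data)

    idata = pm.sample(idata_kwargs={"log_likelihood": True})

# The pointwise log_likelihood group is now present
print(idata.log_likelihood)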
Thank you for the explanation, but I need some more clarification on how to write the CustomDist. The Potential statement is used because my observed value, image_coordinates, is computed in the model. So now if I write a CustomDist with logp defined as:
def logp(image_coordinates, coordinates_array, coordinates_sigmas):
    return pm.logp(pm.Normal.dist(mu=coordinates_array, sigma=coordinates_sigmas), value=image_coordinates)
and instead of pm.Potential I use a CustomDist defined as
pm.CustomDist("likelihood", coordinates_array, coordinates_sigmas, logp=logp, observed=image_coordinates)
I get the error “Variables that depend on other nodes cannot be used for observed data.”
Alternatively, after computing the log-likelihood with pm.Potential, how can I add this to the CustomDist logp?
TL;DR: use
pm.Normal("likelihood", mu=coordinates_array - image_coordinates, sigma=coordinates_sigmas, observed=0)
It doesn’t make sense conceptually from a Bayesian point of view to have the observed data change with the MCMC iteration, so PyMC doesn’t allow that.
From what you have said so far, it looks like you both use the model to generate observations and use the observed data to generate synthetic observations, and both of these depend on the MCMC iteration. So your goal is to have a likelihood that assigns more probability the closer these two are. You can therefore compute the difference between them and give 0 as the observed values, which also allows you to use a normal distribution directly.
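Putting it together, a minimal sketch of the whole approach (placeholder data and computations; only the structure of the likelihood trick matters here):

import numpy as np
import pymc as pm

# Placeholder data standing in for the real image
n = 10
image_data = np.random.default_rng(42).normal(size=n)
coordinates_sigmas = 0.5

with pm.Model() as model:
    theta = pm.Normal("theta", 0.0, 1.0)

    # Both quantities are computed inside the model (placeholder computations)
    coordinates_array = pm.Deterministic("coordinates_array", theta * np.arange(n))
    image_coordinates = pm.Deterministic("image_coordinates", image_data + 0.1 * theta)

    # Likelihood on the difference, with zeros (same shape as the difference) as observed values
    pm.Normal(
        "likelihood",
        mu=coordinates_array - image_coordinates,
        sigma=coordinates_sigmas,
        observed=np.zeros(n),
    )

    idata = pm.sample(idata_kwargs={"log_likelihood": True})

# The pointwise log-likelihood is now stored in the InferenceData
print(idata.log_likelihood)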