Explanation of logp and customDist for Blackbox likelihood functions


I have implemented a blackbox likelihood function in my model, using the framework described in Using a “black box” likelihood function (numpy) — PyMC example gallery.

In this code, there is a user-defined log likelihood function:

def my_loglike(theta, x, data, sigma):
    model = my_model(theta, x)
    return -(0.5 / sigma**2) * np.sum((data - model) ** 2)
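To make the snippet self-contained, here is a minimal sketch of calling `my_loglike`; the straight-line `my_model` below is an assumption matching the gallery example, not part of the code above:

```python
import numpy as np

def my_model(theta, x):
    # hypothetical straight-line model m*x + c, as in the gallery example
    m, c = theta
    return m * x + c

def my_loglike(theta, x, data, sigma):
    model = my_model(theta, x)
    # summed Gaussian log-likelihood (up to an additive constant)
    return -(0.5 / sigma**2) * np.sum((data - model) ** 2)

x = np.linspace(0.0, 9.0, 10)
data = my_model([0.5, 3.0], x)  # noiseless synthetic data
print(my_loglike([0.5, 3.0], x, data, sigma=1.0))  # perfect fit -> 0.0
```

Note the function returns a single scalar, because of the `np.sum` — that detail matters later for model comparison.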

The log likelihood is then calculated using an Op class and included in the model with pm.Potential:

# create our Op
logl = LogLike(my_loglike, data, x, sigma)

# use a Potential to "call" the Op and include it in the logp computation
pm.Potential("likelihood", logl(theta))

Though this works for parameter estimation, I want to perform model comparison using WAIC or by calculating the Bayes factor for different models. I was told this can be done using pm.CustomDist and defining a logp function.

What is the logp function - is it the same as the my_loglike function? In the documentation of pm.CustomDist, an example logp function is shown as:

def logp(value: TensorVariable, mu: TensorVariable) -> TensorVariable:
    return -(value - mu)**2

Is this needed when I have already included the my_loglike function?

This might be a very basic question but I’m new to Bayesian modelling and have been trying to understand this for some time now. Thanks!

The example shouldn’t have data and x be passed to __init__, which makes it unnecessarily restrictive and doesn’t really match the CustomDist API. I’ll update the example, but for now your logp function would look something like:

def logp(value, x, theta):
    # `value` and `x` are ignored here because `data` and `x` are stored inside the LogLike Op at init time
    return logl(theta)

with pm.Model() as m:
    ...  # Priors and definition of `logl` `Op`
    pm.CustomDist("loglike", x, theta, logp=logp, observed=data)

Importantly, you have to make sure your logl function returns the log-likelihood of each observation if you wish to use it for model comparison. Any summation will be done by PyMC later down the road.
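To illustrate, here is a sketch of a pointwise version of my_loglike (the straight-line my_model is an assumption borrowed from the gallery example); the only change is dropping `np.sum`, so PyMC can sum the per-observation terms itself:

```python
import numpy as np

def my_model(theta, x):
    # hypothetical straight-line model, as in the gallery example
    m, c = theta
    return m * x + c

def my_loglike_pointwise(theta, x, data, sigma):
    model = my_model(theta, x)
    # no np.sum: return one log-likelihood term per observation
    return -(0.5 / sigma**2) * (data - model) ** 2
```

Summing the returned vector recovers exactly what the original scalar version computed.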

Thanks Ricardo, that makes sense.

To modify my code to get the log-likelihood of each observation, I’m guessing I need to change the my_loglike function to not sum the likelihoods, and change the LogLike otype to pt.dvector, to get a vector of log-likelihood values?


As promised: Update blackbox likelihood example by ricardoV94 · Pull Request #631 · pymc-devs/pymc-examples · GitHub
