Pointwise log-likelihood for loo/waic with custom "black-box" probability distribution

Hi,

I’d like to estimate a model (and then do some model comparisons) using pymc. The model that generates the predicted probability distribution depends on an external module and requires a numpy array as input. So, while I have control over how the likelihood is calculated, I have no control over how the model prediction is obtained (numpy array in → model predictions as numpy array back). Following the guide on how to use a “black box” likelihood, I wrapped an instance of my model (which internally calls the external code, handles the observed data, the likelihood calculation, etc.) in a custom pytensor Op that computes and returns the log-likelihood:

from copy import deepcopy

import numpy as np
import pytensor.tensor as pt


class PtDdmLogLike(pt.Op):

    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dscalar]  # returns a single scalar value (the log-likelihood)

    def __init__(self, ddm_model):
        # set up my custom backend model instance
        self.ddm_model = deepcopy(ddm_model)

    def perform(self, node, inputs, outputs):
        (theta,) = inputs  # unpack inputs

        # modify the parameters of the model
        self.ddm_model.set_prms(theta)

        # call the log-likelihood function (the observed data is an attribute of ddm_model)
        logl = self.ddm_model._log_like()

        outputs[0][0] = np.array(logl)  # output the log-likelihood

Then I use a Potential to call the Op:

import pandas as pd
import pymc as pm

data = pd.read_csv(...)
my_model = model(data)  # create an instance of my model

with pm.Model() as bm_model:
    priorA = ...
    priorB = ...
    theta = pt.as_tensor_variable([priorA, priorB])

    # create our plugin Op instance
    logl = PtDdmLogLike(my_model)  # plug in my model instance
    # use a Potential to "call" the Op
    pm.Potential("Likelihood", logl(theta))
    custdata = pm.sample(...)

This all works fine. However, I would like to do some model comparisons, and, as far as I can see, many of the techniques require WAIC/LOO, which in turn needs the pointwise log-likelihood of each individual data point. This is not a problem in principle: I could easily modify the _log_like() function of my model instance to return a vector of log-likelihoods, one per observation. What I am struggling with is how to connect this to the pymc framework.
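To make this concrete, here is a sketch of the change I have in mind on the Op side (the _log_like_pointwise() method is hypothetical; it would return one log-likelihood per observation instead of their sum):

class PtDdmPointwiseLogLike(pt.Op):

    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dvector]  # returns one log-likelihood per observation

    def __init__(self, ddm_model):
        self.ddm_model = deepcopy(ddm_model)

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        self.ddm_model.set_prms(theta)
        # hypothetical pointwise variant of _log_like()
        logl = self.ddm_model._log_like_pointwise()
        outputs[0][0] = np.asarray(logl, dtype=np.float64)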

From a search on the pymc discourse, I found this question, which is closely related to mine. However, the links provided seem to be broken, and older versions of the corresponding article rely on code that is no longer available in newer versions of pymc/arviz.

I also found questions/answers that suggested using pm.Deterministic or pm.DensityDist, but they left open how to implement it exactly or were not suitable for my case.

Thus, is there a tutorial which demonstrates how to call a custom “black box” likelihood function (which requires a numpy array of parameters as input) and subsequently calculate loo/waic?

All help is much appreciated :slight_smile:


Did you figure out a solution? I am stuck on the same problem

The easiest option is to define a CustomDist with a logp function

How exactly can I do this? I’ve read posts about using DensityDist but I’m not sure how to implement this.

I currently use pm.Potential("likelihood", logl(theta)), which works well, but it does not let me compare models using WAIC or LOO.

I then tried pm.DensityDist("likelihood", logp=logl(theta), observed=data), but this results in the error Logprob method not implemented for CustomDist_likelihood_rv{0, (), floatX, False}. I guess my implementation of DensityDist is wrong, but I am quite new to this and not sure how to proceed.

Any help would be appreciated

Check the code examples here: pymc.CustomDist — PyMC dev documentation

You have to pass the parameters as arguments and the logp function as a kwarg (the function itself, not yet evaluated)
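For example, a minimal sketch using your names (the first call mirrors the failing DensityDist attempt, the second shows the intended pattern):

# wrong: logl(theta) is an already-evaluated tensor, so PyMC cannot
# attach a logprob method to the distribution
pm.DensityDist("likelihood", logp=logl(theta), observed=data)

# right: pass the parameters positionally and the logp function itself;
# PyMC will call it as logp(value, *params). Note that for LOO/WAIC the
# Op should return a vector of pointwise log-likelihoods, not a scalar.
pm.CustomDist("likelihood", theta, logp=lambda value, theta: logl(theta), observed=data)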

Thanks for your quick response and the link to the examples!

I tried following the first example in that link, but still get a NotImplementedError. The error occurs when I perform my sampling:

idata_mh = pm.smc.sample_smc(draws=1000, cores=10, chains=4, return_inferencedata=True, idata_kwargs=dict(log_likelihood=True))

When you say I need to pass the parameters, is this referring to mu in logp, or to the parameters that I am trying to infer? Further, should I be specifying the value variable in the logp function?

You need to pass all the parameters that the logp function uses. The value variable will be provided by PyMC and will be the same as your observations

Can you share the code you tried?

It is similar to the code provided in the black-box likelihood tutorial, combined with the first simple example from the CustomDist documentation.

def my_loglike1(theta, freq, data, sigma):
    model = my_model1(theta, freq)
    return -(0.5 / sigma**2) * np.sum((data - model) ** 2)

class LogLike(pt.Op):

    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dscalar]  # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, freq, sigma):
 
        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.freq = freq
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (theta,) = inputs  # this will contain my variables

        # call the log-likelihood function
        logl = self.likelihood(theta, self.freq, self.data, self.sigma)

        outputs[0][0] = np.array(logl)  # output the log-likelihood

data = sigma * rng.normal(size=N) + truemodel
logl1 = LogLike(my_loglike1, data, freq, sigma)

def logp(value: TensorVariable, mu: TensorVariable) -> TensorVariable:
    return -(value - mu)**2


# use PyMC to sample from the log-likelihood
if __name__ == '__main__':
    
    with pm.Model() as BI_model1:
        # uniform priors
        d_c = pm.Uniform('d_c_model1', lower=0.001, upper=0.009)
        a_c = pm.Uniform("a_c_model1", lower=0.015, upper=0.09)

        mu = pm.Normal('mu', 0, 1)

        # convert to a tensor vector
        theta = pt.as_tensor_variable([d_c, a_c])

        # use a Potential to "call" the Op and include it in the logp computation
        pm.Potential("likelihood", logl1(theta))

        pm.CustomDist('custom_dist', mu, logp=logp, observed=data)

        # use a custom number of draws to replace the HMC-based defaults
        idata_mh = pm.smc.sample_smc(draws=100, cores=10, chains=4, return_inferencedata=True, idata_kwargs=dict(log_likelihood=True))

I don’t see anything obviously wrong with the code. Can you provide a fully executable snippet? Also what error message were you seeing exactly?

This is a very simple representation of my_model1:

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pymc as pm
import pytensor.tensor as pt
from pytensor.tensor import TensorVariable


def my_model1(theta, freq):
    d_c1, a_c = theta

    d1 = 0.0075
    d2 = 0.0625

    # vectorised over the whole frequency array (equivalent to looping
    # over freq and stacking the per-frequency results)
    return (freq**2) * d1 * d_c1 + freq * d2 * (a_c**2)

Defining my log likelihood and LogLike class:

def my_loglike1(theta, freq, data, sigma):
    model = my_model1(theta, freq)
    return -(0.5 / sigma**2) * np.sum((data - model) ** 2)

# define a pytensor Op for our likelihood function
class LogLike(pt.Op):

    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dscalar]  # outputs a single scalar value (the log likelihood)

    def __init__(self, loglike, data, freq, sigma):

        # add inputs as class attributes
        self.likelihood = loglike
        self.data = data
        self.freq = freq
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (theta,) = inputs  # this will contain my variables

        # call the log-likelihood function
        logl = self.likelihood(theta, self.freq, self.data, self.sigma)

        outputs[0][0] = np.array(logl)  # output the log-likelihood

Setting up some example data points:

freq = np.arange(1, 2000, 5)
N = len(freq)
sigma = 0.05  # standard deviation of noise
d_c1_true = 0.003  # true value of d_c1
a_c_true = 0.04  # true value of a_c

truemodel = my_model1([d_c1_true, a_c_true], freq)

# make noisy data
rng = np.random.default_rng(716743)
data = sigma * rng.normal(size=N) + truemodel

Creating Op and defining the example logp function:

# create our Op
logl1 = LogLike(my_loglike1, data, freq, sigma)


def logp(value: TensorVariable, mu: TensorVariable) -> TensorVariable:
    return -(value - mu)**2

Sampling the model:

if __name__ == '__main__':

    with pm.Model():
        # uniform priors
        d_c1 = pm.Uniform('d_c1_model1', lower=0.001, upper=0.009)
        a_c = pm.Uniform("a_c_model1", lower=0.015, upper=0.09)

        mu = pm.Normal('mu', 0, 1)

        # convert to a tensor vector
        theta = pt.as_tensor_variable([d_c1, a_c])

        # use a Potential to "call" the Op and include it in the logp computation
        pm.Potential("likelihood", logl1(theta))

        pm.CustomDist('custom_dist', mu, logp=logp, observed=data)

        # use a custom number of draws to replace the HMC-based defaults
        idata_mh = pm.smc.sample_smc(draws=100, cores=10, chains=4, return_inferencedata=True, idata_kwargs=dict(log_likelihood=True))

The exact error message is:

line 94, in <module>
    idata_mh = pm.smc.sample_smc(draws=100, cores=10, chains=4, return_inferencedata=True, idata_kwargs=dict(log_likelihood=True))
NotImplementedError: Logprob method not implemented for CustomDist_custom_dist_rv{0, (0,), floatX, False}
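For reference, here is a self-contained sketch of the direction the replies above point in: drop the separate Potential and the placeholder logp, and instead let the CustomDist's logp wrap a black-box Op that returns the pointwise log-likelihoods. The names here (PointwiseLogLike, loglike_op) are illustrative, and this is a sketch of the pattern rather than a confirmed fix for the error above; SMC is used because the Op provides no gradients:

import arviz as az
import numpy as np
import pymc as pm
import pytensor.tensor as pt


def my_model1(theta, freq):
    d_c1, a_c = theta
    return (freq**2) * 0.0075 * d_c1 + freq * 0.0625 * (a_c**2)


class PointwiseLogLike(pt.Op):

    itypes = [pt.dvector]  # parameter vector
    otypes = [pt.dvector]  # one log-likelihood per observation

    def __init__(self, data, freq, sigma):
        self.data = data
        self.freq = freq
        self.sigma = sigma

    def perform(self, node, inputs, outputs):
        (theta,) = inputs
        model = my_model1(theta, self.freq)
        # Gaussian log-density per data point (normalising constant included,
        # so these are proper log-likelihoods for LOO/WAIC)
        logp = -0.5 * np.log(2 * np.pi * self.sigma**2) - 0.5 * ((self.data - model) / self.sigma) ** 2
        outputs[0][0] = np.asarray(logp)


freq = np.arange(1, 2000, 5)
sigma = 0.05
rng = np.random.default_rng(716743)
data = sigma * rng.normal(size=len(freq)) + my_model1([0.003, 0.04], freq)

loglike_op = PointwiseLogLike(data, freq, sigma)

with pm.Model():
    d_c1 = pm.Uniform("d_c1", lower=0.001, upper=0.009)
    a_c = pm.Uniform("a_c", lower=0.015, upper=0.09)
    theta = pt.as_tensor_variable([d_c1, a_c])

    # single CustomDist, no separate Potential; `value` is supplied by PyMC
    # (it equals the observations, which are already baked into the Op here)
    pm.CustomDist("likelihood", theta, logp=lambda value, theta: loglike_op(theta), observed=data)

    # SMC is gradient-free, so the Op not implementing grad() is fine
    idata = pm.sample_smc(draws=1000, idata_kwargs=dict(log_likelihood=True))

print(az.loo(idata))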