Debugging Custom likelihood

# import theano
import pymc3 as pm
from pymc3.math import exp, log

class ParetoNBD(pm.Continuous):
    """
    Custom distribution class for Pareto/NBD likelihood.
    """
    
    def __init__(self, lambda_, mu, *args, **kwargs):
        super(ParetoNBD, self).__init__(*args, **kwargs)
        self.lambda_ = lambda_
        self.mu = mu
        
    def logp(self, x, t_x, T):
        r"""
        Log-likelihood for an individual customer's purchasing rate \lambda
        and lifetime \mu given their frequency, recency and time since first purchase.
        """
        
        log_lambda = log(self.lambda_)
        log_mu = log(self.mu)
        mu_plus_lambda = self.lambda_ + self.mu
        log_mu_plus_lambda = log(mu_plus_lambda)
        
        p_1 = x * log_lambda + log_mu - log_mu_plus_lambda - t_x * mu_plus_lambda
        p_2 = (x + 1) * log_lambda - log_mu_plus_lambda - T * mu_plus_lambda
        
        return log(exp(p_1) + exp(p_2))

This is a custom distribution I have been using from this notebook.
https://github.com/benvandyke/pydata-seattle-2017/blob/master/lifetime-value/pareto-nbd.ipynb

I am running into issues when I try to run this with a different dataset. Some of the mu values are NaN during sampling, which causes the following errors.

The derivative of RV `lambda_log__`.ravel()[5921] is zero.
The derivative of RV `lambda_log__`.ravel()[5922] is zero.
The derivative of RV `beta_log__`.ravel()[0] is zero.
The derivative of RV `s_log__`.ravel()[0] is zero.
The derivative of RV `alpha_log__`.ravel()[0] is zero.
The derivative of RV `r_log__`.ravel()[0] is zero.

A couple of questions:

  • How can I debug why some of my mu values are NaN, and is there a way to overcome that?
  • Is there a way for me to use logsumexp for log(exp(p_1) + exp(p_2))?

Help is greatly appreciated!!

If you follow the steps in Frequently Asked Questions - #11 by junpenglao and the model test point shows no problem, the zero-derivative error usually means the priors are too flat. Changing them to more informative priors usually works.
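As a rough illustration of that kind of check, here is a minimal plain-Python sketch that evaluates the same per-customer formula as ParetoNBD.logp on ordinary floats and reports which rows give a non-finite log-likelihood. The helper names (pareto_nbd_logp, find_bad_rows) and the example rows are made up for illustration and are not part of the notebook. Inside PyMC3 itself, the analogous first step is calling model.check_test_point() on the model.

```python
import math


def pareto_nbd_logp(lam, mu, x, t_x, T):
    """Per-customer log-likelihood, same formula as ParetoNBD.logp, on plain floats."""
    # Rates must be strictly positive and finite; NaN fails both comparisons.
    if not (0 < lam < math.inf and 0 < mu < math.inf):
        return math.nan
    s = lam + mu
    p_1 = x * math.log(lam) + math.log(mu) - math.log(s) - t_x * s
    p_2 = (x + 1) * math.log(lam) - math.log(s) - T * s
    # Combine the two terms in a numerically stable way.
    m = max(p_1, p_2)
    return m + math.log(math.exp(p_1 - m) + math.exp(p_2 - m))


def find_bad_rows(rows):
    """Return indices of rows whose log-likelihood is NaN or infinite."""
    return [
        i
        for i, (lam, mu, x, t_x, T) in enumerate(rows)
        if not math.isfinite(pareto_nbd_logp(lam, mu, x, t_x, T))
    ]


# Hypothetical (lambda, mu, frequency, recency, T) values, for illustration only.
rows = [
    (0.5, 0.1, 3, 20.0, 30.0),           # ordinary customer
    (0.5, 0.0, 3, 20.0, 30.0),           # mu has collapsed to zero
    (0.5, float("nan"), 0, 0.0, 30.0),   # mu is already NaN
]
bad_rows = find_bad_rows(rows)
print(bad_rows)
```

Running the same kind of check on the real starting values and data can narrow down whether the problem comes from specific customers or from the parameter values themselves.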

There is a logsumexp function in pymc3 (pymc3.math.logsumexp).
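To show why this matters, here is a small standalone sketch of the log-sum-exp trick for two terms, using only the standard library. The function name log_add_exp and the values of p_1 and p_2 are illustrative assumptions. In the model, the corresponding change would presumably be to stack p_1 and p_2 along a new axis and pass them to pymc3.math.logsumexp with that axis, though the exact wiring depends on the shapes involved.

```python
import math


def log_add_exp(a, b):
    """Return log(exp(a) + exp(b)) without underflow or overflow."""
    m = max(a, b)
    if m == -math.inf:
        # Both terms are exp(-inf) = 0, so the log of the sum is -inf.
        return -math.inf
    # Factor out the larger term so the largest exponent is exactly 0.
    return m + math.log(math.exp(a - m) + math.exp(b - m))


# Illustrative values: both terms are far too negative for exp() to represent.
p_1, p_2 = -800.0, -805.0

try:
    naive = math.log(math.exp(p_1) + math.exp(p_2))
    naive_failed = False
except ValueError:
    # exp(-800) underflows to 0.0, so the naive form takes log(0).
    naive_failed = True

stable = log_add_exp(p_1, p_2)
print(naive_failed, stable)
```

With plain floats the naive form raises an error because both exponentials underflow to zero; in a tensor library the same expression would typically produce -inf instead, which can then lead to NaN or zero gradients.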

Thanks!! More informative priors solved the NaN problem.
