For this question, I’m currently using PyMC v5.0.1.
This is an extension from a previous topic, but I was hoping to get some advice on using PyMC’s
pm.CustomDist (or something similar) to define the likelihood such that a posterior predictive check (ppc) can be run on the custom likelihood. I understand how
pm.Potential can be used to tack on the custom logp, but as has been outlined in numerous topics, you can’t do anything more after that.
To recap, I have summary-level concentration data (mean and standard deviation) reported from a study and I’d like to fit these data with a custom log-normally distributed log-likelihood function. The specifics of that function are in the previous topic. I’ve attempted this two different ways as outlined below.
First, I attempted to follow the documentation on
pm.CustomDist to manually define the
import pymc as pm from pytensor.tensor import TensorVariable def _lognormal_logp(value: TensorVariable, mu: TensorVariable, sigma: TensorVariable, Ni_summary: TensorVariable, lnsd2_summary: TensorVariable) -> TensorVariable: N = np.sum(Ni_summary) term1 = (Ni_summary*value) term2 = (Ni_summary/2.)*pm.math.log(sigma**2) term3 = ((Ni_summary - 1)*lnsd2_summary)/(2*sigma**2) term4 = (Ni_summary*(value - mu)**2)/(2*sigma**2) LL = (N/2)*pm.math.log(2*np.pi) - pm.math.sum((term1 + term2 + term3 + term4)) return LL def _lognormal_random(mu: TensorVariable, sigma: TensorVariable, size: TensorVariable) -> TensorVariable: return pm.LogNormal.dist(mu, sigma=sigma, size=size)
pm.Model, I use
with pm.Model as model: " <All the required model specs>" # lnconc_summary is the model predicted concentrations (log-transformed) for the summary-level data # sigma_summary is the estimated model error for the summary level data (log-space) # Ni_summary is the number of animals contributing to each observed summary-level data point # lnsd2_summary is the observed variance (log-transformed) # lny_summary is the observed mean (log-transformed) pm.CustomDist("conc_summary", lnconc_summary, sigma_summary, Ni_summary, lnsd2_summary, logp=lognormal_logp, random=_lognormal_random, observed=lny_summary
Unfortunately, this implementation gets hung up on the
logp function with:
TypeError: logp() takes 2 positional arguments but 3 were given
It would appear I’m not able to pass in the additional parameters (
_lognormal_logp, but perhaps I’m wrong in this interpretation.
The alternative approach defines a custom log-normal distribution with the desired
class CustomLogNormal(pm.Continuous): def __init__(self, mu, sigma, Ni_summary, lnsd2_summary, *args, **kwargs): super(CustomLogNormal, self).__init__(*args, **kwargs) self.mu = mu = pt.as_tensor_variable(mu) self.sigma = sigma = pt.as_tensor_variable(sigma) self.N = np.sum(Ni_summary) self.Ni_summary = Ni_summary = pt.as_tensor_variable(Ni_summary) # Number of animals per time point self.lnsd2_summary = lnsd2_summary = pt.as_tensor_variable(lnsd2_summary) # Measured variance (in log-space) def logp(self, x): term1 = (self.Ni_summary*value) term2 = (self.Ni_summary/2.)*pm.math.log(self.sigma**2) term3 = ((self.Ni_summary - 1)*self.lnsd2_summary)/(2*self.sigma**2) term4 = (self.Ni_summary*(value - self.mu)**2)/(2*self.sigma**2) LL = (self.N/2)*pm.math.log(2*np.pi) - pm.math.sum((term1 + term2 + term3 + term4)) return LL def random(self, mu, sigma, size): return pm.LogNormal.dist(mu, sigma=sigma, size=size)
Then within the model, I try to define the observed data with
with pm.Model as model: " <All the required model specs>" CustomLogNormal('conc_summary', mu=lnconc_summary, sigma=sigma_summary, Ni_summary=Ni_summary, lnsd2_summary=lnsd2_summary, observed=lny_summary)
This implementation give the error:
TypeError: dist() missing 1 required positional argument: 'dist_params'
dist_params requirement might be new and I haven’t been able to figure out how to satisfy it.
Any help here would be greatly appreciated. Ultimately, I would like to be able to use the custom logp for the summary-level data in a log-normal distribution and then run a posterior predictive check on the summary-level predictions. Appending the logp with
pm.Potential works really nicely so let me know if expanding to a ppc for summary-level data isn’t feasible.
Thanks again for all your help.