Hello,

For this question, I’m currently using PyMC v5.0.1.

This is an extension from a previous topic, but I was hoping to get some advice on using PyMC’s `pm.CustomDist`

(or something similar) to define the likelihood such that a posterior predictive check (ppc) can be run on the custom likelihood. I understand how `pm.Potential`

can be used to tack on the custom logp, but as has been outlined in numerous topics, you can’t do anything more after that.

To recap, I have summary-level concentration data (mean and standard deviation) reported from a study and I’d like to fit these data with a custom log-normally distributed log-likelihood function. The specifics of that function are in the previous topic. I’ve attempted this two different ways as outlined below.

# 1. Use pm.CustomDist

First, I attempted to follow the documentation on `pm.CustomDist`

to manually define the `logp`

and `random`

functions:

```
import pymc as pm
from pytensor.tensor import TensorVariable
def _lognormal_logp(value: TensorVariable, mu: TensorVariable, sigma: TensorVariable, Ni_summary: TensorVariable, lnsd2_summary: TensorVariable) -> TensorVariable:
N = np.sum(Ni_summary)
term1 = (Ni_summary*value)
term2 = (Ni_summary/2.)*pm.math.log(sigma**2)
term3 = ((Ni_summary - 1)*lnsd2_summary)/(2*sigma**2)
term4 = (Ni_summary*(value - mu)**2)/(2*sigma**2)
LL = (N/2)*pm.math.log(2*np.pi) - pm.math.sum((term1 + term2 + term3 + term4))
return LL
def _lognormal_random(mu: TensorVariable, sigma: TensorVariable, size: TensorVariable) -> TensorVariable:
return pm.LogNormal.dist(mu, sigma=sigma, size=size)
```

within `pm.Model`

, I use `pm.CustomDist`

:

```
with pm.Model as model:
" <All the required model specs>"
# lnconc_summary is the model predicted concentrations (log-transformed) for the summary-level data
# sigma_summary is the estimated model error for the summary level data (log-space)
# Ni_summary is the number of animals contributing to each observed summary-level data point
# lnsd2_summary is the observed variance (log-transformed)
# lny_summary is the observed mean (log-transformed)
pm.CustomDist("conc_summary", lnconc_summary, sigma_summary, Ni_summary, lnsd2_summary, logp=lognormal_logp, random=_lognormal_random, observed=lny_summary
```

Unfortunately, this implementation gets hung up on the `logp`

function with:

`TypeError: logp() takes 2 positional arguments but 3 were given`

It would appear I’m not able to pass in the additional parameters (`Ni_summary`

, `lnsd2_summary`

) to `_lognormal_logp`

, but perhaps I’m wrong in this interpretation.

# 2. Define custom log-normal distribution

The alternative approach defines a custom log-normal distribution with the desired `logp`

using:

```
class CustomLogNormal(pm.Continuous):
def __init__(self, mu, sigma, Ni_summary, lnsd2_summary, *args, **kwargs):
super(CustomLogNormal, self).__init__(*args, **kwargs)
self.mu = mu = pt.as_tensor_variable(mu)
self.sigma = sigma = pt.as_tensor_variable(sigma)
self.N = np.sum(Ni_summary)
self.Ni_summary = Ni_summary = pt.as_tensor_variable(Ni_summary) # Number of animals per time point
self.lnsd2_summary = lnsd2_summary = pt.as_tensor_variable(lnsd2_summary) # Measured variance (in log-space)
def logp(self, x):
term1 = (self.Ni_summary*value)
term2 = (self.Ni_summary/2.)*pm.math.log(self.sigma**2)
term3 = ((self.Ni_summary - 1)*self.lnsd2_summary)/(2*self.sigma**2)
term4 = (self.Ni_summary*(value - self.mu)**2)/(2*self.sigma**2)
LL = (self.N/2)*pm.math.log(2*np.pi) - pm.math.sum((term1 + term2 + term3 + term4))
return LL
def random(self, mu, sigma, size):
return pm.LogNormal.dist(mu, sigma=sigma, size=size)
```

Then within the model, I try to define the observed data with

```
with pm.Model as model:
" <All the required model specs>"
CustomLogNormal('conc_summary', mu=lnconc_summary, sigma=sigma_summary, Ni_summary=Ni_summary, lnsd2_summary=lnsd2_summary, observed=lny_summary)
```

This implementation give the error:

`TypeError: dist() missing 1 required positional argument: 'dist_params'`

This `dist_params`

requirement might be new and I haven’t been able to figure out how to satisfy it.

Any help here would be greatly appreciated. Ultimately, I would like to be able to use the custom logp for the summary-level data in a log-normal distribution and then run a posterior predictive check on the summary-level predictions. Appending the logp with `pm.Potential`

works really nicely so let me know if expanding to a ppc for summary-level data isn’t feasible.

Thanks again for all your help.