Observing the log of a Poisson RV

I have a physical model where a discrete sample, Y_{sampled}, is drawn from a Poisson process, but that sample is then measured via a separate process as Z_{observed}, which depends on the log of the Poisson RV Y_{sampled} with some Gaussian noise. In the latter process, draws with Y_{sampled} == 0 have Z_{observed} recorded as \tau^*, while draws with Y_{sampled} > 0 have Z_{observed} measured as a continuous value, sampled from a normal distribution around a linear function of log(Y_{sampled}).

I’d like to infer \lambda, or rather the \theta which leads to \lambda, but I’m having trouble breaking this down into the correct likelihood for Z_{observed}. I think I can split off Z_{obs,\tau^*} = Z_{observed}[Z_{observed} == \tau^*], whose likelihood is given by the Poisson distribution directly. But how should I handle the rest of the data, Z_{obs,else}? It seems like I wouldn’t want to use a zero-truncated Poisson for this, because that artificially inflates the probability of non-zero Y_{sampled}: zero was actually a possibility, just not one that I observed in the Z_{obs,else} array. It seems like I need some way to take the Y_{sampled} RV and truncate it so as not to pass a zero to log() in the Normal likelihood.
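For concreteness, here is a quick simulation of the generative process as I understand it (lam, beta0, beta1, sigma, and tau_star below are just placeholder values, with lam standing in for \lambda = f(\theta, x)):

import numpy as np

rng = np.random.default_rng(0)

lam = 2.5                # placeholder for lambda = f(theta, x_predictor)
beta0, beta1 = 1.0, 0.8  # placeholder coefficients of the linear function
sigma = 0.1              # Gaussian measurement noise
tau_star = -5.0          # sentinel value recorded when Y_sampled == 0

Y_sampled = rng.poisson(lam, size=1000)
Z_observed = np.where(
    Y_sampled == 0,
    tau_star,  # zeros from the Poisson are recorded as tau*
    # clip avoids log(0) warnings; np.where evaluates both branches
    rng.normal(beta0 + beta1 * np.log(np.clip(Y_sampled, 1, None)), sigma),
)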

Thoughts?

Thanks in Advance.

I would also split it up.

For Z_{observed} == \tau^*, I would use the Poisson likelihood directly:

with model:
    lambda_ = pm.Deterministic("lambda", f(theta, x_predictor))
    # censored points: Z_observed == tau means the Poisson draw was 0
    pm.Poisson("observed_tau",
               mu=lambda_[Z_observed == tau],
               observed=np.zeros_like(Z_observed[Z_observed == tau]))

For the rest, I would skip the Poisson latent altogether:

with model:
    # approximate log(Y_sampled) by log(lambda) for the non-censored points
    pm.Normal("observed_rest",
              mu=beta0 + beta1 * tt.log(lambda_[Z_observed != tau]),
              sigma=sigma_obs,
              observed=Z_observed[Z_observed != tau])

It’s pretty close to a mixture-distribution likelihood, like the zero-inflated Poisson, etc.

Thanks for the answer @junpenglao. I’ve implemented this, and it runs successfully, but I think it introduces some bias for data points with low \lambda and Z_{observed} < \tau^*. We are evaluating only the probability of the data under that pm.Normal(), but not the probability that we ought to be using the pm.Normal() model at all, which is p(Poisson(\lambda) > 0) for the points with Z_observed != tau. For example, a data point < \tau^* may seem very probable under the pm.Normal() model, but in reality be quite unlikely, because it should have been \tau^*. So we need to discount the probability under pm.Normal() by the probability of a non-zero sample from the Poisson distribution.
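Concretely, under the log(\lambda) plug-in used above, I believe the likelihood contribution of each non-censored point should be

\log p(Z_i) \approx \underbrace{\log\left(1 - e^{-\lambda_i}\right)}_{\log P(Y_i > 0)} + \log \mathcal{N}\left(Z_i \mid \beta_0 + \beta_1 \log \lambda_i, \sigma^2\right),

and the first term is exactly what the potential below adds.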

I added the following to the code:

def CdfPosPoisson(lam):
    # return the logP of all non-zero results of the Poisson distribution
    return tt.log(1 - tt.exp(-lam))

with model:
    LessLoD = pm.Potential('LessLoD', CdfPosPoisson(lambda_[Z_observed != tau]))

which just adds the logP that we would have sampled this non-zero number from the Poisson, since CdfPosPoisson() returns log(1 - p(0)).
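(As a quick sanity check with an illustrative value, this matches the log of one minus the Poisson pmf at zero:)

import numpy as np
from scipy import stats

lam = 2.5  # illustrative value
print(np.log(1 - stats.poisson.pmf(0, lam)))  # log P(Y > 0) from the pmf
print(np.log(1 - np.exp(-lam)))               # closed form used in CdfPosPoisson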

This brings me to my new problem. The model runs successfully on some datasets, but not others. When it fails, it fails with zeros on the diagonal of the mass matrix, or, if I initialize with ADVI, it fails on the first iteration with:
FloatingPointError: NaN occurred in optimization.
The current approximation of RV K.ravel()[0] is NaN.

Again, this only occurs on certain datasets.

I took a look at: NaN occurred in optimization with ADVI - #3 by Nadheesh
and: Get nan or inf from model.logp (model.test point) is an attestation of incorrectly configured model? - #11 by junpenglao

I don’t seem to have anything squirrely with logP:
In [18]: model.check_test_point()
Out[18]:
sigma_DT_log__ -0.77
beta0 -4.83
beta1 -4.83
K -3.91
LoD_obs -11.50
DT_obs -2986.69

and none of the test_values seem to have anything weird going on either.

This leaves your last suggestion from that thread:

  1. make sure the starting value of the approximation is valid:
point = apprx.groups[0].bij.rmap(apprx.params[0].eval())
point  # check to see there is no NaN or Inf
for var in logistic_model.free_RVs:
    print(var.name, var.logp(point))

If all the steps above pass fine, there is a problem with setting up the approximation score function, which is not that easy to diagnose.

I’ve never used ADVI for inference, only for initialization. How can I check this last step when using ADVI just for initialization? Or can you see some other solution to this problem?

Thanks,
Daniel


CdfPosPoisson does not have support on all of R, so my guess is that if you have lambda very close to 0 in the initial condition, it breaks the sampler:
https://www.google.com/search?q=log(1-exp(x))&oq=log(1-exp(x))

Also, you might consider implementing the cdf in a more numerically stable way (see Eq. 7 in https://cran.r-project.org/web/packages/Rmpfr/vignettes/log1mexp-note.pdf)
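For example, at the small-\lambda end (illustrative value, plain numpy):

import numpy as np

lam = 1e-20
print(np.log(1 - np.exp(-lam)))  # -inf: exp(-lam) rounds to 1.0, so the log sees 0
print(np.log(-np.expm1(-lam)))   # about -46.05, the correct value (log(lam) for tiny lam)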

That’s fantastic - Thanks @junpenglao! Following the second link, my cdf is now:

def CdfPosPoisson(lam):
    # return the logP of all non-zero results of the Poisson distribution,
    # i.e. log(1 - exp(-lam)), computed stably per Eq. 7 of the log1mexp note
    return tt.switch(tt.le(lam, np.log(2)),
                     tt.log(-tt.expm1(-lam)),   # accurate for small lam
                     tt.log1p(-tt.exp(-lam)))   # accurate for large lam

This performs quite well.
I had tried some numerical tricks for large \lambda, but not for small ones. The naive form log(1 - e^{-\lambda}) has stability issues at both large and small values of \lambda, and is best behaved around \lambda = ln 2, which is where these two approximations switch.
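For example, the large-\lambda end in plain numpy (illustrative value):

import numpy as np

lam = 40.0
print(np.log(1 - np.exp(-lam)))  # 0.0: 1 - exp(-40) rounds to exactly 1.0
print(np.log1p(-np.exp(-lam)))   # about -4.25e-18, the correct value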
Thanks again!