Observing the log of Poisson RV

Thanks for the answer @junpenglao. I’ve implemented this and it runs successfully, but I think it introduces some bias for data points with low \lambda and Z_{observed} < \tau^*. We are evaluating only the likelihood of the data under that pm.Normal(), but not the probability that the pm.Normal() model should apply at all, which for the points with Z_{observed} \neq \tau is p(\mathrm{Poisson}(\lambda) \neq 0). For example, a data point below \tau^* may look very probable under the pm.Normal() model, but in reality it is quite unlikely, as it should have been \tau^*. So we need to discount the probability under pm.Normal() by the probability of a non-zero sample from the Poisson distribution.
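Concretely, the correction I have in mind is the log-probability of a non-zero Poisson draw:

\log P\big(\mathrm{Poisson}(\lambda) > 0\big) = \log\big(1 - P(\mathrm{Poisson}(\lambda) = 0)\big) = \log\big(1 - e^{-\lambda}\big)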

I added the following to the code:

import theano.tensor as tt

def CdfPosPoisson(lam):
    # log-probability of a non-zero draw from Poisson(lam),
    # i.e. log(1 - p(0)) = log(1 - exp(-lam))
    return tt.log(1 - tt.exp(-lam))

with model:
    LessLoD = pm.Potential('LessLoD', CdfPosPoisson(lam[Z_observed != tau]))

which just adds the log-probability that we would have sampled this non-zero count from the Poisson, since CdfPosPoisson() returns log(1 - p(0)).
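As an aside on that expression: if \lambda can get extremely small, 1 - \exp(-\lambda) can underflow to zero in floating point, so tt.log(1 - tt.exp(-lam)) returns -inf and the gradient blows up. I’m not sure whether this is related to the failures below, but a numerically safer sketch of the same quantity would be:

def CdfPosPoisson(lam):
    # same log(1 - exp(-lam)), computed via expm1 to avoid
    # catastrophic cancellation when lam is close to zero
    return tt.log(-tt.expm1(-lam))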

This brings me to my new problem. The model runs successfully on some datasets, but not others. When it fails, it fails with zeros on the diagonal of the mass matrix, or, if I initialize with ADVI, it fails on the first iteration with:
FloatingPointError: NaN occurred in optimization.
The current approximation of RV K.ravel()[0] is NaN.

Again, this only occurs on certain datasets.

I took a look at: NaN occurred in optimization with ADVI - #3 by Nadheesh
and: Get nan or inf from model.logp (model.test point) is an attestation of incorrectly configured model? - #11 by junpenglao

I don’t seem to have anything squirrely with logP:
In [18]: model.check_test_point()
Out[18]:
sigma_DT_log__ -0.77
beta0 -4.83
beta1 -4.83
K -3.91
LoD_obs -11.50
DT_obs -2986.69

and none of the test_values seem to have anything weird going on either.
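(For reference, this is roughly how I looked at the test values, just printing the model's test point and flagging anything non-finite:)

import numpy as np

for name, val in model.test_point.items():
    # flag any non-finite starting values
    print(name, np.asarray(val), np.all(np.isfinite(val)))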

This leaves your last suggestion from that thread:

  1. make sure the starting value of the approximation is valid:
point = apprx.groups[0].bij.rmap(apprx.params[0].eval())
point #check to see there is no NaN or Inf
for var in logistic_model.free_RVs:
    print(var.name, var.logp(point))

If all the steps above pass fine, there is a problem with setting up the approximation score function, which is not that easy to diagnose.

I’ve never used ADVI for inference, only for initialization. How can I run this last check when ADVI is only being used for initialization? Or can you see some other solution to this problem?
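In case it helps, this is the kind of manual check I imagine would be roughly equivalent: fitting ADVI myself with pm.fit() and then applying your snippet to that approximation. I’m not sure this mirrors exactly what pm.sample(init='advi') does internally, so treat it as a sketch:

with model:
    # fit mean-field ADVI by hand, standing in for what init='advi' would do
    apprx = pm.fit(n=10000, method='advi')

# map the fitted parameters back to a point in parameter space
point = apprx.groups[0].bij.rmap(apprx.params[0].eval())
print(point)  # check for NaN or Inf

for var in model.free_RVs:
    print(var.name, var.logp(point))

Is that the right way to reproduce the check outside of pm.sample()?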

Thanks,
Daniel
