Problems sampling a Poisson single hit model

Hello

I’m having trouble sampling from a model. The likelihood of the data comes from a Poisson single hit process which simplifies to: \text{Bernoulli}(p=f(x, v)) with f(x, v) = 1-2^{-10^{x+v}}.

Here’s some test data, and what this function looks like with v=4.

import numpy as np
import matplotlib.pyplot as plt
import pymc as pm

def f(v, x):
    return 1.0 - 2.0 ** (-10.0 ** (x + v))

status = np.array([1, 1, 1, 1, 1, 0, 0, 0])
x = np.array([0, -1, -2, -3, -4, -5, -6, -7])

x_plot = np.linspace(-8, 0)
plt.plot(x_plot, f(v=4, x=x_plot), label="f(v=4, x=x)")
plt.scatter(x, status, label="Data")
plt.ylabel("P(positive)")
plt.xlabel("x")
plt.legend()

(plot: f(v=4, x) over x_plot with the data points overlaid)

Running this model, even with a prior bang on the expected value, gives 100% divergences:

with pm.Model():
    v = pm.Normal("v", 4.0, 1.0)
    pm.Bernoulli("lik", p=f(v=v, x=x), observed=status)
    pm.sample()

Presumably the issue comes from the fact that v is an exponent of an exponent. I can’t think of how to reparameterize this function. Does anyone have any ideas? Or is there anything else I can try?

Thanks in advance
David

I think you are just running into overflow issues. For example, if you clip p your model samples just fine:

import pytensor.tensor as pt

eps = 10e-6
with pm.Model() as model:
    v = pm.Normal("v", 4.0, 45.0)
    # clip p to a reasonable range so it never rounds to exactly 0 or 1
    p = pt.clip(f(v=v, x=x), eps, 1 - eps)
    # save p into the inferenceData object
    p = pm.Deterministic("p", p)
    pm.Bernoulli("lik", p=p, observed=status)
    idata = pm.sample()

and produces:

import arviz as az

az.summary(idata)[["mean", "sd", "r_hat"]]
       mean     sd  r_hat
p[0]  1.000  0.000    NaN
p[1]  1.000  0.000    NaN
p[2]  1.000  0.001    1.0
p[3]  0.984  0.061    1.0
p[4]  0.718  0.269    1.0
p[5]  0.205  0.182    1.0
p[6]  0.026  0.032    1.0
p[7]  0.003  0.003    1.0
v     4.344  0.484    1.0

Note that the range of credible values of v in the posterior essentially prevents any useful variation in p for the first couple of observations (presumably because of the “extreme” values of x), which is likely a symptom of the same issue.

I would expect someone smarter than me to be able to reparameterize this so that there is greater numerical stability as v and/or x grows.

To figure out if something tricky can be done, I suspect we’ll need to know more about the model and the math that leads up to the exp of exp term.

Can you define log(p) instead of p and convert it to a logit with something like logp - log1p(-exp(logp))? The Bernoulli can be parametrized by a logit_p directly.
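
For concreteness, here is a quick numeric check of that conversion (plain numpy with an arbitrary example value; in the actual model log_p would be a PyTensor expression):

import numpy as np

log_p = np.log(0.3)                            # stand-in for a stably computed log p(y=1)
logit_p = log_p - np.log1p(-np.exp(log_p))     # logit(p) = log(p) - log(1 - p)
print(np.isclose(logit_p, np.log(0.3 / 0.7)))  # True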


Thank you for the suggestions!

Clipping works well, and is a usable solution. I only care about the posterior of v, so lack of variation in p for the first few data is probably fine.


Although clipping essentially solves this, I’d be curious if anyone has suggestions for how this may be reparameterised. Here are some more details on the model. The model is a Poisson single hit process with rate:

r = log(2) \times e^{log(b) \times (v + x) }

The probability of positive data (y=1) is:

p=1 - e^{-r}

Substituting in r and simplifying gives: p = 1 - 2^{-b^{x+v}}, which is where f(x, v) came from (with b=10).
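
Spelling out the substitution (using e^{log(b) \times (v + x)} = b^{v+x}):

r = log(2) \times b^{v+x}, \qquad p = 1 - e^{-log(2) \times b^{v+x}} = 1 - 2^{-b^{v+x}}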

(The likelihood is y \sim \text{Bernoulli}(p) )


I had thought about trying to pass logit(p) to pm.Bernoulli, but given this equation for p, I don’t think a nicer form for log(p) is possible?

The log converts that first exponent to multiplication

Your issue is with this: 1 - 2^{-10^{x + v}}

This will round to 1 in double precision floating point if x + v is greater than 1.8 or so. In double precision floating point, the largest number less than 1 is around 1-1e-16 and the smallest number greater than 0 is 1e-300 or so. There’s even less range in single-precision (as you get by default in JAX, for example).
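
For example, a quick numpy check of where the rounding kicks in (x_plus_v is just a scalar stand-in for x + v):

import numpy as np

x_plus_v = 1.8
p = 1.0 - 2.0 ** (-10.0 ** x_plus_v)  # 2**(-10**1.8) is ~1e-19, far below the float64 spacing near 1.0
print(p == 1.0)                       # True: p has rounded to exactly 1
print(np.spacing(1.0))                # ~2.2e-16, the gap between 1.0 and the next representable float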

So you want to flip things around and use just the -2**(-10**(x + v)) part. If PyMC doesn’t have the complementary parameterization, exploit the equality

\textrm{bernoulli}(y \mid p) = \textrm{bernoulli}(1 - y \mid 1- p).

But even that’s going to be a problem if you have to evaluate f—you want to give the Bernoulli the log probability or logit probability rather than the probability for stability.


Hi @ricardoV94 and @bob-carpenter, many thanks for your input. And apologies for the slow response, I’ve been away for a bit.

I think I’m still stuck. Even if I define the likelihood in terms of the negative data, using:

p(y=0) = 2^{-10^{x+v}}

then that does give a nice expression: log(p) = -log(2) \times 10^{x+v}

But to calculate logit(p) (to pass to pm.Bernoulli) I still need log(1-p) to plug into:

logit(p) = log(p) - log(1-p)

And I don’t think log(1-p) can be simplified to remove the exponent of an exponent…

The new GPT reasoning model is pretty good at this kind of thing. It’s suggesting this, where x is your x + v:

import numpy as np

def stable_log_ratio(x):
    a = np.power(10, x) * np.log(2)  # a = -log(2^(-10^x)), i.e. -log p(y=0)
    s = -a - np.log(-np.expm1(-a))   # Computes log(2^(-10^x) / (1 - 2^(-10^x))) stably
    return s
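
As a quick sanity check, it agrees with the naive formula in a regime where that formula is still representable (x = -3 is just an arbitrary test value):

x_test = -3.0
p0 = 2.0 ** (-10.0 ** x_test)                       # p(y=0), comfortably away from 0 and 1 here
naive = np.log(p0 / (1.0 - p0))
print(np.isclose(stable_log_ratio(x_test), naive))  # True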

The other thing to try would be to use log1p(-p) in place of log(1 - p) in the expression as you have it.

You don’t need to worry about that sort of low-level detail with PyMC, as the PyTensor backend does those stabilization rewrites automatically. Writing log(1 - pt.exp(x)) will achieve the same down the road :wink:

Also, just to be clear: users should use pytensor.tensor rather than numpy operations for PyMC models. The same function names/signatures that exist in numpy should exist in PyTensor.

Anyways, does the suggestion clear things up, @davipatti?

Thanks for the suggestions. Yes, using @bob-carpenter’s stable_log_ratio, the model samples without divergences.
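
For anyone landing here later, one way to wire this up (a sketch with my own naming; the helper is just @bob-carpenter’s function rewritten with pytensor.tensor ops, and the observations are flipped per the Bernoulli identity above so that logit(p(y=0)) can be passed directly as logit_p):

import pymc as pm
import pytensor.tensor as pt

def stable_log_ratio(z):
    a = pt.log(2.0) * 10.0 ** z        # a = -log p(y=0)
    return -a - pt.log(-pt.expm1(-a))  # logit(p(y=0)) = log p(y=0) - log p(y=1)

with pm.Model():
    v = pm.Normal("v", 4.0, 1.0)
    logit_p0 = stable_log_ratio(v + x)
    # Bernoulli(y | p) = Bernoulli(1 - y | 1 - p), so model the flipped outcomes
    pm.Bernoulli("lik", logit_p=logit_p0, observed=1 - status)
    idata = pm.sample()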

(I’m now getting errors when I try to set up a hierarchical model for multiple samples, but I think this is unrelated…)

Many thanks!