Reconciliation via Poisson Sampling

Hello, everyone.

I am trying to run some models that perform a kind of reconciliation over multiple levels. The idea is to have a base level (a.k.a. lvl3) which is Poisson distributed with a given mean (lvl3). This base level is then aggregated through the S matrix to generate the observed data of the next level up. The idea came from this paper: https://arxiv.org/pdf/2207.09322.pdf. The implementation follows:

import numpy as np
import pymc as pm

lvl3 = np.array([0,1,1,0,1,1,2,1])
lvl2 = np.array([1,1,2,3])
lvl1 = np.array([4,6])
lvl0 = np.array([10])

S_2 = np.array([[1,1,0,0,0,0,0,0],[0,0,1,1,0,0,0,0],[0,0,0,0,1,1,0,0],[0,0,0,0,0,0,1,1]])
S_1 = np.array([[1,1,0,0],[0,0,1,1]])
S_0 = np.array([1,1])

with pm.Model() as model:
    l3 = pm.Poisson('l3', mu=lvl3, shape=lvl3.shape)
    l2 = pm.Poisson('l2', mu=lvl2, shape=lvl2.shape, observed=S_2@l3)
    l1 = pm.Poisson('l1', mu=lvl1, shape=lvl1.shape, observed=S_1@l2)
    Y = pm.Poisson('Y', mu=lvl0[0], observed=S_0@l1)
    trace = pm.sample(draws=5000, tune=3000)
>>> trace.posterior.l3.mean(['chain', 'draw']).values
# array([0.     , 1.0997 , 1.078  , 0.     , 0.9671 , 1.0333 , 2.16785, 1.0635 ])

Even though I can sample (and the results are fairly reasonable), it takes ages when we scale the base level to, say, one thousand items. So, my first question is: is there a way to sample discrete models faster? I noticed that external samplers do not support discrete models.

A second approach that I tried was to use the following model:

lvl3 = np.array([0,1,1,0,1,1,2,1])
lvl2 = np.array([1,1,2,3])
lvl1 = np.array([4,6])
lvl0 = np.array([10])

S_2 = np.array([[1,1,0,0,0,0,0,0],[0,0,1,1,0,0,0,0],[0,0,0,0,1,1,0,0],[0,0,0,0,0,0,1,1]])
S_1 = np.array([[1,1,0,0],[0,0,1,1]])
S_0 = np.array([1,1])
                   
with pm.Model() as model:
    l3 = pm.Poisson("l3", mu=lvl3, shape=lvl3.shape)
    l2 = pm.Potential("l2", pm.logp(pm.Poisson.dist(mu=lvl2, shape=lvl2.shape), S_2@l3))
    l1 = pm.Potential("l1", pm.logp(pm.Poisson.dist(mu=lvl1, shape=lvl1.shape), S_1@l2))
    Y = pm.Potential("Y", pm.logp(pm.Poisson.dist(mu=lvl0[0]), S_0@l1))    
    
    posterior = pm.sample(draws=5000, tune=3000)

I tried using an exponential distribution instead of a Poisson distribution on l3 (mainly to check the speed); sampling became almost immediate with the numpyro sampler running in parallel, but the results were odd. To check whether there was some caveat with this implementation using pm.Potential, I tried running it with only the Poisson, and it failed. The error follows:

SamplingError: Initial evaluation of model at starting point failed!
Starting values:
{'l3': array([0, 1, 1, 0, 1, 1, 2, 1])}

Logp initial evaluation results:
{'l3': -6.31, 'l2': -4.8, 'l1': -inf, 'Y': -inf}
You can call `model.debug()` for more details.

Then, my second question: is there an alternative approach that would speed up sampling while still giving reasonable results?

Your first model shouldn’t manage to run, because you have RVs in the observations, which PyMC doesn’t allow. What version of PyMC are you using?

If it’s a recent one, I suspect data @ rv is returning some weird constant numpy array (without ever invoking PyTensor), and the results you got were only accidentally good. You can print the result of that operation to confirm.

Calling pt.matmul(data, rv) should correctly cast data to a tensor, and then PyMC will error out when you try to use it in observed.
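For example, a quick check along those lines (just a sketch, reusing S_2 and l3 from your first model, with pt being pytensor.tensor):

print(S_2 @ l3)            # inspect what numpy's @ actually returns here
print(pt.matmul(S_2, l3))  # for comparison: a proper symbolic PyTensor variable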

Hello, Ricardo.

>>> pm.__version__
# '5.10.2'

I changed it as you proposed, but it is still an up-and-running example.

>>> import pytensor.tensor as pt
>>> pt.matmul(S_2, l3)
# DropDims{axis=1}.0

And for the complete application:

lvl3 = np.array([0,1,1,0,1,1,2,1])
lvl2 = np.array([1,1,2,3])
lvl1 = np.array([4,6])
lvl0 = np.array([10])

S_2 = np.array([[1,1,0,0,0,0,0,0],[0,0,1,1,0,0,0,0],[0,0,0,0,1,1,0,0],[0,0,0,0,0,0,1,1]])
S_1 = np.array([[1,1,0,0],[0,0,1,1]])
S_0 = np.array([1,1])

with pm.Model() as model:
    l3 = pm.Poisson('l3', mu=lvl3, shape=lvl3.shape)
    l2 = pm.Poisson('l2', mu=lvl2, shape=lvl2.shape, observed=pt.matmul(S_2, l3))
    l1 = pm.Poisson('l1', mu=lvl1, shape=lvl1.shape, observed=pt.matmul(S_1, l2))
    Y = pm.Poisson('Y', mu=lvl0[0], observed=pt.matmul(S_0, l1))
    trace = pm.sample(draws=5000, tune=3000)

trace.posterior.l3.mean(['chain', 'draw']).values
# array([0.     , 1.06295, 1.10495, 0.     , 1.0249 , 1.0223 , 2.1539 , 1.0237 ])

Seems like our check is dumb: in this case there is a cast from float to integer (since Poisson is discrete, but matmul returns a float), and PyMC thinks that’s fine. If you replace your Poisson with a Normal you will see the error raised.
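For instance, something like this sketch (only the distribution is swapped) is expected to fail:

with pm.Model():
    l3 = pm.Poisson("l3", mu=lvl3, shape=lvl3.shape)
    # With a continuous likelihood there is no int cast hiding the dependency,
    # so PyMC should complain that the observed data depends on a random variable
    l2 = pm.Normal("l2", mu=lvl2, observed=pt.matmul(S_2, l3))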

In your case the logp seems to come out fine though.

I would still suggest you go with model 2 that uses Potential, since you are directly interfering with the logp component of the model. We may batch the observed in the future.

How should I fix the problem with the second implementation? As it stands, I can’t get it to run… :confused:

Speed-wise nothing obvious comes to mind, but the l3 entries with mu=0 are not well behaved, since the only value they can generate is 0. That means you can get rid of those, right?

Your problem with the Potential model is that you are using l2 and l1 coming from the Potential, but those are just the logps, not the values like you had before. You want to use l2 = S_2 @ l3 in what you called the l1 Potential:

with pm.Model() as model:
    l3 = pm.Poisson("l3", mu=lvl3, shape=lvl3.shape)
    # Deterministics hold the aggregated *values* at each level
    l2 = pm.Deterministic("l2", S_2 @ l3)
    l1 = pm.Deterministic("l1", S_1 @ l2)
    l0 = pm.Deterministic("l0", S_0 @ l1)
    # Potentials add the Poisson log-likelihood of each aggregated level
    l2_pot = pm.Potential("l2_pot", pm.logp(pm.Poisson.dist(mu=lvl2, shape=lvl2.shape), l2))
    l1_pot = pm.Potential("l1_pot", pm.logp(pm.Poisson.dist(mu=lvl1, shape=lvl1.shape), l1))
    l0_pot = pm.Potential("l0_pot", pm.logp(pm.Poisson.dist(mu=lvl0[0]), l0))
    
model.point_logps()  # no -inf should show up here
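After that, sampling as in your first attempt should work, for example:

with model:
    trace = pm.sample(draws=5000, tune=3000)

trace.posterior.l3.mean(['chain', 'draw'])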

Works like a charm!! Thanks a lot, Ricardo.

One last question :yum:. How can I speed up this sampling? I know that discrete models cannot be sampled using numpyro, for example, but it takes ages with hundreds or thousands of data points. Would I need to change to some continuous model to benefit from faster sampling methods?

I would suggest getting rid of the zeros. If you have many of them in your larger dataset, that reduces the number of parameters considerably. You know their posterior is exactly zero anyway.
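Something along these lines should work (just a sketch, reusing the arrays from above; nonzero, lvl3_nz, S_2_nz and reduced_model are names I made up for illustration):

nonzero = lvl3 > 0        # items with mu=0 can only ever be 0, so drop them
lvl3_nz = lvl3[nonzero]
S_2_nz = S_2[:, nonzero]  # drop the matching columns of the aggregation matrix

with pm.Model() as reduced_model:
    l3 = pm.Poisson("l3", mu=lvl3_nz, shape=lvl3_nz.shape)
    l2 = pm.Deterministic("l2", S_2_nz @ l3)
    l1 = pm.Deterministic("l1", S_1 @ l2)
    l0 = pm.Deterministic("l0", S_0 @ l1)
    l2_pot = pm.Potential("l2_pot", pm.logp(pm.Poisson.dist(mu=lvl2, shape=lvl2.shape), l2))
    l1_pot = pm.Potential("l1_pot", pm.logp(pm.Poisson.dist(mu=lvl1, shape=lvl1.shape), l1))
    l0_pot = pm.Potential("l0_pot", pm.logp(pm.Poisson.dist(mu=lvl0[0]), l0))

The dropped items contribute exactly zero to every aggregate, so the sums higher up the hierarchy are unchanged.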

Otherwise, you may need to implement a better sampler for the Poisson variable. The Metropolis sampler doesn’t even know a Poisson can’t take negative values, so it will be proposing useless values many times. This is, however, something I would only suggest as a last resort.

Otherwise, discrete variables are just trickier to sample; I’m afraid you just have to throw compute at it. You can try to use the JAX backend for the PyMC sampler in case you have a GPU. That could be faster.

You can do that by wrapping your call to pm.sample like this:

import pytensor

with model, pytensor.config.change_flags(mode="JAX"):
    pm.sample()

However, I am not sure whether that will work with multiple chains in parallel, so you may need to revert to a single chain.
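For example, a single-chain run would look roughly like this:

with model, pytensor.config.change_flags(mode="JAX"):
    trace = pm.sample(draws=5000, tune=3000, chains=1, cores=1)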