Poisson Model with Random Walk Parameters

jmharkins · June 29, 2018, 7:24pm

Hey there,

I want to build a model for data where each row has a pair of individuals who control two count values, one for each individual, that come from a Poisson process. Furthermore, each individual has two parameters that control these values. The Poisson distribution is parameterized such that:

X_a ~ Poisson(theta_a)
X_b ~ Poisson(theta_b)

with

log(theta_a) = add_a - sub_b
log(theta_b) = add_b - sub_a

That is, each individual has a true “add” rate and “sub(tract)” rate that contributes to their own count value and that of their paired individual. In this example, I’d like these parameters to vary over time according to a random walk. Thus, the add/subtract rates will be unique to each combination of (individual, time interval).

I tried to implement this on some mock data using the following design in PyMC3:

import pymc3 as pm
import theano.tensor as tt

...

with pm.Model() as model:
    intercept = pm.Normal('intercept', mu=0, sd=1, testval=0.25)
    adds_star = pm.GaussianRandomWalk("adds_star",
                          sd=0.001,
                          shape=(n_intervals, n_individuals),
                          testval=add_estimate)
    subs_star = pm.GaussianRandomWalk("subs_star",
                          sd=0.001,
                          shape=(n_intervals, n_individuals),
                          testval=sub_estimate)
    
    # normalize parameters -sum to zero constraint
    adds_est = pm.Deterministic("adds", adds_star - tt.tile(adds_star.mean(axis=1), (adds_star.shape[1], 1)).T)
    subs_est = pm.Deterministic("subs", subs_star - tt.tile(subs_star.mean(axis=1), (subs_star.shape[1], 1)).T)

    a_theta = tt.exp(intercept + 
                     adds_est[interval_a, id_a] + 
                     subs_est[interval_b, id_b])
    b_theta = tt.exp(intercept + 
                     adds_est[interval_b, id_b] + 
                     subs_est[interval_a, id_a])

    a_counts = pm.Poisson('a_counts',
                            mu=a_theta,
                            observed=obs_value_a)
    b_counts = pm.Poisson('b_counts',
                            mu=b_theta,
                            observed=obs_value_b)

Where interval_a, interval_b, id_a, id_b, obs_value_a, obs_value_b are numpy arrays that hold the individual IDs and time interval IDs for each row, and the observed counts for each individual.

I tried making this model even with some mock data that should match the priors relatively closely but I have the following issues with inference:

When I run the model with NUTS I get this “bad energy” error like in some other issues
I tried changing the init argument but the best I could do there was sampling which spent a really long time sampling and did not ultimately converge, I don’t think.
When I use a Metropolis sampler the model runs rather quickly but is unable to achieve estimates that look much like the “true” mock rates for each individual that I generated.

My question is: is my model obviously misspecified somewhere? Does someone have a good resource for a model like this where a Poisson parameter changes over time? I’d also appreciate more general tips on models involving Random Walk parameters - I’ve read Thomas’ blog post here and tried to follow a somewhat similar example.

Thanks,
Johannes

P.S. Let me know if this question is poorly written or the problem poorly stated and I can try to upload my notebook for anyone to check out/run.

junpenglao · June 30, 2018, 9:15am

You are not doing subtraction here?

Yeah I think a notebook would help - I would like to see how you generate the data etc.

jmharkins · July 9, 2018, 2:28am

Hi Junpeng,

Thanks for taking a look. I asked this question right before going into spotty internet access for a week, so I had to leave it hanging for a while, but if you’d still be willing to have a look at the notebook, I’d really appreciate it! You’re right that I set up the subtraction/addition backwards in the question but it was simpler for me to conceive of the whole thing using addition, so I just left it that way in the notebook (it shouldn’t matter, as long as the usage is consistent). The only upshot is that the interpretation of the parameters is opposite, so an individual with a positive subtraction parameter contributes negatively to the overall theta of their partner in a pair, as opposed to positively.

junpenglao · July 10, 2018, 5:45am

So running your notebook, here are my thoughts/debugging below:

Metropolis, bad idea for complex model
Use default sampling (with NUTS), get gradient error
Check prior, sd in GaussianRandomWalk is likely too small, change to sd=1
Sample seems to go well (ie it ran without returning error), but get The gelman-rubin statistic is larger than 1.4 for some parameters. warning.
check trace for all parameters, the normalizing in

adds_est = pm.Deterministic("adds", adds_star - tt.tile(adds_star.mean(axis=1), (adds_star.shape[1], 1)).T)
subs_est = pm.Deterministic("subs", subs_star - tt.tile(subs_star.mean(axis=1), (subs_star.shape[1], 1)).T)

makes the model unidentifiable, commented it out and modify the code, reran model.

still got warning, check trace and model still unidentifiable because of the subtraction/addition. Indeed the model is still over-specify (there are more parameters than needed).

I am not sure what is the best approach here, maybe check out how others model similar problem:

jmharkins · July 10, 2018, 8:54pm

Thanks for pointers, Junpeng. I’ll look into them and let you know what I turn up.

FWIW, I thought that the lines where I subtract the interval-specific means were what made the model identifiable, rather than unidentifiable. The add/subtract parameters I think either need to be mean-zero, or else pegged to a “baseline” individual’s rate.

Topic		Replies	Views
pm.Poisson behaves unexpectedly inside model context Questions	2	400	January 1, 2022
How to model time-dependent variables in pymc3 Questions	5	4049	November 6, 2017
Poisson Binomial in PyMC3 Questions	5	799	February 13, 2022
Generalized Poisson distribution Development	1	590	January 12, 2021
Bivariate Poisson v5 modeling	4	904	December 15, 2022

Poisson Model with Random Walk Parameters

Related topics