Hey there,
I want to build a model for data where each row has a pair of individuals who control two count values, one for each individual, that come from a Poisson process. Furthermore, each individual has two parameters that control these values. The Poisson distribution is parameterized such that:
X_a ~ Poisson(theta_a)
X_b ~ Poisson(theta_b)
with
log(theta_a) = add_a - sub_b
log(theta_b) = add_b - sub_a
That is, each individual has a true “add” rate and “sub(tract)” rate that contributes to their own count value and that of their paired individual. In this example, I’d like these parameters to vary over time according to a random walk. Thus, the add/subtract rates will be unique to each combination of (individual, time interval).
I tried to implement this on some mock data using the following design in PyMC3:
import pymc3 as pm
import theano.tensor as tt
...
with pm.Model() as model:
intercept = pm.Normal('intercept', mu=0, sd=1, testval=0.25)
adds_star = pm.GaussianRandomWalk("adds_star",
sd=0.001,
shape=(n_intervals, n_individuals),
testval=add_estimate)
subs_star = pm.GaussianRandomWalk("subs_star",
sd=0.001,
shape=(n_intervals, n_individuals),
testval=sub_estimate)
# normalize parameters -sum to zero constraint
adds_est = pm.Deterministic("adds", adds_star - tt.tile(adds_star.mean(axis=1), (adds_star.shape[1], 1)).T)
subs_est = pm.Deterministic("subs", subs_star - tt.tile(subs_star.mean(axis=1), (subs_star.shape[1], 1)).T)
a_theta = tt.exp(intercept +
adds_est[interval_a, id_a] +
subs_est[interval_b, id_b])
b_theta = tt.exp(intercept +
adds_est[interval_b, id_b] +
subs_est[interval_a, id_a])
a_counts = pm.Poisson('a_counts',
mu=a_theta,
observed=obs_value_a)
b_counts = pm.Poisson('b_counts',
mu=b_theta,
observed=obs_value_b)
Where interval_a
, interval_b
, id_a
, id_b
, obs_value_a
, obs_value_b
are numpy arrays that hold the individual IDs and time interval IDs for each row, and the observed counts for each individual.
I tried making this model even with some mock data that should match the priors relatively closely but I have the following issues with inference:
- When I run the model with NUTS I get this “bad energy” error like in some other issues
- I tried changing the
init
argument but the best I could do there was sampling which spent a really long time sampling and did not ultimately converge, I don’t think. - When I use a Metropolis sampler the model runs rather quickly but is unable to achieve estimates that look much like the “true” mock rates for each individual that I generated.
My question is: is my model obviously misspecified somewhere? Does someone have a good resource for a model like this where a Poisson parameter changes over time? I’d also appreciate more general tips on models involving Random Walk parameters - I’ve read Thomas’ blog post here and tried to follow a somewhat similar example.
Thanks,
Johannes
P.S. Let me know if this question is poorly written or the problem poorly stated and I can try to upload my notebook for anyone to check out/run.