Two questions about count models


1. My modeling problem is as follows: an athlete plays in a game during which they can take actions that benefit their team, for which they receive points P, or make mistakes, for which they receive demerits N. The quantity I am interested in modeling is X = P - N.


It makes more sense to me to model P and N individually, for example as negative binomials, and then sample X by defining it as a deterministic variable equal to P - N. Is there a way in PyMC to attach a likelihood to X? I ask because I already have several models that model X directly and I want to do model comparison, but it doesn’t seem possible to pass observed data to a deterministic variable.


2. Are there any standard references, examples, or tutorials for dealing with count data with unequal observation times? I have a large number of very influential observations that arise when a player participates in a game for an extremely short period of time but either scores a point or makes a mistake, which leads to an unrealistically high estimate of the rate at which they score or err. I suppose I could drop all observations from very short time periods, but that doesn’t feel very Bayesian.
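
One common way to handle unequal observation times is to treat time played as an exposure, so that the expected count is rate × time (in a regression this is a log(time) offset). The sketch below uses a plain Poisson count with a conjugate Gamma prior on the per-minute rate rather than a negative binomial, purely to keep the arithmetic visible. The function name `posterior_rate` and the prior values are illustrative assumptions, not taken from any model in this thread.

```python
def posterior_rate(points, minutes, prior_shape=2.0, prior_rate=40.0):
    """Posterior mean of a per-minute scoring rate.

    Assumes points[i] ~ Poisson(rate * minutes[i]) and
    rate ~ Gamma(prior_shape, prior_rate), so the posterior is
    Gamma(prior_shape + sum(points), prior_rate + sum(minutes)).
    The prior values are made up for illustration (prior mean 0.05/min).
    """
    shape = prior_shape + sum(points)
    rate = prior_rate + sum(minutes)
    return shape / rate


# A player who scores once in half a minute has a raw rate of 2 per minute,
# but the short exposure carries little weight against the prior.
short_stint = posterior_rate(points=[1], minutes=[0.5])

# A player with a long record is barely affected by the prior.
long_record = posterior_rate(points=[60], minutes=[1200])

print(round(short_stint, 4), round(long_record, 4))
```

With exposure in the likelihood, very short appearances stay in the data but contribute little information, so there is no need to drop them by hand. In a hierarchical model the shared prior would be learned from all players instead of being fixed as it is here.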

Thanks in advance!


Are P and N observed? Is X observed? If P and N are unobserved but X is observed, you are in a difficult spot. See this old but still relevant thread. Fortunately, the SMC/simulator interface has come a long way since that thread, if you are interested in using it.
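
If both P and N are observed, each can be given its own likelihood and X follows from them. If only X is observed, one alternative to simulation is to marginalize over the unobserved counts, since P(X = x) is a sum over all (P, N) pairs with P - N = x. The sketch below is not the SMC/simulator approach. It assumes P and N are independent negative binomials in the (mu, alpha) parameterization, and the names `nb_logpmf`, `diff_pmf`, and the truncation bound `max_count` are hypothetical helpers, not PyMC API.

```python
import math


def nb_logpmf(k, mu, alpha):
    """Log-PMF of a negative binomial with mean mu and dispersion alpha."""
    return (
        math.lgamma(k + alpha) - math.lgamma(k + 1) - math.lgamma(alpha)
        + alpha * math.log(alpha / (alpha + mu))
        + k * math.log(mu / (alpha + mu))
    )


def diff_pmf(x, mu_p, alpha_p, mu_n, alpha_n, max_count=500):
    """P(X = x) for X = P - N, with P and N independent negative binomials.

    Sums over the unobserved N (and P = N + x), truncating N at max_count.
    The truncation is an approximation; max_count must be large enough
    that the omitted tail is negligible for the parameters in use.
    """
    total = 0.0
    for n in range(max(0, -x), max_count + 1):
        p = n + x  # P is determined by N and the observed difference
        total += math.exp(
            nb_logpmf(p, mu_p, alpha_p) + nb_logpmf(n, mu_n, alpha_n)
        )
    return total


# Probability of a net score of zero when points average 3 and demerits 1.
print(diff_pmf(0, mu_p=3.0, alpha_p=2.0, mu_n=1.0, alpha_n=2.0))
```

A log-probability of this form is what a custom likelihood on X would need, which would put the P - N model on the same footing as models that describe X directly. The cost is one truncated sum per observation, so it may be slow for large counts.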