I am struggling with the following problem. I have a set of Poisson-distributed data whose mean values arise from a log-normal distribution. To be more explicit: I am observing a phenomenon that should follow a log-normal distribution, but the measurement method is a Poisson process.
I would like to constrain the parameters of the log-normal distribution (its mean and standard deviation). It should be possible to distinguish the observed distribution from a Poisson process with a single mean value: in the plot below I show example histograms of values from a simulated constant-mean Poisson (orange) and from a Poisson process with a log-normal underlying distribution:
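The same comparison can be reproduced with NumPy alone (the parameter values below are illustrative, not taken from the original post): a Poisson process with a fixed mean has variance roughly equal to its mean, while drawing each rate from a log-normal first produces visible overdispersion.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 100_000

# Constant-mean Poisson: every draw uses the same expectation value.
mean_const = 10.0
const_poisson = rng.poisson(mean_const, size=N)

# Log-normal / Poisson mixture: each draw gets its own rate,
# itself drawn from a log-normal distribution.
mlog, sd = np.log(10.0), 0.5
rates = rng.lognormal(mean=mlog, sigma=sd, size=N)
mixed_poisson = rng.poisson(rates)

# For a pure Poisson, variance / mean is ~1; the mixture is clearly above 1.
print(const_poisson.var() / const_poisson.mean())
print(mixed_poisson.var() / mixed_poisson.mean())
```

Histogramming `const_poisson` against `mixed_poisson` gives the kind of plot described above: the mixture has a visibly wider, more skewed distribution of counts.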
To implement this problem in PyMC3 I proceeded in the following way:
model = pm.Model()
with model:
    sd = pm.HalfCauchy('sd', beta=2.0)
    mlog = pm.Normal('med', mu=np.log(meanv), sd=10.)
    pred = pm.Lognormal('lognorm', mu=mlog, sd=sd)
    obs = pm.Poisson('obs', mu=pred, observed=realiz)
    trace = pm.sample(1000)
However, when I proceed in this way, it looks like the ‘mlog’ and ‘sd’ variables are ignored, and I am just sampling a single value of ‘lognorm’ as if it were the one expectation value of the Poisson process. How should I proceed?
Your plot regarding a log-normal / Poisson process makes sense in that we see extra variance as compared to a Poisson with fixed mean parameter. I am a little unsure as to the meaning of this statement:
Do you mean to say that you expect to find extra posterior variance in lognorm? Or do you mean to say that the samples for lognorm are all identical, with zero variation?
Thanks for your answer. Indeed, there is extra variance in the data, as you point out, which means that it should be possible to fit for the mean and standard deviation of the log-normal distribution, i.e. the ‘mlog’ and ‘sd’ variables above.
What I mean is that ‘lognorm’ simply fits to the mean value of the Poisson process, while the variables I am really interested in, ‘mlog’ and ‘sd’, remain untouched: their posteriors are exactly the same as their priors, as if they were completely ignored.
How would you code the problem that I am referring to?
I think you need to use the shape parameter in Lognormal. Currently you are fitting just a single latent Poisson rate to your observed data realiz. What you want is to draw the Poisson lambda from the LogNormal distribution separately for each observation, and then draw a single Poisson realisation from each of those rates.
with pm.Model() as model:
    sd = pm.HalfCauchy('sd', beta=2.0)
    mlog = pm.Normal('mlog', mu=np.log(X.mean()), sd=10.)
    # shape=N (with N = len(X)) gives each observation its own latent rate
    pred = pm.Lognormal('pred', mu=mlog, sd=sd, shape=N)
    obs = pm.Poisson('obs', mu=pred, observed=X)
    trace = pm.sample(1000, return_inferencedata=True)
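Independently of the MCMC fit, you can sanity-check what ‘mlog’ and ‘sd’ should come out to with a quick method-of-moments calculation (a sketch under the same mixture assumption; the function and variable names here are mine, not from the thread). If the rate is LogNormal(mlog, sd), the count mean is m = exp(mlog + sd²/2) and the count variance is m + m²·(exp(sd²) − 1), which can be inverted:

```python
import numpy as np

def lognormal_poisson_moments(counts):
    """Method-of-moments estimates of (mlog, sd) for counts that are
    Poisson-distributed with a LogNormal(mlog, sd) rate.
    Uses E[Y] = m = exp(mlog + sd^2/2) and Var[Y] = m + m^2 * (exp(sd^2) - 1)."""
    m, v = counts.mean(), counts.var()
    excess = (v - m) / m**2          # equals exp(sd^2) - 1 in expectation
    if excess <= 0:
        raise ValueError("no overdispersion: data look like a plain Poisson")
    sd2 = np.log1p(excess)
    return np.log(m) - sd2 / 2, np.sqrt(sd2)

# Quick check on simulated data with known (illustrative) parameters.
rng = np.random.default_rng(1)
true_mlog, true_sd = np.log(10.0), 0.5
counts = rng.poisson(rng.lognormal(true_mlog, true_sd, size=200_000))
mlog_hat, sd_hat = lognormal_poisson_moments(counts)
```

These point estimates are crude compared to the full posterior, but if the PyMC3 fit above is working, its posterior means for ‘mlog’ and ‘sd’ should land in the same neighbourhood.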