Inferring exponential decay times


Imagine I have a process that generates the following type of data:
the number of counts observed within a time interval \Delta t, see the figure below.

I generate the data from an exponential RV with constant \lambda_1 = 0.1, then create a histogram with bin width \Delta t = 1 and save the counts per bin --> these counts form component 1. Additionally, there is a “background/bgd” signal in each bin that is Poisson distributed with \lambda_{bgd} = 5. The observed signal is the sum of component 1 + background.
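For concreteness, here is a minimal NumPy sketch of this generating process (the values of `n_events` and `n_bins` are my own assumptions; the post only gives \lambda_1, \lambda_{bgd}, and \Delta t):

```python
import numpy as np

rng = np.random.default_rng(42)

lambda1 = 0.1      # rate of the exponential component
lambda_bgd = 5.0   # Poisson background rate per bin
n_events = 1000    # number of exponential draws (assumed, not stated in the post)
dt = 1.0           # bin width, Delta t
n_bins = 50        # number of time bins (assumed)

# Component 1: draw exponential "decay times" and histogram them in dt-wide bins
times = rng.exponential(scale=1.0 / lambda1, size=n_events)
edges = np.arange(0.0, (n_bins + 1) * dt, dt)
counts_comp1, _ = np.histogram(times, bins=edges)

# Background: independent Poisson counts in every bin
counts_bgd = rng.poisson(lambda_bgd, size=n_bins)

# Observed signal: sum of the two components per bin
counts_total = counts_comp1 + counts_bgd
```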

Now, I want to infer \lambda_1 and \lambda_{bgd}. I tried the following (after a suggestion here, answer 2 or 3 from “fonnesbeck”):

import numpy as np
import pymc3 as pm
import theano.tensor as tt

# counts_total: the generated observations,
# i.e. the black dots in the figure above.
# p: per-bin probabilities (definition omitted here)

with pm.Model() as model:
    lambda1 = pm.HalfNormal('lambda1', 0.15)
    lambdabgd = pm.DiscreteUniform('lambdabgd', 3, 6)
    cts_comp1 = pm.Binomial('b', n=1000, p=p, shape=len(counts_total))

    bgd_counts = pm.Poisson('bgd_counts', lambdabgd,
                            observed=np.abs(counts_total - cts_comp1))
    trace = pm.sample(500, tune=100)

So, the logic I was thinking of: choose a \lambda_1 (continuous), choose a \lambda_{bgd} (discrete), and calculate the probability of an observation falling in bin t_i (exponentially decaying).

Then, express the (latent) background counts as a Poisson RV whose value is the difference between the observation and the stochastic RV (vector) associated with component 1.

But it can happen that the difference “observation - stochastic component 1” becomes negative… Then PyMC3 complains; I assume there is no logp associated with a negative observation for a Poisson.
(Here I smuggled in an np.abs… to make it run.)

Looking at the results, the posterior for \lambda_1 appears to be moving in the right direction, and \lambda_{bgd} has also concentrated, though at 6 rather than the expected 5. That could be due to the “np.abs”, I guess… So it seems to be doing something approximately right, but I am not sure if this is the way to go…
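That the np.abs inflates the inferred background is easy to check directly: whenever the latent cts_comp1 overshoots, the residual handed to the Poisson likelihood gets folded back to a positive value. A toy NumPy check (my own sketch, not the actual sampler; the ±3 mismatch distribution is an arbitrary stand-in for the overshoot):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

bgd = rng.poisson(5.0, size=n)       # true background counts, mean 5
err = rng.integers(-3, 4, size=n)    # zero-mean mismatch between the latent
                                     # component-1 guess and the truth (toy choice)

residual = bgd - err                 # what "counts_total - cts_comp1" would be
print(residual.mean())               # close to 5: the raw difference is unbiased
print(np.abs(residual).mean())       # larger: the absolute value inflates it
```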

(Note: only 200 samples; for some reason I am only getting about 2 draws/s.)


Can anyone give me a hint on how this could be done more cleanly? How can/should I treat this kind of problem?

Best regards

Maybe instead of observing the difference between two components, you could model the total observed counts as Poisson, with a lambda that is the sum of two components: one that is constant over time (bgd) and one that decays exponentially over time?
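This works because of the Poisson superposition property: the sum of independent Poisson counts is itself Poisson with the summed rate. A quick numerical check (the rates 3 and 5 are arbitrary demo values):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

signal = rng.poisson(3.0, size=n)   # "decaying" component (fixed rate for the demo)
bgd = rng.poisson(5.0, size=n)      # background component
total = signal + bgd

# For a Poisson(3 + 5) = Poisson(8) variable, mean and variance both equal 8
print(total.mean(), total.var())
```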

Couple things:

First, do you have a good reason for forcing your background signal to have an integer rate? That seems unlikely, particularly if this is just instrument noise. Besides, I believe you’ll in general do better with continuous distributions than discrete ones unless you have no other choice.

I’m a little confused by what you’re doing with p: why is a Binomial coming into play, and why all the 0.5s?

I think you’re also conflating the exponential decay rate with the Poisson rate at each time point. Unless I’m very mistaken, your histogram method is the wrong way to synthesize such data: the distribution of counts over a certain interval is not the same as the number of counts over each of successive intervals (just think, your x-axis units are different).

Further to what @bdyetton said, see this paper, particularly Eq 31 (page 8):

Following that equation, I believe your model should be:

with pm.Model() as model:
    Γ = pm.HalfNormal('Γ',1) # Rate of exponential decay for main Poisson rate
    λ = pm.HalfNormal('λ',10) # Background Poisson rate
    N = pm.HalfNormal('N',1000) # Scale factor / initial Poisson rate

    μ = N*Γ*pm.math.exp(-Γ*t)+λ # Observed Poisson (count) rate as a function of time
    counts = pm.Poisson('counts', mu=μ, observed = counts_total)

Note that you should use a similar approach to generate the simulated data against which you check your code. Also, that paper includes an ε, the probability of detection, which I don’t include here. If you don’t have very tight a priori estimates or bounds on N and ε, including both in your model would render them unidentifiable and result in very large uncertainties. (Putting in a multiplier like 100 would just be so that the Half-Normal prior can be specified on a smaller scale.)
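Simulating data the same way the model assumes (rather than via the histogram construction) is then a one-liner with NumPy; the particular parameter values here are just for illustration:

```python
import numpy as np

rng = np.random.default_rng(123)

t = np.arange(50)     # time points / bin centers
Gamma = 0.1           # decay rate of the main component
lam = 5.0             # background Poisson rate
N = 1000              # scale factor

mu = N * Gamma * np.exp(-Gamma * t) + lam   # Poisson rate as a function of time
counts_total = rng.poisson(mu)              # simulated observations for the model above
```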

(P.S., I haven’t actually run that code, so caveat emptor. In particular, those prior means may be too weakly informative, and with means substantially far from 1 you may run into divergences or other sampling issues, so you may wish to re-scale them.)