Hi
Image I have a process that generates following type of data.
Number of counts observed within a time interval \Delta t, see Figure below.
I generate the data from an exponential RV with constant \lambda_1 = 0.1, then create a histogram over \Delta t = 1 and save the counts per bin --> counts form component 1. Additionally, there is a “background/bgd” signal on each bin that is poisson distributed with \lambda_{bgd} = 5. The observed signal is the sum of component1 + background.
Now, I want to infer \lambda_1 and \lambda_{bgd}. I trie the following. (After a suggestion here, anser 2 or 3 from “fonnesbeck”: https://stats.stackexchange.com/questions/56543/pymc-how-can-i-define-a-function-of-two-stochastic-variables-with-no-closed-fo?rq=1)
import theano.tensor as tt
# counts_total: denotes the generated observations,
i.e. the black dots in the figure above.
with pm.Model() as model:
lambda1 = pm.HalfNormal('lambda1',0.15)
lambdabgd = pm.DiscreteUniform('lambdabgd',3,6)
p=pm.Deterministic('p',tt.exp(-lambda1*(t[:-1]+0.5))*(tt.exp(0.5*lambda1)-tt.exp(-0.5*lambda1)))
cts_comp1 = pm.Binomial('b',n=1000,p=p,shape=len(counts_total))
bgd_counts = pm.Poisson('bgd_counts',lambdabgd,observed=np.abs(counts_total-cts_comp1))
trace = pm.sample(500,tune=100)
So, the logic I was thinking; choose a \lambda_1(continuous), choose a \lambda_2 (discrete), calculate the probability for an observation in bin t_i (exponentially decaying).
Then, express the (latent) bgd-counts as a poisson RV with value as difference between observation and the stochastic RV(vector) associated with component1.
But, It can happen that the difference “observation - stochastic. component1” gets negative… Then pymc3 complains, I assume no logp associated with a negative observation for poisson…
(here i smuggled in an np.absolute… to make it run)
Looking at the results I can see that the posterior for \lambda_1 appears to be moving towards the right direction, also \lambda_bgd has concentrated, however not at 5 as expected but 6, that could be due to the “np.absolute” i guess… So, it seems to be doing sth. approximately right… but I am not sure If this is the way to go…
(note, only 200 samples, for some reason I am sampling with 2 draws/s)
Can anyone hint me, how this could be done cleaner? How can/should I treat this kind of problem?
Best regards