How to sample efficiently from time series data?

I have the following time series model

S[t+1] = S[t] - B[t]
E[t+1] = E[t] +B[t] - C[t]
I[t+1] = I[t+1] + C[t] - D[t]

B[t] ~ Binom(S[t], beta)
C[t] ~ Binom(E[t], gamma)
D[t] ~ Binom(I[t], delta)
beta, gamma, delta ~ Beta(a, b)

Only C and D are observed and we know S[0], E[0], I[0], a and b. What is the best way to represent this model in PyMC3?

Currently I do the following:

with pm.Model() as model:
    s0, e0, i0 = 100, 50, 25
    C = np.array([3, 5, 8, 13, 21, 26, 10, 3])
    D = np.array([1, 2, 3, 7, 9, 11, 5, 1])
    beta = pm.Beta('beta', 2, 10)
    gamma = pm.Beta('rho', 2, 10)
    delta = pm.Beta('gamma', 2, 10) 
    
    B = []
    st, et, it = s0, e0, i0
    for t in range(len(C)):
        bt = pm.Binomial(f'b_{t}', st, beta)
        ct = pm.Binomial(f'c_{t}', et, gamma, observed=C[t])
        dt = pm.Binomial(f'd_{t}', it, delta, observed=D[t])       
        
        st = st - bt
        et = et + bt - ct
        it = it + ct - dt
        B.append(bt)
    step1 = pm.Metropolis(B)
    step2 = pm.NUTS([beta, rho, gamma])
    trace = pm.sample(1000, tune=1000, step=[step1, step2])

However, this creates one Metropolis sampler for each B[t] and does not scale well for a long time series. I would like to sample the entire B at once with one sampler. I also have a transition distribution for B in mind that might help the convergence.

How can I do it with PyMC3?

You need to write the for loop into a theano.scan for performance.

You can specify that in https://docs.pymc.io/api/inference.html?highlight=metropolis#module-pymc3.step_methods.metropolis with proposal_dist kwarg

Something along the line of:

s0, e0, i0 = 100., 50., 25.
st0, et0, it0 = [theano.shared(x) for x in [s0, e0, i0]]

C = np.array([3, 5, 8, 13, 21, 26, 10, 3], dtype=np.float64)
D = np.array([1, 2, 3, 7, 9, 11, 5, 1], dtype=np.float64)

def seir_one_step(st0, et0, it0, beta, gamma, delta):
    bt0 = st0 * beta
    ct0 = et0 * gamma
    dt0 = it0 * delta
    
    st1 = st0 - bt0
    et1 = et0 + bt0 - ct0
    it1 = it0 + ct0 - dt0
    return st1, et1, it1

with pm.Model() as model:
    beta = pm.Beta('beta', 2, 10)
    gamma = pm.Beta('rho', 2, 10)
    delta = pm.Beta('gamma', 2, 10)

    (st, et, it), updates = theano.scan(
        fn=seir_one_step,
        outputs_info=[st0, et0, it0],
        non_sequences=[beta, gamma, delta],
        n_steps=len(C))

    ct = pm.Binomial('c_t', et, gamma, observed=C)
    dt = pm.Binomial('d_t', it, delta, observed=D)

    trace = pm.sample()

with model:
    bt = pm.Binomial('b_t', st, beta, shape=len(C))
    ppc_trace = pm.sample_posterior_predictive(trace, var_names=['b_t'])

If you are interested in SEIR like model, Prediction of Danish Covid19 cases is a great example.

2 Likes

Thank you for your response! I have a few clarifying questions. If I understand correctly, the following code

def seir_one_step(st0, et0, it0, beta, gamma, delta):
bt0 = st0 * beta
ct0 = et0 * gamma
dt0 = it0 * delta

st1 = st0 - bt0
et1 = et0 + bt0 - ct0
it1 = it0 + ct0 - dt0
return st1, et1, it1

uses mean value of each B[t], C[t] and D[t] to initialize S, E and I for len(C) time steps. Prior to sampling we specify that C[t] and D[t] have binomial distribution with given observed data. However, we do not specify the distribution of B before the line

trace = pm.sample()

Here are my questions:

  1. We are not directly using B, C or D to update S, E, I. (the seir_one_step uses only mean values.) Then how does the observed data affect values of the parameters?
  2. How does the model know B[t] has binomial likelihood? And how does it impute B?

The information is connected through ct = pm.Binomial('c_t', et, gamma, observed=C), where the output variable et is produced by the theano.scan

It does not, but as you are not observing B[t] anyway, it doesnt really matters to have precise likelihood.

I know that I am late to this question and in addition I am asking only a remotely related question, but I am trying to do something similar like light_kira and the inner workings of tt.scan confuse me.

I have dummed down the above functionality to:

s0= 100.

C = np.array([3, 5, 8, 13, 21, 26, 10, 3], dtype=np.float64)
D = np.array([1, 2, 3, 7, 9, 11, 5, 1], dtype=np.float64)

def seir_one_step(st0):
st1 = st0 + tt.as_tensor_variable(1.0)
return st1

sk0 = [tt.dscalar() for x in [s0]]
st, updates = theano.scan(fn=seir_one_step, outputs_info=[sk0], n_steps=len©)

f = theano.function(inputs=[sk0], outputs=(st), updates=updates)
f(1.0)
array([[2.],
[3.],
[4.],
[5.],
[6.],
[7.],
[8.],
[9.]])

But this outputs a 2d tensor.

The only way I seem to be able to generate a 1d tensor is by introducing a dummy second variable like this:

s0= 100.
st0 = [theano.shared(x) for x in [s0]]

C = np.array([3, 5, 8, 13, 21, 26, 10, 3], dtype=np.float64)
D = np.array([1, 2, 3, 7, 9, 11, 5, 1], dtype=np.float64)

def seir_one_step(st0, dt0):
st1 = st0 + tt.as_tensor_variable(1.0)
return st1, tt.as_tensor_variable(np.array(1.0, dtype=np.float64))

sk0,dk0 = [tt.dscalar() for x in [s0, 1.0]]
(st,_), updates = theano.scan(fn=seir_one_step, outputs_info=[sk0, dk0], n_steps=len©)

f = theano.function(inputs=[sk0, dk0], outputs=(st,), updates=updates)
f(1.0, 1.0)
[array([2., 3., 4., 5., 6., 7., 8., 9.])]

How can I get a 1d tensor as output without having to resort to a dummy second variable?