Multivariate time series inference

This is probably what you’re looking for. In the econometrics literature, what you describe (multiple variables [including population], sequential data) is typically called ‘panel data’. There’s a pymc3-specific hit for a google search here. In general the model is given an additional index (over time), allowing for lag-dependences, such as (for example, individual i at time t):

result_{i,t} ~ task_offset_{i} + population_offset_{i,t} + lag1_effect * result_{i, t-1} + lag2_effect * result_{i, t-2} + ... + error_{i, t}.

This either means modeling the response as a matrix (not all that standard), or unraveling into a very long vector (more standard).

3 Likes