Filtering (e.g. particle filter, sequential MC)

Hello folks,

How could I do standard state-space filtering with pyMC3?

E.g. say I have a state-space model with transition equation for hidden state $x_t=g(x_{t-1}) + eta$ and observation equation $y_t=f(x_t)+eps$.

I use pyMC3 to estimate the model’s parameters using data $y_1,…,y_T$ but then how do I apply the estimated model in “online” manner?

I.e. when I get a new observation $y_{T+1}$ – how do I estimate the filtering distribution $p(x_{T+1} | y_1,…,y_{T+1})$ of the next state $x_{T+1}$? Without just rerunning the whole MCMC with an extra one datapoint added (which will be slow).

Thanks a lot in advance!

1 Like

So you would like to build a particle filter with a PyMC3 model within each step for parameter updates?

I was thinking about a similar problem awhile ago. I think you can do it this way:
1, Build you model and estimate the posterior using data $y_1,…,y_T$ (which you already did).
2, Get the trace from above, that’s all your particles (each with a set of parameters).
3, Build a new model exactly as above without the observed.
for example, if before you have:

with pm.Model() as model1:
    yobs = pm.Normal('obs', ..., observed=y_t)

the new model should be:

with pm.Model() as model2:
    #same as above
    #without setting the observed
    yobs = pm.Normal('obs', ...)

4, for any new data $y_k$, perform filtering:

for idx in range(trace_lenght):
    param = trace[idx]
    # plug in the new observation
    param['obs'] = y_k
    # evaluate the log-likelihood using the new model
    loglike[idx] = model2.logp(point=param)

# update particles using loglike, save it in new trace

@hvasbath programmed the SMC sampler in PyMC3, so it would be great for him to weight in on this as well.

1 Like

Thanks a lot, Junpeng for suggestion!

Let me think about your suggestion in a few hours when I’ll get back home, meanwhile, I’d like to clarify my question a bit:

My state-space model has 1) parameters theta and 2) unobserved states x_t.

I could estimate the parameters \theta using maximum likelihood using data y1,…,yT, but I want to try to do it by MCMC.
After I estimated the parameters \theta (have their posterior distribution from pyMC3), I’d like to use the estimated model to do filtering – estimate the unobserved states x_{T+1}, x_{T+2}, etc, sequentially obtaining observations y_{T+1}, y_{T+2}, …
Often this is done by the sequential monte-carlo aka particle filter.

A related problem – say I don’t want to estimate the parameters \theta, I just want to fix them at some values and then I want to do filtering for hidden states x1,…,xT to see how the result looks like. Again, this is done by sequential monte-carlo.
What MCMC would give me in this case is distribution of each x_t conditional on all the data y1,…,yT.
But what filtering should give is the distribution of each x_t conditional on only the preceding data y_1,…,y_t.

Thanks for the clarification, I think what I described above should work. Essentially, the PyMC3 model you build under the pm.Model() context is a function you can evaluate, and the MCMC gives you a posterior distribution in the form of samples (i.e., the trace). So if you took the trace and updated it using new data manually using resampling, it should work.

And to fixed some parameters and perform SMC, you can set those parameters as theano.shared values, and used set_value to evaluate the function.

Finally, to estimate on the preceding data, your model should have the possibility to take new input (y_t-1 to predict y_t if it is Markov Process) and then evaluate. Again, setting some parameter as theano.shared variable should do the job. You can have a look at some sample_ppc example on linear model with does the same thing.

Would you also be able to do this sort of online updating with one of the variational methods? I saw that sample_ppc now returns a “pseudo-trace” for ADVI, and was wondering if I would be able to use VI instead of MCMC for state space models.

In principle, it should be possible. However, I am not sure it is practical in PyMC3. Currently, PyMC3 create an Approximation class using the user specified model (@ferrine correct me if I am wrong here), and you then used an optimizer to estimate the parameter (SGD for example). Sample_ppc is just to generate “pseudo-trace” so that we can use the same traceplot etc. But for a filtering problem, it is easier to take the approximation object as a function and evaluated/update the parameters. I am not sure how to do it exactly, but I am planning to write a notebook to see if my idea above would work, maybe @ferrine can chip in when we have a concrete model.

@junpenglao @ferrine Let me know if there’s any way I can contribute. I’m working right now on implementing sequential updating of the Bayesian Gaussian mixture model for scikit-learn, but it’s limited to mean-field inference and written entirely in pure Python so pretty slow. I’d love to be able to use pymc3 instead for general variational filtering instead if you guys figure it out.

For a first glance you need AEVB here with sequential model obtaining p(x_t+1|y_1:t,x_1:t). I think that is the way to go for Variational methods. @taku-y can correct me if I’m wrong

It would be easier once your implementation is ready. I am very interested in what you are doing - I used the BayesianGMM in scikit-learn before but it is not a real Bayesian Inference…

Hi, as a newcomer, I’m still a little bit confused about “So if you took the trace and updated it using new data manually using resampling, it should work.”. Could you please explain more about this?

By the way, I think my question here is equivalent to how to set a prior with a former sampling result (i.e. a trace variable). Maybe this can help clarify my confusion.:slightly_smiling_face:

I am very interested in this discussion, since SMC seems like the best way to do time-series analysis with Bayesian methods (something that no Bayesian library does well at the moment, afaik).

Have there been any updates on this? I would like to help out on this topic, since time-series fascinate me. SMC seems to have been implemented by now (as junpenglao implied), but none of the time-series examples seem to use it for some reason.

It would be great to test on time-series model. However, IIUC SMC inference on time series is usually done different than the SMC framework here, as people usually do one step SMC update. But in pymc3 the SMC is inferencing the model conditioned on all time steps being observed.

In that case, which pieces of the current implemention would be most useful to grab when fashioning it for time-series? Which pieces would you recommend me to look at in the codebase, junpenglao?

I have been thinking that you probably can run normal inference (using NUTS etc), make one step prediction, use the prediction to compute the weighting, and do resample similar to:


I’m tryng to model a particle filter using pymc3. Initially I build a model using pm.Model() command. According to my knowledge, after building the model, I need to compute the weights of posterior in such a way that those with high weights(ie, samples near the model) should be retained and others need to be discarded. Is there any way to compute the weights(probabilty) of the posterior so that resampling can be done after that.
Please excuse me if I’m wrong.

Thanks in advance.Please help me to get through this

Can we use pm.sample_smc() for doing this whole portion

Are you working with a time series? If so pm.sample_smc() is probably not what you want as that’s for static models.

Feel free to open a new thread that better describe your problem.

Thank you for fast reply.
I’m trying to design a time series prediction model.
According to my understanding,

with pm.Model() as model_particle:
x = pm.Gamma(‘x’, 1, 10) #prior
y = pm.Poisson(‘y’, x, observed=data) #likelihood
trace = pm.sample(1000) #posterior computation

this part will compute the posterior calculation. I tink I’m correct till this.
After this I need to resample the data based on weights(particle filter), the one with more weights should be retained and others need to be discared. the one with more weights should be reproduced and again sampling has to be done.
I’m actually stuck in the above explained portion.I’m confused in calculating the weights and resampling.Can you please help me.
Actually I felt like this thread is related to particle filter,So I replied in this.Sorry for again trying to clarify my doubt here.Please excuse me.

Thanks in advance