Filtering (e.g. particle filter, sequential MC)

kochede · June 24, 2017, 11:01am

Hello folks,

How could I do standard state-space filtering with pyMC3?

E.g. say I have a state-space model with transition equation for hidden state $x_t=g(x_{t-1}) + eta$ and observation equation $y_t=f(x_t)+eps$.

I use pyMC3 to estimate the model’s parameters using data $y_1,…,y_T$ but then how do I apply the estimated model in “online” manner?

I.e. when I get a new observation $y_{T+1}$ – how do I estimate the filtering distribution $p(x_{T+1} | y_1,…,y_{T+1})$ of the next state $x_{T+1}$? Without just rerunning the whole MCMC with an extra one datapoint added (which will be slow).

Thanks a lot in advance!

junpenglao · June 24, 2017, 12:06pm

So you would like to build a particle filter with a PyMC3 model within each step for parameter updates?

I was thinking about a similar problem awhile ago. I think you can do it this way:
1, Build you model and estimate the posterior using data $y_1,…,y_T$ (which you already did).
2, Get the trace from above, that’s all your particles (each with a set of parameters).
3, Build a new model exactly as above without the observed.
for example, if before you have:

with pm.Model() as model1:
    ....
    yobs = pm.Normal('obs', ..., observed=y_t)

the new model should be:

with pm.Model() as model2:
    #same as above
    ....
    #without setting the observed
    yobs = pm.Normal('obs', ...)

4, for any new data $y_k$, perform filtering:

for idx in range(trace_lenght):
    param = trace[idx]
    # plug in the new observation
    param['obs'] = y_k
    # evaluate the log-likelihood using the new model
    loglike[idx] = model2.logp(point=param)

# update particles using loglike, save it in new trace

@hvasbath programmed the SMC sampler in PyMC3, so it would be great for him to weight in on this as well.

kochede · June 24, 2017, 12:38pm

Thanks a lot, Junpeng for suggestion!

Let me think about your suggestion in a few hours when I’ll get back home, meanwhile, I’d like to clarify my question a bit:

My state-space model has 1) parameters theta and 2) unobserved states x_t.

I could estimate the parameters \theta using maximum likelihood using data y1,…,yT, but I want to try to do it by MCMC.
After I estimated the parameters \theta (have their posterior distribution from pyMC3), I’d like to use the estimated model to do filtering – estimate the unobserved states x_{T+1}, x_{T+2}, etc, sequentially obtaining observations y_{T+1}, y_{T+2}, …
Often this is done by the sequential monte-carlo aka particle filter.

A related problem – say I don’t want to estimate the parameters \theta, I just want to fix them at some values and then I want to do filtering for hidden states x1,…,xT to see how the result looks like. Again, this is done by sequential monte-carlo.
What MCMC would give me in this case is distribution of each x_t conditional on all the data y1,…,yT.
But what filtering should give is the distribution of each x_t conditional on only the preceding data y_1,…,y_t.

junpenglao · June 24, 2017, 3:43pm

Thanks for the clarification, I think what I described above should work. Essentially, the PyMC3 model you build under the pm.Model() context is a function you can evaluate, and the MCMC gives you a posterior distribution in the form of samples (i.e., the trace). So if you took the trace and updated it using new data manually using resampling, it should work.

And to fixed some parameters and perform SMC, you can set those parameters as theano.shared values, and used set_value to evaluate the function.

Finally, to estimate on the preceding data, your model should have the possibility to take new input (y_t-1 to predict y_t if it is Markov Process) and then evaluate. Again, setting some parameter as theano.shared variable should do the job. You can have a look at some sample_ppc example on linear model with does the same thing.

jengelman · July 8, 2017, 6:57am

Would you also be able to do this sort of online updating with one of the variational methods? I saw that sample_ppc now returns a “pseudo-trace” for ADVI, and was wondering if I would be able to use VI instead of MCMC for state space models.

junpenglao · July 11, 2017, 2:38pm

In principle, it should be possible. However, I am not sure it is practical in PyMC3. Currently, PyMC3 create an Approximation class using the user specified model (@ferrine correct me if I am wrong here), and you then used an optimizer to estimate the parameter (SGD for example). Sample_ppc is just to generate “pseudo-trace” so that we can use the same traceplot etc. But for a filtering problem, it is easier to take the approximation object as a function and evaluated/update the parameters. I am not sure how to do it exactly, but I am planning to write a notebook to see if my idea above would work, maybe @ferrine can chip in when we have a concrete model.

jengelman · July 19, 2017, 2:37am

@junpenglao @ferrine Let me know if there’s any way I can contribute. I’m working right now on implementing sequential updating of the Bayesian Gaussian mixture model for scikit-learn, but it’s limited to mean-field inference and written entirely in pure Python so pretty slow. I’d love to be able to use pymc3 instead for general variational filtering instead if you guys figure it out.

ferrine · July 19, 2017, 7:02am

For a first glance you need AEVB here with sequential model obtaining p(x_t+1|y_1:t,x_1:t). I think that is the way to go for Variational methods. @taku-y can correct me if I’m wrong

junpenglao · July 19, 2017, 7:06am

It would be easier once your implementation is ready. I am very interested in what you are doing - I used the BayesianGMM in scikit-learn before but it is not a real Bayesian Inference…

ysgclight · December 23, 2018, 4:29pm

Hi, as a newcomer, I’m still a little bit confused about “So if you took the trace and updated it using new data manually using resampling, it should work.”. Could you please explain more about this?

By the way, I think my question here is equivalent to how to set a prior with a former sampling result (i.e. a trace variable). Maybe this can help clarify my confusion.

Gon_F · May 24, 2019, 10:56pm

I am very interested in this discussion, since SMC seems like the best way to do time-series analysis with Bayesian methods (something that no Bayesian library does well at the moment, afaik).

Have there been any updates on this? I would like to help out on this topic, since time-series fascinate me. SMC seems to have been implemented by now (as junpenglao implied), but none of the time-series examples seem to use it for some reason.

junpenglao · May 25, 2019, 5:43am

It would be great to test on time-series model. However, IIUC SMC inference on time series is usually done different than the SMC framework here, as people usually do one step SMC update. But in pymc3 the SMC is inferencing the model conditioned on all time steps being observed.

Gon_F · May 25, 2019, 5:36pm

In that case, which pieces of the current implemention would be most useful to grab when fashioning it for time-series? Which pieces would you recommend me to look at in the codebase, junpenglao?

junpenglao · May 26, 2019, 6:25am

I have been thinking that you probably can run normal inference (using NUTS etc), make one step prediction, use the prediction to compute the weighting, and do resample similar to:

github.com

pymc-devs/pymc3/blob/83172819ee14b03c01b4b2264724b513cb7f0603/pymc3/step_methods/smc.py#L162-L169


# resample based on plausibility weights (selection)
resampling_indexes = np.random.choice(np.arange(draws), size=draws, p=weights)
posterior = posterior[resampling_indexes]
likelihoods = likelihoods[resampling_indexes]


# compute proposal distribution based on weights
covariance = _calc_covariance(posterior, weights)
proposal = MultivariateNormalProposal(covariance)

VIDYA · October 9, 2020, 1:48am

Hi,

I’m tryng to model a particle filter using pymc3. Initially I build a model using pm.Model() command. According to my knowledge, after building the model, I need to compute the weights of posterior in such a way that those with high weights(ie, samples near the model) should be retained and others need to be discarded. Is there any way to compute the weights(probabilty) of the posterior so that resampling can be done after that.
Please excuse me if I’m wrong.

Thanks in advance.Please help me to get through this

VIDYA · October 12, 2020, 6:49am

Can we use pm.sample_smc() for doing this whole portion

junpenglao · October 12, 2020, 7:25am

Are you working with a time series? If so pm.sample_smc() is probably not what you want as that’s for static models.

Feel free to open a new thread that better describe your problem.

VIDYA · October 12, 2020, 8:25am

Thank you for fast reply.
I’m trying to design a time series prediction model.
According to my understanding,

with pm.Model() as model_particle:
x = pm.Gamma(‘x’, 1, 10) #prior
y = pm.Poisson(‘y’, x, observed=data) #likelihood
trace = pm.sample(1000) #posterior computation

this part will compute the posterior calculation. I tink I’m correct till this.
After this I need to resample the data based on weights(particle filter), the one with more weights should be retained and others need to be discared. the one with more weights should be reproduced and again sampling has to be done.
I’m actually stuck in the above explained portion.I’m confused in calculating the weights and resampling.Can you please help me.
Actually I felt like this thread is related to particle filter,So I replied in this.Sorry for again trying to clarify my doubt here.Please excuse me.

Thanks in advance
Vidya

Topic		Replies	Views
Particle filter Prediction using pymc3 Questions	2	1922	October 27, 2020
Using MCMC based model in closed loop Questions	14	2493	July 29, 2019
Creating a Data Fusion with PyMC3 Questions	7	1248	May 4, 2021
"Local level - Nile" State Space Model (Kalman Filter) in PyMC3 Development	9	3644	April 1, 2021
State Space Models in PyMC v3	26	4479	May 18, 2022

Filtering (e.g. particle filter, sequential MC)

Related topics