SMC vs sample_smc (TMCMC vs CATMIP)

mukesh · February 7, 2019, 7:24pm

In the smc.py file, I see a class named SMC and a function named sample_smc. The references cited in the SMC class and in sample_smc describe the TMCMC and CATMIP algorithms, respectively, and I realize that the two sampling techniques are slightly different. In this regard, I have the following questions:

In this notebook, https://docs.pymc.io/notebooks/SMC2_gaussians.html, which algorithm is being used? Is it CATMIP, which is referenced under sample_smc? If so, how do I solve the same problem using TMCMC?
How do I invoke a specific algorithm (TMCMC or CATMIP) if I want to sample with the algorithm of my choice?

junpenglao · February 7, 2019, 8:42pm

@aloctavodia?

aloctavodia · February 7, 2019, 9:20pm

Hi @mukesh.

Sorry for the confusion; I will make the docstring clearer.

TMCMC and CATMIP are two slightly different members of the family of Sequential Monte Carlo (SMC) algorithms. There is only one SMC method implemented in PyMC3, and it is based on those two algorithms with a few additions (I should check that they are properly cited).

To use SMC in PyMC3 you write something like pm.sample(step=pm.SMC()), as explained in the notebook you mentioned. The function sample_smc is used internally by PyMC3.

I hope this helps.

mukesh · February 7, 2019, 10:38pm

Thank you for the reply

It would be great if you could explain the method that you used (which is based on both TMCMC and CATMIP) either in the docstring or here.

aloctavodia · February 8, 2019, 12:37pm

SMC works by moving through successive stages. At each stage the inverse temperature \beta is increased a little (starting from 0 and going up to 1). When \beta = 0 we have the prior distribution, and when \beta = 1 we have the posterior distribution. So, in more general terms, we are always sampling from a tempered posterior that we can write as:

p(\theta \mid y)_{\beta} \propto p(y \mid \theta)^{\beta} p(\theta)
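As a minimal sketch of the tempered posterior, the snippet below evaluates its unnormalized log density for a toy one-dimensional model. The Gaussian prior, the Gaussian likelihood, and the names log_prior, log_likelihood and tempered_logp are assumptions made for illustration; they are not taken from PyMC3.

```python
import numpy as np

def log_prior(theta):
    # Assumed prior: standard normal
    return -0.5 * theta**2 - 0.5 * np.log(2 * np.pi)

def log_likelihood(theta, y=1.5, sigma=0.5):
    # Assumed likelihood: one Gaussian observation y with known sigma
    return -0.5 * ((y - theta) / sigma) ** 2 - np.log(sigma * np.sqrt(2 * np.pi))

def tempered_logp(theta, beta):
    # Unnormalized log density: beta * log p(y | theta) + log p(theta)
    return beta * log_likelihood(theta) + log_prior(theta)

theta = np.linspace(-3.0, 3.0, 7)
at_prior = tempered_logp(theta, beta=0.0)      # likelihood switched off
at_posterior = tempered_logp(theta, beta=1.0)  # full (unnormalized) posterior
```

With beta = 0 the likelihood term vanishes and only the prior remains; with beta = 1 the expression is the usual unnormalized log posterior.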

A summary of the algorithm is:

1. Initialize \beta at zero and stage at zero.
2. Generate N samples S_{\beta} from the tempered posterior (because \beta = 0 this is the prior).
3. Increase \beta so that the effective sample size equals some predefined value (we use N*t, where t is 0.5 by default).
4. Compute a set of N weights W. The weights are computed according to the new tempered posterior.
5. Obtain S_{w} by re-sampling according to W.
6. Use W to compute the covariance for the proposal distribution.
7. For stages other than 0, use the acceptance rate from the previous stage to estimate the scaling of the proposal distribution and n_steps.
8. Run N Metropolis chains (each one of length n_steps), starting each one from a different sample in S_{w}.
9. Repeat from step 3 until \beta \ge 1.
10. The final result is a collection of N samples from the posterior.
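The steps above can be sketched in plain NumPy as follows. This is an illustrative sketch, not the PyMC3 implementation: the toy Gaussian model, the bisection search for \beta, the tuning formulas in step 7, and all function names are assumptions.

```python
import numpy as np

def log_prior(theta):
    # Assumed prior: independent standard normal per dimension (unnormalized)
    return -0.5 * np.sum(theta**2, axis=-1)

def log_like(theta):
    # Assumed likelihood: Gaussian centred at 2 with sd 0.5 per dimension
    return -0.5 * np.sum(((theta - 2.0) / 0.5) ** 2, axis=-1)

def normalized(log_w):
    # Turn log weights into weights that sum to one, avoiding overflow
    w = np.exp(log_w - log_w.max())
    return w / w.sum()

def next_beta(beta, ll, target_ess):
    # Step 3: bisect on the new beta until the ESS of the weights hits the target
    def ess(b):
        return 1.0 / np.sum(normalized((b - beta) * ll) ** 2)
    if ess(1.0) >= target_ess:
        return 1.0
    low, high = beta, 1.0
    for _ in range(50):
        mid = 0.5 * (low + high)
        low, high = (mid, high) if ess(mid) > target_ess else (low, mid)
    return high

def smc(n=1000, dim=2, t=0.5, p_acc=0.99, max_steps=25, seed=0):
    rng = np.random.default_rng(seed)
    samples = rng.standard_normal((n, dim))            # steps 1-2: draws from the prior
    beta, stage, acc_rate = 0.0, 0, 1.0
    scale, n_steps = 1.0, max_steps
    while beta < 1.0:
        ll = log_like(samples)
        new_beta = next_beta(beta, ll, t * n)          # step 3
        w = normalized((new_beta - beta) * ll)         # step 4
        current = samples[rng.choice(n, size=n, p=w)]  # step 5 (fancy indexing copies)
        cov = np.atleast_2d(np.cov(samples, rowvar=False, aweights=w))  # step 6
        chol = np.linalg.cholesky(cov + 1e-10 * np.eye(dim))
        if stage > 0:                                  # step 7 (assumed tuning rules)
            scale = (1 / 9 + 8 / 9 * acc_rate) ** 2
            acc = np.clip(acc_rate, 1.0 / (n * n_steps), 1.0 - 1e-9)
            wanted = np.ceil(np.log(1 - p_acc) / np.log(1 - acc))
            n_steps = int(min(max_steps, max(2, wanted)))
        logp = new_beta * log_like(current) + log_prior(current)
        accepted = 0
        for _ in range(n_steps):                       # step 8: N Metropolis chains at once
            proposal = current + scale * rng.standard_normal((n, dim)) @ chol.T
            logp_new = new_beta * log_like(proposal) + log_prior(proposal)
            accept = np.log(rng.random(n)) < logp_new - logp
            current[accept] = proposal[accept]
            logp[accept] = logp_new[accept]
            accepted += int(accept.sum())
        acc_rate = accepted / (n * n_steps)
        samples, beta, stage = current, new_beta, stage + 1  # step 9
    return samples                                     # step 10

posterior = smc()
```

For this toy model the exact posterior is Gaussian with mean 1.6 and variance 0.2 in each dimension, so posterior.mean(axis=0) and posterior.std(axis=0) give a quick sanity check on the sketch.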

Adding this description to the docstring is a good idea, thanks for the suggestion.

mukesh · February 8, 2019, 4:37pm

Excellent! Thank you for describing the algorithm.

mukesh · February 8, 2019, 9:38pm

Could you elaborate on point 7 of your explanation?

For example, if the acceptance rate at stage m is x, how do you use it to scale the proposal and n_steps of stage m+1? In addition, I see that a tuning interval is set to 10, so do you rescale the proposal every tuning-interval steps within each stage?

aloctavodia · February 8, 2019, 10:45pm

It seems to me you are looking at an older version of SMC. Please see the current version.

The n_steps are computed here; the intuition is “how many steps do I need to accept, on average, x of them”. The scaling factor is computed here.
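One way to turn that intuition into numbers is sketched below. It is an assumption-laden illustration, not a copy of the linked PyMC3 code: the function name tune is hypothetical, the intuition is read as "choose n_steps so that a chain accepts at least one proposal with probability p_acc_rate", and the linear scaling rule with constants 1/9 and 8/9 is an assumed heuristic.

```python
import math

def tune(acc_rate, proposed, p_acc_rate=0.99, max_steps=25):
    """Return (scaling, n_steps) for the next stage from the last acceptance rate.

    acc_rate: fraction of proposals accepted in the previous stage.
    proposed: number of proposals made in the previous stage.
    """
    # Assumed rule: a higher acceptance rate allows larger proposal steps
    scaling = (1 / 9 + 8 / 9 * acc_rate) ** 2

    # Keep the rate strictly inside (0, 1) so both logarithms are finite
    acc = min(max(acc_rate, 1.0 / proposed), 1.0 - 1e-12)

    # Solve 1 - (1 - acc)**n >= p_acc_rate for n, then bound the result
    wanted = math.ceil(math.log(1 - p_acc_rate) / math.log(1 - acc))
    n_steps = min(max_steps, max(2, wanted))
    return scaling, n_steps

scaling, n_steps = tune(acc_rate=0.5, proposed=1000)
```

A low acceptance rate therefore shrinks the proposal and lengthens the chains (up to max_steps), while a high acceptance rate does the opposite.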

mukesh · February 12, 2019, 3:49am

Oh, thank you for pointing that out. I was looking at the old version of SMC. After updating, the code matches the algorithm you described.

I don’t see any parallelization of the SMC samples in the current version of SMC. Any thoughts on this?

aloctavodia · February 12, 2019, 10:06am

The version on master uses multiprocessing for parallelization, which takes advantage of multicore processors. I added that feature just a few days ago. I would also like to support parallelization with MPI or something similar, in order to run over a cluster.
