Sequential Monte Carlo - need help? [SOLVED]

madanh · January 4, 2018, 9:25pm

I want to try SMC on my model, which is known to have a bimodal posterior with the given dataset. However, I’m having trouble adapting the example here to use an already written generative model. Is it even possible, or do I have to write the likelihood function from scratch?

Furthermore, I don’t quite understand the role of X - the Uniform in the example. Is this the prior?

Maybe there is another example/tutorial to look at?

Please help, TIA

junpenglao · January 4, 2018, 9:42pm

You can have a look at another example here: https://github.com/junpenglao/GLMM-in-Python/blob/master/pymc3_different_sampling.py
In short, change the transform to None in your original model, and you can usually sample with SMC without problems.

madanh · January 4, 2018, 10:32pm

Thanks, just tried this. I’m stumbling on LKJCholeskyCov not having a transform parameter and not having a random attribute. Any suggestions?

junpenglao · January 4, 2018, 10:53pm

Try LKJCorr… There is no good way around it, as the random method for LKJCholeskyCov is not implemented.

madanh · January 6, 2018, 5:57pm

Thanks @junpenglao, I got it to work. The results are disappointing, though.

junpenglao · January 6, 2018, 6:14pm

Disappointing in what sense: too slow, or poor mixing?

madanh · January 6, 2018, 7:01pm

Bad mixing. It uses Metropolis under the hood. As far as I understood from the code, the covariance matrix is estimated from the whole population at every temperature, and a global Gaussian proposal is made with that matrix. Given this, and the fact that I have two well-separated modes, bad mixing is not surprising. On a different topic, I’ve managed to get DEMetropolis going and it just stands still, with nothing being accepted. Oh well.
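To illustrate the point about the global proposal, here is a minimal sketch on a 1-D toy target (an equal mixture of two unit-variance Gaussians at ±10). It is not the PyMC3 SMC code: the target, the function names, and the absence of any tuned scaling factor are all assumptions made for illustration. It only shows that a proposal scale estimated from the whole population is dominated by the distance between the modes, so the acceptance rate drops compared with a proposal sized to a single mode.

```python
import math
import random
import statistics


def log_target(x, sep=10.0):
    """Unnormalised log-density of an equal mixture of N(-sep, 1) and N(+sep, 1)."""
    a = -0.5 * (x + sep) ** 2
    b = -0.5 * (x - sep) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def acceptance_rate(particles, scale, sep=10.0, seed=1):
    """Fraction of accepted single Metropolis steps, one per particle."""
    rng = random.Random(seed)
    accepted = 0
    for x in particles:
        y = x + rng.gauss(0.0, scale)  # symmetric Gaussian proposal
        log_ratio = log_target(y, sep) - log_target(x, sep)
        if rng.random() < math.exp(min(0.0, log_ratio)):
            accepted += 1
    return accepted / len(particles)


# Hypothetical population that is already split evenly between the two modes.
rng = random.Random(0)
particles = [rng.gauss(10.0 if i % 2 else -10.0, 1.0) for i in range(4000)]

# Scale estimated from the whole population (the "global" proposal).
global_scale = statistics.pstdev(particles)

global_rate = acceptance_rate(particles, global_scale)
local_rate = acceptance_rate(particles, 1.0)  # scale matched to one mode

print(f"global scale:    {global_scale:.2f}")
print(f"acceptance rate: global={global_rate:.2f}, local={local_rate:.2f}")
```

In one dimension the penalty is moderate; with more parameters, an oversized proposal is rejected far more often.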

For this model and dataset the best results I’ve had are from emcee (with their parallel tempering implementation), but it takes a week of sampling to get there and it also mixes poorly.

It is sort of obvious that with multimodal posteriors anything that uses a global proposal is doomed to fail, whatever tempering regime is chosen. I wish there was some sort of ensemble NUTS implementation that would consider moving a particle to the exact location of another particle, based on their odds ratio, once in a while.

Apparently there’s still a long way to the inference button.

junpenglao · January 6, 2018, 8:35pm

And it is not possible to use NUTS? What kind of model are you trying to sample?

If you run NUTS with lots of chains and different starting points, it can act as a kind of ensemble method (technically, this is not completely correct: if the modes are well separated, the weighting will not be right, as each chain just gets stuck at one mode).
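The weighting caveat can be seen in a small sketch. This uses a plain random-walk Metropolis sampler rather than NUTS, and a made-up 1-D target with 80% of the mass in the left mode; both are assumptions chosen only to keep the example self-contained. Each chain stays in the mode where it started, so pooling the chains reports whatever split the starting points imply, not the true 80/20 split.

```python
import math
import random


def log_target(x, w_left=0.8, sep=10.0):
    """Unnormalised log-density of w_left*N(-sep, 1) + (1 - w_left)*N(+sep, 1)."""
    a = math.log(w_left) - 0.5 * (x + sep) ** 2
    b = math.log(1.0 - w_left) - 0.5 * (x - sep) ** 2
    m = max(a, b)
    return m + math.log(math.exp(a - m) + math.exp(b - m))


def metropolis_chain(start, n_steps=5000, step=1.0, seed=0):
    """Random-walk Metropolis chain with a local Gaussian proposal."""
    rng = random.Random(seed)
    x, lp = start, log_target(start)
    samples = []
    for _ in range(n_steps):
        y = x + rng.gauss(0.0, step)
        lp_y = log_target(y)
        if rng.random() < math.exp(min(0.0, lp_y - lp)):
            x, lp = y, lp_y
        samples.append(x)
    return samples


# One chain started at each mode, as in the "one mode per chain" idea.
chains = [metropolis_chain(-10.0, seed=1), metropolis_chain(10.0, seed=2)]
pooled = [x for chain in chains for x in chain]

true_left_fraction = 0.8
pooled_left_fraction = sum(x < 0 for x in pooled) / len(pooled)

print(f"true mass in left mode:   {true_left_fraction:.2f}")
print(f"pooled chains, left mode: {pooled_left_fraction:.2f}")
```

Each chain still describes the shape of its own mode well; only the relative weights of the modes are lost when the chains are pooled.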

madanh · January 6, 2018, 9:19pm

It is possible. NUTS works perfectly when the data give a unimodal posterior, but I have one particularly important dataset (real clinical data) that happens to give two modes with approximately the same mass. What happens is that each chain gets poorly initialized and the sampling is inefficient even after relatively long tuning. I suppose this is because, in this multimodal situation, ADVI estimates something that is neither the first nor the second mode.

Just to clarify: this is for an academic article. It is not a major issue, as I already have the answer from emcee, but in the article that I’m writing it is sort of stupid to write that I used one sampler for this and another for that, so I want to find something robust and fast that can be cited. This is the “proverbial” 90% of the work that is going to bring about 10% of the results.

I can show you my model and you will see why NUTS is struggling (I’m really abusing it), but you have to confirm that you have spare time and interest, because explaining what is happening in the model will take quite a bit of my time, which is scarce ATM.

As for the question in the topic, I’m glad that I tried SMC and DEMetropolis, thanks to you (!), and (sort of) understood that they are a dead end.

junpenglao · January 6, 2018, 9:36pm

Sounds like you already have an answer, so I will just comment on the following point:

In this case, try the new initialization, jitter+adapt_diag. If you know where the modes are, you can supply good starting values (i.e., one mode per chain), and NUTS should explore each mode in a separate chain quite well.

madanh · January 8, 2018, 8:41am

That’s a good idea, will try, thanks.
