I’m new to SMC and encountering high rhats when using default settings on a particular model. What knobs would folks suggest playing with to attempt to resolve this? If using MCMC, I’d expand the warmup period and mess with inits for example, but I have no idea what options are available with SMC.
Try increasing the number of draws. This could help not only because you get more samples (as with MCMC) but also because it means more particles exploring the posterior.
Another option is to decrease correlation_threshold. Indirectly this increases the number of MCMC steps. Check the pymc.smc.sample_smc page of the PyMC 5.9.1 documentation for details.
If that does not work, you can try running sample_smc twice, using the output of the first run as the start argument of the second run. If you do this, the computation of the marginal likelihood is no longer valid.
Could you clarify on this point? Or direct me to a relevant article?
This has two parts, not sure which one you want clarification on. So let me clarify both:
Initialize SMC with SMC:
We can initialize SMC from any distribution we want. It is common to start from the prior distribution because it is easy to sample from. But any distribution will work. In fact, starting from a distribution closer to the posterior is a better idea, as we will need fewer intermediate steps to reach the posterior distribution. Usually, we don’t know how to do this. Empirically, I have observed that for some models using an SMC run to initialize another SMC run can be an effective way to get better samples. There is no systematic study showing whether or when this is beneficial. It is just an observation.
Marginal likelihood computation:
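Suppose we perform an SMC run with three stages, \beta_0=0 < \beta_1 < \beta_2=1. At each stage we approximate \int p(\theta) p(D | \theta)^{\beta_{i}} d\theta. Actually, what SMC cares about is the ratio \frac{\int p(\theta) p(D | \theta)^{\beta_{i+1}} d\theta}{\int p(\theta) p(D | \theta)^{\beta_{i}} d\theta}, because this provides us with the weights to re-weight the particles from one stage to the next. Anyway, the value of the marginal likelihood that SMC returns can be expressed as:
\frac{\int p(\theta) p(D | \theta)^{\beta_{1}} d\theta}{\int p(\theta) p(D | \theta)^{\beta_{0}} d\theta} \cdot \frac{\int p(\theta) p(D | \theta)^{\beta_{2}} d\theta}{\int p(\theta) p(D | \theta)^{\beta_{1}} d\theta}
We can simplify this expression:
\frac{\int p(\theta) p(D | \theta)^{\beta_{1}} d\theta}{1} \cdot \frac{\int p(\theta) p(D | \theta)^{\beta_{2}} d\theta}{\int p(\theta) p(D | \theta)^{\beta_{1}} d\theta}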
The 1 in the denominator comes from p(D | \theta)^{\beta_{0}}=1 for \beta_0 = 0 (a number to the power of 0 is 1) and \int p(\theta) d\theta = 1 (the prior integrates to 1).
The terms with \beta_1 cancel each other, so we are left with \int p(\theta) p(D | \theta)^{\beta_{2}} d\theta with \beta_2 = 1, which is the marginal likelihood \int p(\theta) p(D | \theta) d\theta.
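The cancellation can be checked numerically with a small sketch. Everything here is a made-up toy problem: a coin bias \theta restricted to a five-point grid, so each integral becomes a finite sum, and the integrals are computed exactly instead of being estimated from particle weights as SMC would do.

```python
import math

# Hypothetical toy problem: theta lives on a discrete grid with a uniform prior.
thetas = [0.1, 0.3, 0.5, 0.7, 0.9]
prior = [0.2] * len(thetas)  # sums to 1, like a proper prior
heads, tails = 7, 3          # made-up data D


def likelihood(theta):
    """p(D | theta) for a fixed sequence of coin flips."""
    return theta**heads * (1 - theta) ** tails


def tempered_integral(beta):
    """Discrete analogue of the integral of p(theta) * p(D | theta)^beta."""
    return sum(p * likelihood(t) ** beta for p, t in zip(prior, thetas))


# beta_0 = 0, an arbitrary intermediate beta_1, and beta_2 = 1.
betas = [0.0, 0.4, 1.0]

# One ratio per transition between consecutive stages.
ratios = [
    tempered_integral(b_next) / tempered_integral(b_prev)
    for b_prev, b_next in zip(betas, betas[1:])
]

smc_style_estimate = math.prod(ratios)
marginal_likelihood = tempered_integral(1.0)

print("integral at beta_0:", tempered_integral(0.0))
print("product of ratios: ", smc_style_estimate)
print("marginal likelihood:", marginal_likelihood)
```

The product of ratios matches the marginal likelihood up to floating-point error, whatever intermediate value is chosen for \beta_1, because the integral at \beta_0 = 0 is the prior mass, which is 1.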