SMC works by moving through successive stages. At each stage the inverse temperature \beta is increased *a little bit* (starting from 0 and ending at 1). When \beta = 0 we have the prior distribution, and when \beta = 1 we have the posterior distribution. So, in more general terms, we are always computing samples from a *tempered posterior* that we can write as:

p(\theta \mid y)_{\beta} \propto p(y \mid \theta)^{\beta} p(\theta)
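
In log space, which is how this is usually evaluated numerically, the unnormalized tempered posterior is the log-prior plus \beta times the log-likelihood. Here is a minimal sketch, assuming a toy model that is not part of the original description (standard normal prior, unit-variance normal likelihood); `tempered_logp` is a hypothetical helper name:

```python
import numpy as np

def log_prior(theta):
    # Assumed toy prior: theta ~ Normal(0, 1), up to an additive constant.
    return -0.5 * theta**2

def log_likelihood(theta, y):
    # Assumed toy likelihood: y_i ~ Normal(theta, 1), up to an additive constant.
    return -0.5 * np.sum((y - theta) ** 2)

def tempered_logp(theta, y, beta):
    # Unnormalized log tempered posterior: beta * log p(y|theta) + log p(theta).
    return beta * log_likelihood(theta, y) + log_prior(theta)

y = np.array([0.8, 1.1, 1.4])  # made-up observations
theta = 0.5

prior_only = tempered_logp(theta, y, beta=0.0)      # reduces to the log-prior
full_posterior = tempered_logp(theta, y, beta=1.0)  # log-likelihood + log-prior
```

Setting `beta=0.0` drops the likelihood entirely, and `beta=1.0` recovers the usual unnormalized log-posterior.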

A summary of the algorithm is:

- Initialize \beta at zero and *stage* at zero.
- Generate N samples S_{\beta} from the tempered posterior (because \beta = 0 this is the prior).
- Increase \beta so that the *effective sample size* equals some predefined value (we use N*t, where t is 0.5 by default).
- Compute a set of N weights W. The weights are computed according to the new tempered posterior.
- Obtain S_{w} by re-sampling according to W.
- Use W to compute the covariance for the proposal distribution.
- For stages other than 0, use the acceptance rate from the previous *stage* to estimate the *scaling* of the proposal distribution and *n_steps*.
- Run N Metropolis chains (each one of length *n_steps*), starting each one from a different sample in S_{w}.
- Repeat from the third step (increasing \beta) until \beta \ge 1.
- The final result is a collection of N samples from the posterior.
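
The steps above can be sketched in NumPy for a one-dimensional toy problem. This is an illustrative sketch under stated assumptions, not the library's implementation: the model (standard normal prior, unit-variance normal likelihood), the bisection search for the next \beta, the fixed `n_steps`, and the scaling update rule are all assumptions made for the example, and the names `ess`, `next_beta`, and `smc` are hypothetical.

```python
import numpy as np

def log_prior(theta):
    # Assumed toy prior: Normal(0, 1), vectorized over particles.
    return -0.5 * theta**2

def log_likelihood(theta, y):
    # Assumed toy likelihood: y_i ~ Normal(theta, 1), one value per particle.
    return -0.5 * np.sum((y[None, :] - theta[:, None]) ** 2, axis=1)

def ess(log_w):
    # Effective sample size of the normalized importance weights.
    w = np.exp(log_w - log_w.max())
    w /= w.sum()
    return 1.0 / np.sum(w**2)

def next_beta(beta, ll, target_ess):
    # Bisection (an assumed strategy) for the new beta whose incremental
    # weights give an ESS close to target_ess; jump to 1 if that is allowed.
    if ess((1.0 - beta) * ll) >= target_ess:
        return 1.0
    low, high = beta, 1.0
    for _ in range(50):
        mid = 0.5 * (low + high)
        if ess((mid - beta) * ll) > target_ess:
            low = mid
        else:
            high = mid
    return 0.5 * (low + high)

def smc(y, n_particles=1000, threshold=0.5, n_steps=10, seed=0):
    rng = np.random.default_rng(seed)
    theta = rng.normal(size=n_particles)  # beta = 0: samples from the prior
    beta, scaling = 0.0, 1.0
    while beta < 1.0:
        ll = log_likelihood(theta, y)
        new_beta = next_beta(beta, ll, threshold * n_particles)
        # Weights W for moving from the old to the new tempered posterior.
        log_w = (new_beta - beta) * ll
        w = np.exp(log_w - log_w.max())
        w /= w.sum()
        # Weighted variance: the 1-D analogue of the proposal covariance.
        mean = np.sum(w * theta)
        var = np.sum(w * (theta - mean) ** 2)
        # Resample according to W to obtain S_w.
        theta = theta[rng.choice(n_particles, size=n_particles, p=w)]
        beta = new_beta
        # One Metropolis chain per particle, all advanced in parallel.
        logp = beta * log_likelihood(theta, y) + log_prior(theta)
        accepted = 0
        for _ in range(n_steps):
            prop = theta + scaling * np.sqrt(var) * rng.normal(size=n_particles)
            logp_prop = beta * log_likelihood(prop, y) + log_prior(prop)
            accept = np.log(rng.uniform(size=n_particles)) < logp_prop - logp
            theta = np.where(accept, prop, theta)
            logp = np.where(accept, logp_prop, logp)
            accepted += accept.sum()
        # Assumed heuristic: nudge the scaling for the next stage toward an
        # acceptance rate of 0.234; n_steps is kept fixed for simplicity.
        acc_rate = accepted / (n_steps * n_particles)
        scaling *= np.exp(acc_rate - 0.234)
    return theta

y = np.array([0.8, 1.1, 1.4])  # made-up observations
samples = smc(y)
# For this conjugate toy model the analytic posterior is Normal(0.825, 0.5**2),
# so samples.mean() and samples.std() can be compared against those values.
```

The conjugate toy model is chosen only because its posterior is known in closed form, which makes it easy to check that the final particles look reasonable.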

Adding this description to the docstring is a good idea, thanks for the suggestion.