In August '22, I presented Chainsail, an internal project that my colleagues and I over at Tweag have been working on for quite a while. It is a cloud-based web service that implements Parallel Tempering / Replica Exchange to help with sampling multimodal probability distributions. You can learn more about Chainsail in our announcement blog post.
Back then, we invited users to participate in a beta test, but it was a closed-source project. Today we’re happy to announce that Chainsail is now open source!
The Chainsail development team encourages all kinds of contributions to the Chainsail code base and is looking forward to seeing parts of the Chainsail code reused in other projects. A blog post on the occasion of the open-source release outlines the service architecture, points to relevant parts of the source code, and proposes a couple of future extensions that we at Tweag would enjoy working on together with interested community members.
Neat idea. Do you think the architecture allows for easy message passing within chains to accommodate large models?
Swapping state elements for parallel tempering makes sense, but for models with okay mixing but very, very large datasets & parameter sets, it might be nice to give each worker a subset of the data and then use message passing for the Gibbs updates across workers.
@ckrapu I’m not sure I fully understand what you would like to do. Off the top of my head, I’d say that the existing architecture (and here I think this really only concerns the MPI Replica Exchange implementation) could be adapted fairly easily into a kind of distributed Gibbs sampler. MPI is only used to pass around the states and the log-probabilities. For a distributed Gibbs sampler, you would just need to modify that part to pass the state in one direction only (from, say, replica A to B, rather than both A→B and B→A as in Replica Exchange) and get rid of the log-probability passing.
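For context, here's a minimal sketch of the swap step being discussed. This is not Chainsail's actual code, just the standard Replica Exchange acceptance criterion: two replicas at inverse temperatures β_i and β_j exchange states, and the only quantities that need to be communicated are the states and their log-probabilities (the function and variable names below are made up for illustration):

```python
import math
import random

def swap_acceptance(beta_i, beta_j, logp_i, logp_j):
    """Probability of accepting a state swap between two replicas.

    Replica i targets p(x)^beta_i, replica j targets p(x)^beta_j.
    The Metropolis ratio for exchanging states x_i and x_j is
    exp((beta_i - beta_j) * (log p(x_j) - log p(x_i))).
    """
    return min(1.0, math.exp((beta_i - beta_j) * (logp_j - logp_i)))

def maybe_swap(beta_i, beta_j, state_i, state_j, logp_i, logp_j, rng=random):
    """Attempt a swap; return the (possibly exchanged) states."""
    if rng.random() < swap_acceptance(beta_i, beta_j, logp_i, logp_j):
        return state_j, state_i  # swap accepted
    return state_i, state_j      # swap rejected
```

In the MPI implementation this acceptance test is what the exchanged log-probabilities feed into; for the distributed Gibbs variant sketched above, you would drop the acceptance step entirely and only forward the updated state components in one direction.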
But given the sequential nature of Gibbs sampling, wouldn’t the sampler for one variable always have to wait for a new value of a different variable to arrive? It seems like this wouldn’t allow for actual parallel sampling then.
Also, how does splitting up the data lead to an algorithm that yields samples from the desired distribution (of all parameters given the full data)?
I’d love to understand this better - I’d also be happy to hop on a call to discuss, so that I can give a more useful answer here.