Undefined term "chain"

(This is a very small and subtle suggestion.)

In the dev documention, in the “Introductory Overview of PyMC”,
the term “chain” is introduced without being defined.

It appears twice in the introductory text (the Abstract and Introduction) sections, but that is in the general sense of Markov chains. I am referring to the technical or API sense in which it appears in subsequent text, for example, in the output examples
Sequential sampling (2 chains in 1 job)
Sampling 2 chains for 1_000 tune and 1_000 draw iterations (2_000 + 2_000 draws total) took 5 seconds.
and in the text, “Notice that sample generated a set of parallel chains, depending on how many compute cores are on your machine.”

In those places, one gets the impression that a chain is a software entity, specific to PyMC, i.e., a data type or structure (probably representing a realization of a Markov process etc.). It would be helpful if the text clarified that, the first time the term appears in that sense. (Is it, in fact, a PyMC data type, for example? I am sure I will learn as I go on, but I’m trying to provide “first impression” feedback on any rough edges I find in the docs before, before I become inured to them : )

I hope you find this feedback helpful. Thank you for considering it.

1 Like

Both things are using “chain” as the same concept. The sampler or step function defines how to iterate and get from one draw/sample/iteration of the Markov Chain to the next one. From MCMC theory we know that as the number of samples tends to infinite the target distribution will be recovered.

However, in practice we can never have infinite iterations so we never get any guarantee that the samples we have are from the target distribution. What we can do however is identify many situations in which we can be sure they are not. We call this that sampling has not converged. These “convergence diagnostics” generally rely on not generating only a single MCMC chain (a bit redundant but I hope clear) but on generating multiple independent chains. If the samples from each chain don’t represent the same distribution then we know that sampling has not converged (if they do then we continue to not have any guarantee as we still don’t have infinite samples, but it is already very useful to have these diagnostics)

I hope that clarified things a bit, let me know if it doesn’t and if you’d be interested in doing some improvements to the notebook yourself.