If I use
trace_beta1 = pm.sample(1000, step=step, start=start, njobs=10, tune=1000)
ten chains are indeed produced. However, if I use
trace_beta1 = pm.sample(1000, start=start, chains=10, tune=1000)
only one chain appears to be produced. For example, calling pm.gelman_rubin on the trace afterwards fails with
ValueError: Gelman-Rubin diagnostic requires multiple chains of the same length.
which implies there is only a single chain.
It would seem logical that chains should specify the number of chains to produce, and njobs should specify how many are produced at one time. So what does chains actually do, and why do I seemingly have to use njobs to produce ‘parallel chains’ (is this the right term?)?
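To make the reading above concrete, here is a toy sketch of "chains = how many chains, njobs = how many run at once". It does not use PyMC3 at all: run_toy_chain and sample_toy are hypothetical names, the "chain" is just a seeded Gaussian random walk, and a thread pool stands in for whatever PyMC3 does internally.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def run_toy_chain(seed, draws=1000):
    """Toy stand-in for one MCMC chain: a seeded Gaussian random walk."""
    rng = random.Random(seed)
    position = 0.0
    samples = []
    for _ in range(draws):
        position += rng.gauss(0.0, 1.0)
        samples.append(position)
    return samples


def sample_toy(draws, chains, njobs):
    """Run `chains` independent chains, at most `njobs` of them at the same time."""
    with ThreadPoolExecutor(max_workers=njobs) as pool:
        # One task per chain; each chain gets its own seed.
        return list(pool.map(lambda seed: run_toy_chain(seed, draws), range(chains)))


toy_trace = sample_toy(draws=1000, chains=10, njobs=2)
print(len(toy_trace), len(toy_trace[0]))  # prints: 10 1000
```

With njobs=1 the ten chains would still all be produced, just one after another.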
Hmm that really shouldn’t be the case, and your understanding is correct. What is your PyMC3 version?
Version 3.2. Here’s the code:
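import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm  # missing from the original snippet; `pm` is used below
from scipy import stats
import seaborn as sns

# `data` (the observed Bernoulli outcomes) is defined elsewhere and not shown here.
with pm.Model() as our_first_model:
    theta = pm.Beta('theta', alpha=1, beta=1)
    y = pm.Bernoulli('y', p=theta, observed=data)
    start = pm.find_MAP()
    step = pm.Metropolis()
    trace = pm.sample(1000, step=step, start=start, chains=10)

# Discard the first 100 draws of each chain as burn-in.
burnin = 100
chain = trace[burnin:]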
ValueError                                Traceback (most recent call last)
<ipython-input-10-3e954059917e> in <module>()
----> 1 pm.gelman_rubin(chain)

C:\ProgramData\Anaconda3\lib\site-packages\pymc3\diagnostics.py in gelman_rubin(mtrace,
    144     if mtrace.nchains < 2:
    145         raise ValueError(
--> 146             'Gelman-Rubin diagnostic requires multiple chains '
    147             'of the same length.')

ValueError: Gelman-Rubin diagnostic requires multiple chains of the same length.
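For anyone wondering why the diagnostic insists on at least two chains: it compares the variance between chains with the variance within each chain, and the between-chain term is undefined for a single chain. Below is a minimal pure-Python sketch of the textbook formula for one scalar parameter. The function name gelman_rubin_sketch is made up for illustration, and this is not claimed to be PyMC3's exact implementation.

```python
import math
from statistics import mean, variance


def gelman_rubin_sketch(chains):
    """Textbook R-hat for one scalar parameter.

    `chains` is a list of equal-length lists of draws (each with >= 2 draws
    and non-zero within-chain variance).
    """
    if len(chains) < 2 or len({len(c) for c in chains}) != 1:
        raise ValueError("need at least two chains of the same length")
    n = len(chains[0])
    chain_means = [mean(c) for c in chains]
    # B: between-chain variance, which needs at least two chain means.
    between = n * variance(chain_means)
    # W: average of the within-chain sample variances.
    within = mean([variance(c) for c in chains])
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


# Two chains sitting in very different places give a value far above 1.
print(gelman_rubin_sketch([[0.0, 1.0, 2.0], [10.0, 11.0, 12.0]]))
```

Values close to 1 suggest the chains agree; passing a single chain raises the same kind of ValueError as in the traceback above.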
I see. Yes, this was changed recently. Could you please update to master?
pip install git+https://github.com/pymc-devs/pymc3
Edit: Success! Though I had to use
pip install --upgrade git+https://github.com/pymc-devs/pymc3
More odd behaviour. Running the Rugby model at http://docs.pymc.io/notebooks/rugby_analytics.html, I end up with four chains:
In : trace
Out: <MultiTrace: 4 chains, 1000 iterations, 10 variables>
I have updated using git once again (today) and also found that tune=2000 had to be used, as with tune=1000 I was getting the following:
UserWarning: The acceptance probability in chain 3 does not match the target.
It is 0.0251218722568, but should be close to 0.8. Try to increase the number of tuning steps.
trace = pm.sample(1000, tune=2000, chains=1) results in a single chain, so it looks like something has set the default number of chains to four.
Yes, you are right: the default number of chains is now 4 (running multiple chains is important for model diagnostics).
I’ve submitted a PR to improve the docstring for chains. When chains is left unset, it will select the higher of njobs or 2. Most of the time you will want to sample in parallel to accommodate the Gelman-Rubin diagnostic calculation. So, when you set njobs to 1, there will still be 2 chains sampled; it will just occur in serial (unless you set chains to 1 as well).
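The rule described in this reply can be written down in a few lines. This is only a sketch of the rule as stated here, not code taken from PyMC3; resolve_chains is a hypothetical helper name, and the actual default may differ between versions.

```python
def resolve_chains(chains=None, njobs=1):
    """Hypothetical helper mirroring the rule described in this reply."""
    if chains is None:
        # No explicit request: use the higher of njobs or 2.
        return max(njobs, 2)
    # An explicit value, including 1, is used as given.
    return chains


print(resolve_chains(njobs=1))             # prints: 2
print(resolve_chains(njobs=10))            # prints: 10
print(resolve_chains(chains=1, njobs=1))   # prints: 1
```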
Note, however, when you ask for 1000 samples (by setting iterations=1000), you will get 1000 samples; they will just be broken out over however many chains are specified.