If I use
trace_beta1 = pm.sample(1000, step=step, start=start, njobs=10, tune=1000)
ten chains are indeed produced. However, if I use
trace_beta1 = pm.sample(1000, start=start, chains=10, tune=1000)
only one chain appears to be produced, e.g. afterwards trying
pm.gelman_rubin(chain_beta1)
gives ValueError: Gelman-Rubin diagnostic requires multiple chains of the same length, which implies only a single chain.
It would appear logical that chains should specify the number of chains to produce, and njobs should specify how many should be produced at one time.
What does chains actually do, and why do I seemingly have to use njobs to produce ‘parallel chains’ (is this the right term?)?
Hmm that really shouldn’t be the case, and your understanding is correct. What is your PyMC3 version?
Version 3.2. Here’s the code:
import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm
from scipy import stats
import seaborn as sns

# `data` (the observed array) is not defined in this snippet
with pm.Model() as our_first_model:
    theta = pm.Beta('theta', alpha=1, beta=1)
    y = pm.Bernoulli('y', p=theta, observed=data)
    start = pm.find_MAP()
    step = pm.Metropolis()
    trace = pm.sample(1000, step=step, start=start, chains=10)

# Discard the first 100 draws, then run the convergence diagnostic
burnin = 100
chain = trace[burnin:]
pm.gelman_rubin(chain)
Output:
---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-10-3e954059917e> in <module>()
----> 1 pm.gelman_rubin(chain)
C:\ProgramData\Anaconda3\lib\site-packages\pymc3\diagnostics.py in gelman_rubin(mtrace, varnames, include_transformed)
144 if mtrace.nchains < 2:
145 raise ValueError(
--> 146 'Gelman-Rubin diagnostic requires multiple chains '
147 'of the same length.')
148
ValueError: Gelman-Rubin diagnostic requires multiple chains of the same length.
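For context on why the diagnostic refuses a single chain: Gelman-Rubin compares the variance between chains with the variance within chains, so with one chain there is nothing to compare. The following is a minimal pure-Python sketch of the textbook statistic for a single scalar parameter. The function name and the list-of-lists input are assumptions made for illustration; this is not PyMC3's own code.

```python
from statistics import mean, variance


def gelman_rubin(chains):
    """Potential scale reduction factor for one scalar parameter.

    `chains` is a list of equal-length lists of samples, one list per chain.
    (Illustrative helper, not part of PyMC3.)
    """
    m = len(chains)
    if m < 2:
        raise ValueError("Gelman-Rubin needs at least two chains")
    n = len(chains[0])
    if any(len(c) != n for c in chains):
        raise ValueError("all chains must have the same length")

    # W: average of the within-chain sample variances
    w = mean([variance(c) for c in chains])
    # B / n: sample variance of the per-chain means
    b_over_n = variance([mean(c) for c in chains])
    # Pooled estimate of the posterior variance
    var_hat = (n - 1) / n * w + b_over_n
    return (var_hat / w) ** 0.5


# Two chains stuck in different regions give a value far above 1
print(gelman_rubin([[0.0, 1.0, 0.0, 1.0], [10.0, 11.0, 10.0, 11.0]]))
```

Values near 1 suggest the chains agree with each other; values well above 1 suggest they have not converged to the same distribution.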
I see, yeah, this was changed recently. Could you please update to master?
pip install git+https://github.com/pymc-devs/pymc3
Edit: Success! Though I had to use pip install --upgrade git+https://github.com/pymc-devs/pymc3
More odd behaviour. Running the Rugby model at http://docs.pymc.io/notebooks/rugby_analytics.html, I end up with four chains:
In [9]: trace
Out[9]: <MultiTrace: 4 chains, 1000 iterations, 10 variables>
I have updated from git once again (today) and also found that tune=2000 had to be used, as with tune=1000 I was getting the following:
D:\Continuum\Anaconda3\lib\site-packages\pymc3\step_methods\hmc\nuts.py:452:
UserWarning: The acceptance probability in chain 3 does not match the target.
It is 0.0251218722568, but should be close to 0.8. Try to increase the number of tuning steps.
Edit: using trace = pm.sample(1000, tune=2000, chains=1) results in a single chain, so it looks like something has set the default number of chains to four.
Yes, you are right: the default number of chains is now 4 (running multiple chains is important for model diagnostics).
I’ve submitted a PR to improve the docstring for chains. If chains is not set, it will select the higher of njobs or 2. Most of the time you will want to sample in parallel to accommodate the Gelman-Rubin diagnostic calculation. So, when you set njobs to 1 there will still be 2 chains sampled; it will just occur in serial (unless you set chains to 1 as well).
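The rule described here can be written down in a few lines. This is only a sketch of the behaviour as stated in this post; resolve_chains is a hypothetical helper name, not a PyMC3 function, and the real logic lives inside pm.sample.

```python
def resolve_chains(chains=None, njobs=1):
    """Return the number of chains that would be sampled.

    Hypothetical helper mirroring the rule in the post: an explicit
    `chains` wins; otherwise use the higher of `njobs` or 2.
    """
    if chains is None:
        return max(njobs, 2)
    return chains


print(resolve_chains(njobs=1))            # chains left unset, one job
print(resolve_chains(njobs=4))            # chains left unset, four jobs
print(resolve_chains(chains=1, njobs=1))  # explicit single chain
```

Under this rule, njobs only controls how many chains run at the same time, while chains controls how many are produced in total.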
Note, however, when you ask for 1000 samples (by setting iterations=1000), you will get 1000 samples; they will just be broken out over however many chains are specified.