I am trying to test out different backends for pymc3 to deal with an extremely large trace file that we have. Please before you ask me to shrink the model, understand that we are dealing with a model with over 2000 values for just one of the parameters, so its just going to be large. That being said none of the standard backends seem to work for pymc3==3.3
:
I was going to create an issue in Github, but I wanted to see if I have something wrong in this setup. Any help would be greatly appreciated:
17:20 $ python
Python 3.6.1 (default, Nov 8 2017, 14:29:33)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc3 as pm
>>> model = pm.Model()
>>> with model:
... a = pm.Normal('a', mu=0, sd=1)
... trace = pm.sample(1000, n_init=1000, cores=1, njobs=1)
...
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (2 chains in 1 job)
NUTS: [a]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:01<00:00, 1445.02it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:00<00:00, 2644.67it/s]
Text Backend Fails:
17:11 $ python
Python 3.6.1 (default, Nov 8 2017, 14:29:33)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc3 as pm
>>> model = pm.Model()
>>> with model:
... a = pm.Normal('a', mu=0, sd=1)
... db_text = pm.backends.Text("text-test-42")
... trace = pm.sample(1000, n_init=1000, trace=db_text, cores=1, njobs=1)
...
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (2 chains in 1 job)
NUTS: [a]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:00<00:00, 2404.39it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:00<00:00, 2485.21it/s]
Traceback (most recent call last):
File "<stdin>", line 3, in <module>
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 439, in sample
trace = _sample_many(**sample_args)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 494, in _sample_many
return MultiTrace(traces)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/backends/base.py", line 265, in __init__
raise ValueError("Chains are not unique.")
ValueError: Chains are not unique.
SQLite Fails:
17:16 $ python
Python 3.6.1 (default, Nov 8 2017, 14:29:33)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc3 as pm
>>> model = pm.Model()
>>> with model:
... a = pm.Normal('a', mu=0, sd=1)
... db_sqllite = pm.backends.SQLite("test-sqllite")
... trace = pm.sample(1000, n_init=1000, trace=db_sqllite, cores=1, njobs=1)
...
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (2 chains in 1 job)
NUTS: [a]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:00<00:00, 2328.49it/s]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:00<00:00, 2453.27it/s]
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 439, in sample
trace = _sample_many(**sample_args)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 494, in _sample_many
return MultiTrace(traces)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/backends/base.py", line 265, in __init__
raise ValueError("Chains are not unique.")
ValueError: Chains are not unique.
HDF5 Fails:
16:55 $ python
Python 3.6.1 (default, Nov 8 2017, 14:29:33)
[GCC 4.2.1 Compatible Apple LLVM 7.0.2 (clang-700.1.81)] on darwin
Type "help", "copyright", "credits" or "license" for more information.
>>> import pymc3 as pm
>>> model = pm.Model()
>>> with model:
... a = pm.Normal('a', mu=0, sd=1)
... db = pm.backends.HDF5('test-hdf5-3')
... trace = pm.sample(1000, n_init=1000, trace=db, cores=1, njobs=1)
...
Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
Sequential sampling (2 chains in 1 job)
NUTS: [a]
100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 1500/1500 [00:07<00:00, 193.66it/s]
0%| | 0/1500 [00:00<?, ?it/s]
Traceback (most recent call last):
File "<stdin>", line 4, in <module>
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 439, in sample
trace = _sample_many(**sample_args)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 482, in _sample_many
step=step, random_seed=random_seed[i], **kwargs)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 526, in _sample
for it, strace in enumerate(sampling):
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/tqdm/_tqdm.py", line 862, in __iter__
for obj in iterable:
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/sampling.py", line 614, in _iter_sample
strace.setup(draws, chain, step.stats_dtypes)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/backends/hdf5.py", line 154, in setup
self._set_sampler_vars(sampler_vars)
File "/Users/orion.delwaterman/.pyenv/versions/3.6.1/envs/hiring-horizons/lib/python3.6/site-packages/pymc3/backends/base.py", line 80, in _set_sampler_vars
raise ValueError("Can't change sampler_vars")
ValueError: Can't change sampler_vars
>>> model.unobserved_RVs
[a]