Strange problems with LKJ example

rosgori · July 22, 2019, 4:38am

You can read the LKJ example here. For a minimal example, take the following code:

import matplotlib.pyplot as plt
import numpy as np
import pymc3 as pm

SEED = 3264602 # from random.org
np.random.seed(SEED)

N = 10000

μ_actual = np.array([1, -2])
Σ_actual = np.array([[0.5, -0.3],
                     [-0.3, 1.]])

x = np.random.multivariate_normal(μ_actual, Σ_actual, size=N)

with pm.Model() as model:
    packed_L = pm.LKJCholeskyCov('packed_L', n=2,
                                 eta=2., sd_dist=pm.HalfCauchy.dist(2.5))

    L = pm.expand_packed_triangular(2, packed_L)
    Σ = pm.Deterministic('Σ', L.dot(L.T))

    μ = pm.Normal('μ', 0., 10., shape=2,
                  testval=x.mean(axis=0))
    obs = pm.MvNormal('obs', μ, chol=L, observed=x)


with model:
    trace = pm.sample(random_seed=SEED, cores=4)


print(pm.summary(trace))
pm.traceplot(trace, varnames=['μ']);
plt.show()

I cannot reproduce the results because this appears:

Sampling 4 chains:   0%|                                                | 0/4000 [00:00<?, ?draws/s]

It doesn’t sample. Why? I don’t know. With my current environment, this problem appears (python 3.6, pymc3 3.6 and theano 1.0.4). If I use a virtual environment (using conda), the problem persists (python 3.7 or python 3.6, pymc3 3.6 or pymc3 3.7, and theano 1.0.4). The most strange part is that if I use a virtual environment using virtualenv (python 3.6, pymc3 3.6 or pymc3 3.7, theano 1.0.4), it does sample! Of course, the usual warnings are shown:

WARNING (theano.configdefaults): install mkl with `conda install mkl-service`: No module named 'mkl'
WARNING (theano.tensor.blas): Using NumPy C-API based implementation for BLAS functions.

My question is simple: what is happening? Do you have the same problem if you run the code?

Let me give another example. The code is:

from scipy import stats
import numpy as np
import matplotlib.pyplot as plt
import pymc3 as pm
import theano.tensor as tt

#Parameters to set
N = 1000

mu_x = 2
variance_x = 5
mu_y = 5
variance_y = 9

mu_actual = np.array([mu_x, mu_y])
sigma_actual = np.array([[variance_x, 4], [4, variance_y]])

z = np.random.multivariate_normal(mu_actual, sigma_actual, size=N)
print(z)
print(z.shape)

with pm.Model() as model:
#    sd_dist = pm.HalfCauchy.dist(beta=2.5)

    packed_L = pm.LKJCholeskyCov('packed_L', n=2, eta=2., 
                                    sd_dist=pm.HalfCauchy.dist(beta=2.5))
    L = pm.expand_packed_triangular(2, packed_L)
    sigma_may = pm.Deterministic('sigma_may', L.dot(L.T))
    
    alpha = pm.Normal('alpha', mu=0, sd=10, shape=2, testval=z.mean(axis=0))
    
    vals = pm.MvNormal('vals', mu=alpha, chol=L, observed=z)
    
print(model.check_test_point())    
        
with model:
    trace = pm.sample()

print(pm.summary(trace))
pm.traceplot(trace, varnames=['alpha'])
plt.tight_layout()
plt.show()

With the above code, it runs with my current environment, no problem here. As you can see, it is almost the same code. Why does this work? No idea. Although I have to say that the code doesn’t run in my virtual environment (conda, python 3.7.3, pymc3 3.7 and theano 1.0.4).

There is another problem, and for that let’s take the first example. If I use scipy v. 1.2.0, it takes ten seconds:

Sampling 4 chains: 100%|████████████████████████████████████| 4000/4000 [00:12<00:00, 320.54draws/s]

If I use scipy v. 1.2.1, it takes approximately one minute:

Sampling 4 chains: 100%|█████████████████████████████████████| 4000/4000 [01:12<00:00, 55.55draws/s]

And with scipy v. 1.3.0, it takes one minute:

Sampling 4 chains: 100%|█████████████████████████████████████| 4000/4000 [01:04<00:00, 61.66draws/s]

All of this was done with virtualenv, python 3.6.6, numpy 1.16.4, pymc3 3.7 and theano 1.0.4. I don’t know if I should open an issue in Github or the behavior is normal.

aseyboldt · July 22, 2019, 9:36am

No, that is not normal. Thanks for reporting this
I’m really not sure what’s happening, both example work on my machine.
The second problem sounds as if it could be related to blas. (The second warning also hints at that). Maybe theano is falling back on a python implementation for some reason?
Some things you could try to help us debug this:

What operating system are you using?
Output of np.__config__.show() (For both the env with scipy 1.2.0 and scipy 1.3.0 or 1.2.1)
Run the first example (the one where the sampler doesn’t seem to do anything) with pm.sample(cores=1) That disables parallel sampling and might tell us if it is related to that.
Show the output of

func = model.logp_dlogp_function()
theano.printing.debugprint(func._theano_function)

for one scipy version where sampling is slow and for one where it is fast.

You can post that with your original post on github, that is a better place for bugs.

rosgori · July 22, 2019, 6:13pm

With pm.sample(cores=1), the sampler works, and that is weird.

tboutelier · October 21, 2019, 2:00pm

I have the same problem, which is solved by setting cores=1, as suggested by @rosgori. I am running on windows, maybe it is related?

Topic		Replies	Views
Results not fully reproducible Questions bug	1	722	May 10, 2022
Latest PyMC3 from git fails to sample Questions	5	657	November 13, 2017
Sampler initialization error with a model containing an LKJCholeskyCov distribution bug	5	343	July 17, 2023
Approximate Bayesian Computation example bug? Questions	7	739	September 1, 2020
Sample method of pymc3 Questions	4	669	July 15, 2018

Strange problems with LKJ example

Related Topics