Error when running GLM-logistic.ipynb: Bad initial energy: inf

I’m trying to run the GLM logistic demo notebook with pymc3 version 3.5 on Ubuntu 18.04, pymc3 installed with conda, and get this error in cell [12]:

with pm.Model() as logistic_model:
    pm.glm.GLM.from_formula('income ~ age + age2 + educ + hours', data, family=pm.glm.families.Binomial())
    trace_logistic_model = pm.sample(2000, chains=1, tune=1000)

Auto-assigning NUTS sampler...
Initializing NUTS using jitter+adapt_diag...
WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.
Sequential sampling (1 chains in 1 job)
NUTS: [hours, educ, age2, age, Intercept]
  0%|          | 0/3000 [00:00<?, ?it/s]
...
ValueError: Bad initial energy: inf. The model might be misspecified.

How can I get the demo to run?

Try trace_logistic_model = pm.sample(2000, chains=4, tune=1000, init='adapt_diag'). The default initialization (jitter+adapt_diag, as shown in your log) adds random jitter to the starting point, which is not robust in some cases.
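
To illustrate one plausible way a jittered start can produce an infinite energy (this mechanism is an assumption, not something confirmed in this thread), here is a small NumPy-only sketch. It does not use pymc3; the function bernoulli_loglik and the toy data are hypothetical. With an unscaled predictor such as age squared, even a coefficient of 0.5 pushes the sigmoid to exactly 1.0 in floating point, so the log-likelihood of any observation with outcome 0 becomes -inf.

```python
import numpy as np

def bernoulli_loglik(beta, x, y):
    """Log-likelihood of a one-coefficient logistic model, computed naively."""
    eta = beta * x                      # linear predictor
    p = 1.0 / (1.0 + np.exp(-eta))      # sigmoid; rounds to exactly 1.0 for large eta
    with np.errstate(divide="ignore"):  # silence the log(0) warning
        per_obs = np.where(y == 1, np.log(p), np.log(1.0 - p))
    return per_obs.sum()

# Toy data (made up for illustration): squared ages and a binary outcome.
age2 = np.array([400.0, 900.0, 2500.0])
income = np.array([0, 1, 0])

loglik_jittered = bernoulli_loglik(0.5, age2, income)  # start moved away from 0
loglik_zero = bernoulli_loglik(0.0, age2, income)      # start at exactly 0
print(loglik_jittered, loglik_zero)
```

The first value is -inf and the second is finite, which is consistent with a start at zero (no jitter) being safer for unscaled predictors.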

Thank you! This works, but sampling is rather slow at about 3 draws/s. That is with the GPU; on CPU it is more like 1-2 s/draw (and I’m still getting the BLAS warning). Am I missing any settings? Metropolis was sampling at 15-20 draws/s. I’m still running the logistic regression, with some 30K observations. I’ve added the output of np.show_config() below.

Auto-assigning NUTS sampler... 
Initializing NUTS using adapt_diag... 
Sequential sampling (2 chains in 1 job) 
NUTS: [educ, hours, I(age ** 2), age, sex[T. Male], Intercept] 
11%|█ | 1173/11000 [07:18<46:54, 3.49it/s]
blas_mkl_info:
  NOT AVAILABLE
blis_info:
  NOT AVAILABLE
openblas_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
blas_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_mkl_info:
  NOT AVAILABLE
openblas_lapack_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]
lapack_opt_info:
    libraries = ['openblas', 'openblas']
    library_dirs = ['/usr/local/lib']
    language = c
    define_macros = [('HAVE_CBLAS', None)]

OK, so standardizing the variables and using only 2 cores on the CPU (a 4-core machine with 8 threads) gets sampling to 50 draws/s. Do you happen to know why selecting a single core still uses 50% of the 4 cores? Maybe because the default is 4 threads?
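
For anyone who finds this later, a minimal sketch of what I mean by standardizing: each predictor is rescaled to zero mean and unit standard deviation before it goes into the model. The helper name standardize and the sample values are just for illustration, not taken from the notebook.

```python
import numpy as np

def standardize(values):
    """Rescale a 1-D array to zero mean and unit (population) standard deviation."""
    values = np.asarray(values, dtype=float)
    return (values - values.mean()) / values.std()

# Made-up ages for illustration; in the notebook these would be data columns.
age = np.array([25.0, 38.0, 52.0, 61.0])

age_std = standardize(age)
age2_std = standardize(age ** 2)  # square first, then standardize the result
print(age_std.mean(), age_std.std())
```

The coefficients are then on the scale of "per standard deviation", so they need to be converted back if you want them in the original units.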

Not something I am familiar with… sorry

What kind of BLAS warning are you getting?

It’s in the first post:

WARNING (theano.tensor.blas): We did not find a dynamic library in the library_dir of the library we use for blas. If you use ATLAS, make sure to compile it with dynamics library.

FYI - there are environment variables for openblas/MKL to adjust the number of threads (a bit further down this discussion)
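
A sketch of how this might look from Python. The post above does not name the variables; OMP_NUM_THREADS, OPENBLAS_NUM_THREADS and MKL_NUM_THREADS are the ones commonly read by OpenMP, OpenBLAS and MKL, and the value "1" is only an example. They generally need to be set before numpy or theano is imported, because the thread pool is configured when the library loads.

```python
import os

# Assumption: one BLAS thread per process; pick a value that suits your machine.
n_threads = "1"

# Set these before importing numpy, theano, or pymc3.
for var in ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS"):
    os.environ[var] = n_threads

print({var: os.environ[var] for var in
       ("OMP_NUM_THREADS", "OPENBLAS_NUM_THREADS", "MKL_NUM_THREADS")})
```

The same variables can also be exported in the shell before starting Jupyter, which avoids any import-order problems.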