Running tests locally


#1

I’m trying to run the tests locally to check whether there are issues with a pull request of mine. When I do, I get two collection errors:

___________________________________________________ ERROR collecting pymc3/tests/test_dist_math.py ___________________________________________________
ImportError while importing test module '/Users/rpg/src/pymc3/pymc3/tests/test_dist_math.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test_dist_math.py:5: in <module>
    import theano.tests.unittest_tools as utt
/usr/local/lib/python3.7/site-packages/theano/tests/unittest_tools.py:7: in <module>
    from parameterized import parameterized
E   ModuleNotFoundError: No module named 'parameterized'
_____________________________________________________ ERROR collecting pymc3/tests/test_math.py ______________________________________________________
ImportError while importing test module '/Users/rpg/src/pymc3/pymc3/tests/test_math.py'.
Hint: make sure your test modules/packages have valid Python names.
Traceback:
test_math.py:5: in <module>
    from theano.tests import unittest_tools as utt
/usr/local/lib/python3.7/site-packages/theano/tests/unittest_tools.py:7: in <module>
    from parameterized import parameterized
E   ModuleNotFoundError: No module named 'parameterized'
====================================================================== FAILURES ======================================================================
___________________________________________________ TestGelmanRubin.test_right_shape_python_float ____________________________________________________

These are in parts of the codebase that I have not modified, so I’m puzzled about them. Is this by any chance a sign that I have misconfigured my development directory?


#2

Hi!

There are some details about running tests for PRs on the CONTRIBUTING.md page on GitHub.

It looks like you just need to pip install parameterized, though!


#3

Oh, thanks! I guess I should have seen that, but it looked like something deep in theano, so I figured it was just something bad going on inside my theano installation. Looks like it needs both parameterized and nose to test correctly.
I’m still getting some failures; I will report back when I have a full run.
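
For anyone hitting the same collection errors, here is a minimal sketch for checking up front which test-time modules are missing. The module list is an assumption taken only from this thread (parameterized and nose), not an official list; CONTRIBUTING.md remains the authority.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Assumed list, based only on the modules mentioned in this thread.
TEST_DEPENDENCIES = ["parameterized", "nose"]

print("Missing:", missing_modules(TEST_DEPENDENCIES) or "none")
```

Anything it reports can then be installed with pip before running pytest again.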


#4

Yeah, there’s some theano testing utility that uses nose, which is unfortunate because pytest now has pleasant handling for parameterized tests.

The full test suite takes a while to run. I will normally use

python -m pytest -xv --cov=pymc3 --cov-report=html pymc3/
  • The python -m makes sure I am using the pytest from my current virtual environment
  • -x exits on the first test failure
  • -v is verbose, and prints the name of every test
  • --cov=pymc3 says which package to measure coverage for (here, all of pymc3)
  • --cov-report=html says to create an HTML report (open htmlcov/index.html after running this for a handy website!)
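
The same invocation can be wrapped in a small script, sketched below. The function names are hypothetical, and the coverage flags only work when the pytest-cov plugin is installed.

```python
import subprocess
import sys


def build_pytest_command(target="pymc3/", cov_package="pymc3"):
    """Build the argument list for the pytest invocation described above."""
    return [
        sys.executable, "-m", "pytest",  # pytest from the running interpreter
        "-x",                            # stop at the first failure
        "-v",                            # print the name of every test
        f"--cov={cov_package}",          # measure coverage for this package
        "--cov-report=html",             # write the report to htmlcov/
        target,
    ]


def run_tests(target="pymc3/"):
    """Run the tests and return pytest's exit code (0 means all passed)."""
    return subprocess.call(build_pytest_command(target))
```

Passing a single file, for example run_tests("pymc3/tests/test_diagnostics.py"), keeps the turnaround short while working on one area.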

#5

Thanks for that advice. The last time I ran, I got 91 test failures, which wasn’t what I expected, since I haven’t interfered with any actual code!
I’ll collect a report and post it.


#6

I got 91 failures running as follows:

 pytest -ra --pastebin=all -v pymc3/

and I have a report available here: https://bpaste.net/show/6b5596fa2773.
TL;DR:

pymc3/tests/test_diagnostics.py::TestGelmanRubin::test_right_shape_python_float FAILED                                                   [  0%]
pymc3/tests/test_diagnostics.py::TestGelmanRubin::test_right_shape_scalar_tuple FAILED                                                   [  0%]
pymc3/tests/test_diagnostics.py::TestGelmanRubin::test_right_shape_tensor FAILED                                                         [  0%]
pymc3/tests/test_diagnostics.py::TestGelmanRubin::test_right_shape_scalar_array FAILED                                                   [  0%]
pymc3/tests/test_diagnostics.py::TestGelmanRubin::test_right_shape_scalar_one FAILED                                                     [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_geweke_negative PASSED                                                            [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_geweke_positive PASSED                                                            [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n FAILED                                                                [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n_right_shape_python_float FAILED                                       [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n_right_shape_scalar_tuple FAILED                                       [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n_right_shape_tensor FAILED                                             [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n_right_shape_scalar_array FAILED                                       [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n_right_shape_scalar_one FAILED 
pymc3/tests/test_distributions_timeseries.py::test_GARCH11 FAILED                                                                        [ 38%]
pymc3/tests/test_minibatches.py::TestGenerator::test_gen_cloning_with_shape_change FAILED                                                [ 50%]
pymc3/tests/test_model_func.py::test_dlogp2 FAILED                                                                                       [ 52%]
pymc3/tests/test_model_graph.py::TestSimpleModel::test_graphviz FAILED                                                                   [ 52%]
pymc3/tests/test_plots.py::test_plots FAILED                                                                                             [ 61%]
pymc3/tests/test_plots.py::test_plots_categorical FAILED                                                                                 [ 61%]
pymc3/tests/test_plots.py::test_plots_multidimensional FAILED 
pymc3/tests/test_sampling.py::TestSample::test_sample_init FAILED                                                                        [ 65%]
pymc3/tests/test_sampling.py::test_exec_nuts_init[map] FAILED                                                                            [ 66%]
pymc3/tests/test_sgfs.py::test_minibatch FAILED
pymc3/tests/test_step.py::TestStepMethods::test_sample_exact FAILED                                                                      [ 73%]
pymc3/tests/test_step.py::TestNutsCheckTrace::test_bad_init FAILED                                                                       [ 74%]
pymc3/tests/test_transforms.py::test_simplex_bounds FAILED                                                                               [ 79%]
pymc3/tests/test_transforms.py::test_sum_to_1 FAILED                                                                                     [ 79%]
pymc3/tests/test_transforms.py::test_log FAILED                                                                                          [ 79%]
pymc3/tests/test_transforms.py::test_log_exp_m1 FAILED                                                                                   [ 79%]
pymc3/tests/test_transforms.py::test_logodds FAILED                                                                                      [ 79%]
pymc3/tests/test_transforms.py::test_lowerbound FAILED                                                                                   [ 79%]
pymc3/tests/test_transforms.py::test_upperbound FAILED                                                                                   [ 80%]
pymc3/tests/test_transforms.py::test_interval FAILED                                                                                     [ 80%]
pymc3/tests/test_transforms.py::test_circular FAILED                                                                                     [ 80%]
pymc3/tests/test_transforms.py::test_ordered FAILED                                                                                      [ 80%]
pymc3/tests/test_transforms.py::test_chain FAILED         
pymc3/tests/test_tuning.py::test_guess_scaling FAILED                                                                                    [ 81%]
pymc3/tests/test_variational_inference.py::test_sample_aevb[FullRankGroup : {}] FAILED                                                   [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[MeanFieldGroup : {}] FAILED                             [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[FullRankGroup : {}] FAILED                              [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'scale'}] FAILED        [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'loc'}] FAILED          [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'hh'}] FAILED           [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'planar'}] FAILED       [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'radial'}] FAILED       [ 92%]
pymc3/tests/test_variational_inference.py::test_replacements_in_sample_node_aevb[NormalizingFlowGroup : {'flow': 'radial-loc'}] FAILED   [ 93%]
pymc3/tests/test_variational_inference.py::test_elbo FAILED                                                                              [ 95%]
pymc3/tests/test_variational_inference.py::test_scale_cost_to_minibatch_works[2] FAILED                                                  [ 95%]
pymc3/tests/test_variational_inference.py::test_scale_cost_to_minibatch_works[5] FAILED                                                  [ 95%]
pymc3/tests/test_variational_inference.py::test_scale_cost_to_minibatch_works[8] FAILED                                                  [ 96%]
pymc3/tests/test_variational_inference.py::test_elbo_beta_kl[2] FAILED                                                                   [ 96%]
pymc3/tests/test_variational_inference.py::test_elbo_beta_kl[5] FAILED                                                                   [ 96%]
pymc3/tests/test_variational_inference.py::test_elbo_beta_kl[8] FAILED                                                                   [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[NFVI=scale-loc-mini] PASSED                                                       [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[NFVI=scale-loc-mini] PASSED                                                      [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[NFVI=scale-loc-full] PASSED                                                       [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[NFVI=scale-loc-full] PASSED                                                      [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[ADVI-full] PASSED                                                                 [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[ADVI-full] PASSED                                                                [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[ADVI-mini] PASSED                                                                 [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[ADVI-mini] PASSED                                                                [ 96%]
pymc3/tests/test_variational_inference.py::test_aevb[ADVI] FAILED                                                                        [ 96%]
pymc3/tests/test_variational_inference.py::test_replacements[ADVI] PASSED                                                                [ 96%]
pymc3/tests/test_variational_inference.py::test_sample_replacements[ADVI] FAILED                                                         [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[FullRankADVI-full] PASSED                                                         [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[FullRankADVI-full] PASSED                                                        [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[FullRankADVI-mini] PASSED                                                         [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[FullRankADVI-mini] PASSED                                                        [ 96%]
pymc3/tests/test_variational_inference.py::test_aevb[FullRankADVI] FAILED                                                                [ 96%]
pymc3/tests/test_variational_inference.py::test_replacements[FullRankADVI] PASSED                                                        [ 96%]
pymc3/tests/test_variational_inference.py::test_sample_replacements[FullRankADVI] FAILED                                                 [ 96%]
pymc3/tests/test_variational_inference.py::test_fit_oo[SVGD-full] FAILED                                                                 [ 96%]
pymc3/tests/test_variational_inference.py::test_profile[SVGD-full] FAILED                                                                [ 97%]
pymc3/tests/test_variational_inference.py::test_fit_oo[SVGD-mini] FAILED                                                                 [ 97%]
pymc3/tests/test_variational_inference.py::test_profile[SVGD-mini] FAILED                                                                [ 97%]
pymc3/tests/test_variational_inference.py::test_aevb[SVGD] SKIPPED                                                                       [ 97%]
pymc3/tests/test_variational_inference.py::test_replacements[SVGD] PASSED                                                                [ 97%]
pymc3/tests/test_variational_inference.py::test_sample_replacements[SVGD] FAILED                                                         [ 97%]
pymc3/tests/test_variational_inference.py::test_fit_oo[ASVGD-full] FAILED                                                                [ 97%]
pymc3/tests/test_variational_inference.py::test_profile[ASVGD-full] FAILED                                                               [ 97%]
pymc3/tests/test_variational_inference.py::test_fit_oo[ASVGD-mini] FAILED                                                                [ 97%]
pymc3/tests/test_variational_inference.py::test_profile[ASVGD-mini] FAILED                                                               [ 97%]
pymc3/tests/test_variational_inference.py::test_aevb[ASVGD] FAILED                                                                       [ 97%]
pymc3/tests/test_variational_inference.py::test_replacements[ASVGD] PASSED                                                               [ 97%]
pymc3/tests/test_variational_inference.py::test_sample_replacements[ASVGD] FAILED                                                        [ 97%]
pymc3/tests/test_variational_inference.py::test_aevb[NFVI=scale-loc] FAILED                                                              [ 97%]
pymc3/tests/test_variational_inference.py::test_replacements[NFVI=scale-loc] PASSED                                                      [ 97%]
pymc3/tests/test_variational_inference.py::test_sample_replacements[NFVI=scale-loc] FAILED   

There were a boatload of other failures in variational inference, but you get the idea.
So you can see that I don’t have a great idea about how to use the tests to evaluate what I have been doing…
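
A long verbose report like this is easier to read once it is grouped by file. Here is a rough sketch that counts FAILED lines per test module; the regular expression is my own guess at the `pytest -v` line layout shown above, not something pytest provides.

```python
import re
from collections import Counter

# file path, "::", test id (may contain spaces), then the status word
LINE = re.compile(
    r"^(?P<file>[^\s:]+)::(?P<test>.+?)\s+"
    r"(?P<status>PASSED|FAILED|SKIPPED|ERROR|XFAIL|XPASS)\b"
)


def failures_per_file(report_text):
    """Count FAILED results per test file in `pytest -v` output."""
    counts = Counter()
    for line in report_text.splitlines():
        match = LINE.match(line.strip())
        if match and match.group("status") == "FAILED":
            counts[match.group("file")] += 1
    return counts


sample = """\
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_geweke_negative PASSED [  0%]
pymc3/tests/test_diagnostics.py::TestDiagnostics::test_effective_n FAILED [  0%]
pymc3/tests/test_transforms.py::test_log FAILED [ 79%]
pymc3/tests/test_transforms.py::test_logodds FAILED [ 79%]
"""
for path, count in failures_per_file(sample).most_common():
    print(count, path)
```

Files where nearly every test fails usually point at one shared cause, such as an import problem, rather than many separate bugs.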


#7

Some of them are easy to fix:

E           ImportError: This function requires the python library graphviz, along with binaries. The easiest way to install all of this is by running
E           
E           	conda install -c conda-forge python-graphviz

The rest of them are related to

E                   ImportError

/usr/local/lib/python3.7/site-packages/theano/scan_module/scan_perform_ext.py:63: ImportError

Seems theano is trying to import scan_perform to compile some function and failing.

And then there are a few “ValueError: Bad initial energy: inf. The model might be misspecified.” errors. I think those are tests we should try to improve.


#8

With respect to the theano import error, I don’t really understand theano at all, much less its interaction with the C compiler. Is this a test that fails because theano is assuming that it can do C compilation on the fly?


#9

It seems theano is trying to import something and failing. I have never seen it personally; maybe try https://github.com/Theano/Theano/issues/5564?


#10

I looked at the item about theano. They suggest wiping out the theano cache (which I did) and installing theano from github (which I did not do). The import error persists for me.
For those who are able to run these tests successfully, any idea where scan_perform comes from? Also, are you all installing Theano from source? I have Theano 1.0.2, which is the latest version, according to pip (I don’t use conda).
The vast majority of my test failures (including the plots test failures) seem to involve this same issue: https://bpaste.net/show/d70aac484cb5


#11

It looks like changes to C internals broke theano scan on python 3.7, which means PyMC3 won’t work for now on Python 3.7 :frowning: (https://github.com/Theano/Theano/issues/6626). Is it possible to try to run your code with python 3.6?


#12

Would it be appropriate to make PyMC3 error out on Python 3.7 then, until this can be fixed? The error messages definitely don’t tell a clear story about what’s wrong.
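
To make the suggestion concrete, here is a sketch of what such a guard could look like. It is purely hypothetical: the function name, the cutoff constant, and the message wording are my own, and nothing here reflects what PyMC3 actually does.

```python
import sys

# Assumption for illustration: treat 3.7 and later as unsupported.
FIRST_UNSUPPORTED = (3, 7)


def check_python_supported(version_info=None):
    """Raise a clear error when running on an unsupported interpreter."""
    # Fall back to the running interpreter when no version is passed in.
    version = tuple(version_info or sys.version_info)[:2]
    if version >= FIRST_UNSUPPORTED:
        raise RuntimeError(
            "Python {}.{} is not supported yet: Theano's scan fails to "
            "import on 3.7 (see Theano issue #6626). "
            "Please use Python 3.6.".format(*version)
        )
```

Called with no arguments at import time, a check like this would replace the ImportError deep inside theano with a one-line explanation.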


#13

That helped a lot, but for what it’s worth, my copy of pytest (with Python 3.6) does not seem to support the --cov options. Is this a cause for concern?


#14

Oops, you need to pip install pytest-cov. Note that pip install -r requirements-dev.txt will get you a bunch of extra libraries helpful for development (including pytest-cov).


#15

Thanks. I did that pip install, and got this, which might be another Python 3 issue:

Ignoring mock: markers 'python_version < "3.4"' don't match your environment
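
As far as I can tell, that message is informational rather than an error: pip skips a requirement when its environment marker evaluates to false for the running interpreter. Here is a rough sketch of the comparison behind the python_version < "3.4" marker. The helper name is hypothetical, and pip's own evaluation follows PEP 508 rather than this simplified tuple comparison.

```python
import sys


def marker_python_version_lt(bound, version_info=None):
    """Simplified stand-in for a `python_version < "X.Y"` marker."""
    major, minor = tuple(version_info or sys.version_info)[:2]
    wanted = tuple(int(part) for part in bound.split("."))
    return (major, minor) < wanted


# On Python 3.6 the marker is false, so the mock requirement is skipped.
print(marker_python_version_lt("3.4", (3, 6, 0)))  # False
```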

#16

After downgrading to Python 3.6, things are much better: down to 2 failures:

  • I think the graphviz test is just a library problem: I forgot to reinstall the library when I downgraded to 3.6.
  • Lines 2805 onward in the (extremely slow to load) paste show the test_bad_init results, which suggest there may be a more real problem there. I’m a bit confused, because I would expect an error to be raised when we are testing a bad initial value. IIUC, the ValueError we expected did get raised, but then we also see a chain failure error, which the test did not expect.
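
One way this pattern can arise is sketched below. This is an assumption on my part, not something the paste confirms: if the sampler wraps the worker's ValueError in a different exception type, a test that expects ValueError sees both errors in the traceback and still fails. All names here are hypothetical.

```python
def run_chain():
    # Hypothetical worker: the bad initial value triggers the expected error.
    raise ValueError("Bad initial energy: inf. The model might be misspecified.")


def sample():
    # Hypothetical wrapper: re-raises the worker's error as a different type.
    try:
        run_chain()
    except ValueError as err:
        raise RuntimeError("Chain 0 failed.") from err


def outcome(func, expected):
    """Report what a test expecting `expected` would observe."""
    try:
        func()
    except expected:
        return "expected error caught"
    except Exception as err:
        cause = type(err.__cause__).__name__ if err.__cause__ else None
        return f"unexpected {type(err).__name__} (caused by {cause})"
    return "no error raised"


print(outcome(run_chain, ValueError))
print(outcome(sample, ValueError))
```

The second call reports an unexpected RuntimeError even though a ValueError was raised first, which matches the shape of what I am seeing.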