Kernel is now dying on a model that ran before

jordan.howell2 · May 30, 2022, 10:38am

Hello. I’ve been developing a model that has sampled successfully multiple times in a Google Cloud Platform environment. As of four days ago, it stopped sampling and instead shows a message that the “kernel unexpectedly died.”

I thought this might be because I changed my observed distribution from Normal to Poisson, but it did the same thing when I switched back to Normal.

I’m not getting any new warnings. The only warning I’ve gotten is this one:
/opt/conda/lib/python3.7/site-packages/pymc/aesaraf.py:1010: UserWarning: The parameter 'updates' of aesara.function() expects an OrderedDict, got <class 'dict'>. Using a standard dictionary here results in non-deterministic behavior. You should use an OrderedDict if you are using Python 2.7 (collections.OrderedDict for older python), or use a list of (shared, update) pairs. Do not just convert your dictionary to this type before the call as the conversion will still be non-deterministic.

I took the code in the notebook and ran it as a script through the console and had this new message pop up:

  **kwargs,
Segmentation fault

I’m not sure if that has something to do with it. Has anyone run into this problem before?

I’m running in a Linux environment.

PyMC Version: 4.0.0b6
Aesara Version: 2.5.1
ArviZ Version: 0.12.1

cluhmann · June 1, 2022, 3:17am

How did you install PyMC? Your version of aesara doesn’t match the requirements for PyMC v4, so something may have gotten messed up when you installed PyMC and/or created your environment.
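
One way to check for this kind of mismatch is to compare what is installed against what the installed PyMC declares as its requirement. The following is a minimal sketch using the standard library's `importlib.metadata` (Python 3.8+; on Python 3.7 it assumes the `importlib_metadata` backport is available). The helper names `installed_versions` and `declared_requirements` are made up for this illustration.

```python
try:
    from importlib import metadata  # Python 3.8+
except ImportError:
    import importlib_metadata as metadata  # assumed backport on Python 3.7


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


def declared_requirements(package, dependency):
    """Return the requirement strings of `package` whose text starts with `dependency`.

    This is a simple prefix match on the raw requirement strings, not a full parser.
    """
    try:
        requirements = metadata.requires(package) or []
    except metadata.PackageNotFoundError:
        return []
    prefix = dependency.lower()
    return [req for req in requirements if req.lower().startswith(prefix)]


if __name__ == "__main__":
    print(installed_versions(["pymc", "aesara", "arviz"]))
    print(declared_requirements("pymc", "aesara"))
```

If the installed aesara version does not satisfy the requirement string printed on the second line, the environment is inconsistent and rebuilding it from scratch is probably safer than upgrading in place.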

jordan.howell2 · June 2, 2022, 1:15am

I installed by typing

pip install pymc --pre

I’ll update tomorrow morning. Thank you.

jordan.howell2 · June 2, 2022, 10:24am

Just tried to upgrade aesara and got the following error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. pymc 4.0.0b6 requires aesara==2.5.1, but you have aesara 2.6.2 which is incompatible.

DanhPhan · June 2, 2022, 11:11am

Hi, you can try installing the latest pymc version (along with the latest aesara and aeppl) from the GitHub repos by using this command:

pip install git+https://github.com/pymc-devs/pymc.git

cluhmann · June 2, 2022, 1:20pm

I would also follow the instructions found here from scratch rather than try to fix a broken installation by upgrading.

jordan.howell2 · June 2, 2022, 1:43pm

Thank you. I did follow those instructions first and that’s what gave me the older version of aesara.

jordan.howell2 · June 3, 2022, 2:20pm

Alright… even after reinstalling and verifying the versions (details below), and after re-cleaning my data and sampling it down to a smaller subset, I reran my model and the kernel still died.

I uninstalled everything and reinstalled using the page that @cluhmann suggested, Installation Guide (Linux) · pymc-devs/pymc Wiki (github.com). Running pip install pymc --pre installs aesara version 2.6.2.

When I try to upgrade aesara manually to version 2.6.6, as stated on the requirements page referenced above, 2.6.2 is what ends up installed.

When I install the latest from GitHub, the right versions seem to be installed, but the kernel still dies. Is there a way to surface any errors I may not be seeing?

When I run it as a script from the command line, I just see “segmentation fault” appear.

cluhmann · June 4, 2022, 4:23pm

The kernel stopping is probably due to the segmentation fault you’re seeing in the terminal. Can you share the full message that is generated when it faults?

jordan.howell2 · June 4, 2022, 8:32pm

That’s the entire message it shows: segmentation fault. Is there a setting I can use to make it print the entire message? I was also told outside of the Discourse that it may be a shape issue. I need to check that.

cluhmann · June 4, 2022, 9:36pm

Sorry, I intended to ask about whatever output precedes the fault message. That might provide some indication about where along the way the fault is occurring. Is it during model construction? Is it during sampling? Etc.
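
On getting more than the bare "Segmentation fault" line: Python's standard-library `faulthandler` module can dump the Python-level traceback when the process receives a fatal signal such as SIGSEGV. The sketch below writes that dump to a log file, which also works when stderr is not a real file. The helper name `enable_fault_log` and the log file name are hypothetical.

```python
import faulthandler


def enable_fault_log(path):
    """Enable faulthandler and send its traceback dump to the file at `path`.

    The returned file object must stay open for as long as the handler is
    enabled, because faulthandler writes to its file descriptor.
    """
    log_file = open(path, "w")
    faulthandler.enable(file=log_file, all_threads=True)
    return log_file


# Usage at the very top of the script, before importing pymc:
#   fault_log = enable_fault_log("fault_traceback.log")
```

For a script, running it as `python -X faulthandler script.py` enables the same handler without code changes and prints the dump to stderr. The traceback only shows where in the Python code the crash happened, not the underlying C-level cause, but that should be enough to tell model construction, compilation, and sampling apart.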

jordan.howell2 · June 4, 2022, 10:00pm

Sampling

Jordan

cluhmann · June 4, 2022, 10:26pm

Does sampling begin? How far does it get? Does it always fail at the same place during sampling or does it vary each time you run it?

jordan.howell2 · June 5, 2022, 1:16am

Ahh! My apologies. The sampling begins and shows “compiling” on the screen. That’s as far as it gets before it dies.

Jordan
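
To narrow down which step a crash happens in, one option is to print a flushed marker before and after each stage of the script. This is only a sketch: the `stage` helper is made up for illustration and is not part of PyMC. After a segmentation fault, the last "[start]" line without a matching "[done]" line names the stage that crashed.

```python
import time
from contextlib import contextmanager


def _emit(message):
    # flush=True so the line is written out before a possible hard crash
    print(message, flush=True)


@contextmanager
def stage(name, log=_emit):
    """Log when a named stage starts and, if it finishes normally, how long it took."""
    log(f"[start] {name}")
    started = time.perf_counter()
    yield
    log(f"[done]  {name} ({time.perf_counter() - started:.2f}s)")


# Usage (the model code itself is whatever the script already contains):
#   with stage("build model"):
#       ...  # the `with pm.Model()` block
#   with stage("sample"):
#       ...  # the sampling call
```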

jordan.howell2 · June 7, 2022, 12:40pm

Update:

I’ve updated aesara. I’m now working with:
PyMC Version: 4.0.0b6
Aesara Version: 2.7.1
ArviZ Version: 0.12.1

I am experimenting on a simple model:

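import pymc as pm
import pymc.sampling_jax  # makes pm.sampling_jax available

with pm.Model() as model:

    a = pm.Normal('base_sales', 0, 1)

    # y is the observed count data (not shown in this post)
    likelihood = pm.ZeroInflatedPoisson('y_hat',
                                        mu=a,
                                        psi=.01,
                                        observed=y)

    trace = pm.sampling_jax.sample_numpyro_nuts(tune=1000, draws=2000)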
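
For reference, here is a plain-Python sketch of the zero-inflated Poisson log-probability as I understand PyMC's parameterization, where psi is the expected proportion of Poisson draws and the remaining 1 - psi is extra zeros. The function `zip_logp` is written for this illustration only, and returning negative infinity for invalid parameters is a choice made in this sketch. It shows two things about the model above: with psi = .01 almost all of the mass sits on zero, and a Normal(0, 1) prior on mu can propose non-positive rates, which are not valid for a Poisson.

```python
import math


def zip_logp(x, psi, mu):
    """Log-probability of count x under a zero-inflated Poisson.

    psi: proportion of Poisson draws (0 < psi < 1); 1 - psi is extra zeros.
    mu:  Poisson rate; this sketch treats mu <= 0 as invalid and returns -inf.
    """
    if not (0.0 < psi < 1.0) or mu <= 0.0 or x < 0:
        return float("-inf")
    if x == 0:
        # a zero can come from the inflation part or from the Poisson part
        return math.log((1.0 - psi) + psi * math.exp(-mu))
    poisson_logp = -mu + x * math.log(mu) - math.lgamma(x + 1)
    return math.log(psi) + poisson_logp


if __name__ == "__main__":
    print(math.exp(zip_logp(0, psi=0.01, mu=2.0)))  # probability of observing a zero
    print(zip_logp(3, psi=0.01, mu=-0.5))           # negative rate is invalid
```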

This now gets past the sampling before the kernel dies.

This is the last message I get before the kernel dies.

Sampling time =  0:03:05.404196
Transforming variables...
Transformation time =  0:00:00.009645

When I changed the same model to sample using trace = pm.sample(tune=1000, draws=2000),
the kernel did not die, but I got the following warning:

Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 633 seconds.
The acceptance probability does not match the target. It is 0.9035, but should be close to 0.8. Try to increase the number of tuning steps.

I’m not sure what acceptance probability means, but could it have something to do with the kernel dying when JAX is used?

When I increased the number of tuning steps with JAX, it didn’t die. Is this a coincidence, or does it make sense that acceptance probability (whatever that is) is what is causing the kernel to die?
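
On what "acceptance probability" means: in MCMC, each proposed move is accepted with some probability, and the sampler tunes its step size so that the average acceptance lands near a target (0.8 in the warning above; `pm.sample` has a `target_accept` argument for changing it). The sketch below is not NUTS. It uses a simple random-walk Metropolis sampler on a standard normal, with a made-up helper `metropolis_acceptance_rate`, only to show how step size drives the average acceptance probability. NUTS uses a related statistic averaged over each trajectory.

```python
import math
import random


def metropolis_acceptance_rate(step_size, n_steps=5000, seed=0):
    """Average acceptance probability of random-walk Metropolis on N(0, 1)."""
    rng = random.Random(seed)
    x = 0.0
    total = 0.0
    for _ in range(n_steps):
        proposal = x + rng.gauss(0.0, step_size)
        # log density ratio of the standard normal target, proposal vs. current
        log_ratio = -0.5 * proposal * proposal + 0.5 * x * x
        accept_prob = math.exp(min(0.0, log_ratio))  # equals min(1, ratio)
        total += accept_prob
        if rng.random() < accept_prob:
            x = proposal
    return total / n_steps


if __name__ == "__main__":
    for step in (0.1, 1.0, 5.0):
        print(step, round(metropolis_acceptance_rate(step), 3))
```

Small steps are accepted almost always and large steps are often rejected. An acceptance rate above the target, as in the 0.9035 versus 0.8 warning, usually just means the tuned step size ended up smaller than it needed to be, so by itself it seems an unlikely cause of a crash.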
