Kernel is now dying on a model that ran before

Hello. I’ve been developing a model that has sampled successfully multiple times in a Google Cloud Platform environment. As of four days ago, it stopped sampling and instead produces a message that the “kernel unexpectedly died.”

I thought this might be because I changed my observed distribution from Normal to Poisson, but it did the same thing when I switched back to Normal.

I’m not getting any new warnings. The only warning I’ve gotten is this one:
/opt/conda/lib/python3.7/site-packages/pymc/aesaraf.py:1010: UserWarning: The parameter 'updates' of aesara.function() expects an OrderedDict, got <class 'dict'>. Using a standard dictionary here results in non-deterministic behavior. You should use an OrderedDict if you are using Python 2.7 (collections.OrderedDict for older python), or use a list of (shared, update) pairs. Do not just convert your dictionary to this type before the call as the conversion will still be non-deterministic.

I took the code in the notebook and ran it as a script through the console and had this new message pop up:

  **kwargs,
Segmentation fault

I’m not sure if that has something to do with it. Has anyone run into this problem before?

I’m running in a Linux environment.

PyMC Version: 4.0.0b6
Aesara Version: 2.5.1
ArviZ Version: 0.12.1


How did you install PyMC? Your version of aesara doesn’t match the requirements for PyMC v4, so something may have gotten messed up when you installed PyMC and/or created your environment.
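One way to confirm which versions are actually installed in the environment the kernel is using is to query the package metadata from inside that same interpreter. The following is a minimal sketch, assuming Python 3.8 or newer (where `importlib.metadata` is in the standard library); the helper name `installed_versions` is made up for illustration.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("pymc", "aesara", "arviz")):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Run this in the notebook AND in the terminal: if the two outputs differ,
# the notebook kernel is using a different environment than the console.
for name, ver in installed_versions().items():
    print(f"{name}: {ver or 'not installed'}")
```

Comparing this output against the pins reported by pip (for example, the `aesara==...` requirement of the installed pymc) shows whether the environment matches what PyMC expects.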

I installed by typing

pip install pymc --pre

I’ll update tomorrow morning. Thank you.


Just tried to upgrade aesara and got the following error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts. pymc 4.0.0b6 requires aesara==2.5.1, but you have aesara 2.6.2 which is incompatible.

Hi, you can try installing the latest PyMC version (along with the latest aesara and aeppl) from the GitHub repo using this command:

pip install git+https://github.com/pymc-devs/pymc.git

I would also follow the instructions found here from scratch rather than try to fix a broken installation by upgrading.

Thank you. I did follow those instructions first and that’s what gave me the older version of aesara.

Alright… even after reinstalling and verifying the following, and after going back to re-clean my data and sample it down to a smaller subset, I reran my model and the kernel still died.

I uninstalled everything and installed using the page that @cluhmann suggested, Installation Guide (Linux) · pymc-devs/pymc Wiki (github.com). The command pip install pymc --pre installs aesara version 2.6.2.

When I try to upgrade aesara manually to version 2.6.6, as stated in the requirements page referenced above, 2.6.2 is still what gets installed.

When I install the latest from GitHub, the right versions seem to be installed, but the kernel still dies. Is there a way to surface any errors I may not be seeing?

When I run as a script through the command line terminal, I just see, “segmentation fault” appear.

The kernel stopping is probably due to the segmentation fault you’re seeing in the terminal. Can you share the full message that is generated when it faults?

That’s the entire message it shows… segmentation fault. Is there a setting I can set for it to print out the entire message? I was also told outside of this Discourse that it may be a shape issue. I need to check that.

Sorry, I intended to ask about whatever output precedes the fault message. That might provide some indication about where along the way the fault is occurring. Is it during model construction? Is it during sampling? Etc.
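If nothing useful precedes the fault message, one option is Python's standard `faulthandler` module, which dumps the Python-level traceback when the process receives a fatal signal such as a segmentation fault. The following is a sketch rather than something confirmed to resolve this thread's issue; the log path is a hypothetical name, and it writes to a file because a notebook kernel's stderr can be lost when the kernel dies.

```python
import faulthandler
import os
import tempfile

# Hypothetical log location; any writable path works.
FAULT_LOG_PATH = os.path.join(tempfile.gettempdir(), "pymc_fault.log")

# Keep the file object open for the life of the process: faulthandler
# writes to its file descriptor at crash time.
fault_log = open(FAULT_LOG_PATH, "w")
faulthandler.enable(file=fault_log, all_threads=True)

# Import pymc, build the model, and call the sampler after this point.
print("faulthandler enabled:", faulthandler.is_enabled())
print("traceback will be written to:", FAULT_LOG_PATH)
```

When running as a script from the terminal, `python -X faulthandler your_script.py` (with `your_script.py` standing in for the actual file) enables the same handler without code changes and writes the traceback to stderr. The last Python frames in the dump indicate whether the crash happens during model construction, compilation, or sampling.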

Sampling

Jordan

Does sampling begin? How far does it get? Does it always fail at the same place during sampling or does it vary each time you run it?

Ahh! My apologies. The sampling begins and shows “compiling” on the screen. That’s as far as it gets before it dies.

Jordan

Update:

I’ve updated aesara. I’m now working with:
PyMC Version: 4.0.0b6
Aesara Version: 2.7.1
ArviZ Version: 0.12.1

I am experimenting on a simple model:

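import pymc as pm
import pymc.sampling_jax  # makes pm.sampling_jax available

# y: observed count data (defined elsewhere)
with pm.Model() as model:

    a = pm.Normal('base_sales', 0, 1)

    likelihood = pm.ZeroInflatedPoisson('y_hat',
                                        mu=a,
                                        psi=.01,
                                        observed=y)

    trace = pm.sampling_jax.sample_numpyro_nuts(tune=1000, draws=2000)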

This now gets past the sampling before the kernel dies.

This is the last message I get before the kernel dies.

Sampling time =  0:03:05.404196
Transforming variables...
Transformation time =  0:00:00.009645

When I change the same model to sample using trace = pm.sample(tune=1000, draws=2000),
the kernel did not die but gave the following warning:

Sampling 4 chains for 1_000 tune and 2_000 draw iterations (4_000 + 8_000 draws total) took 633 seconds.
The acceptance probability does not match the target. It is 0.9035, but should be close to 0.8. Try to increase the number of tuning steps.

I’m not sure what acceptance probability means, but could it have something to do with the kernel dying when JAX is used?

When I increased the number of tuning steps with JAX, it didn’t die. Is this a coincidence, or does it make sense that the acceptance probability (whatever that is) is what is causing the kernel to die?
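For context on the term in the warning: in HMC-style samplers, each proposal comes from simulating a trajectory with a finite step size, and the acceptance probability is min(1, exp(H_start - H_end)), where H is the total energy at the start and end of the trajectory. Tuning adjusts the step size so that the average lands near the target (0.8 in the warning). The sketch below illustrates that relationship in plain Python. It is not PyMC's or NumPyro's implementation: it uses simple fixed-length HMC on a 1-D standard normal rather than NUTS, and the function name and defaults are made up for illustration.

```python
import math
import random


def mean_acceptance(step_size, n_steps=10, n_trials=2000, seed=0):
    """Average acceptance probability of leapfrog proposals for a
    1-D standard normal target (potential energy U(q) = q**2 / 2)."""
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n_trials):
        q = rng.gauss(0.0, 1.0)  # start position drawn from the target
        p = rng.gauss(0.0, 1.0)  # fresh momentum
        h_start = 0.5 * q * q + 0.5 * p * p

        # Leapfrog integration; the gradient of U(q) is q.
        p -= 0.5 * step_size * q
        for i in range(n_steps):
            q += step_size * p
            if i < n_steps - 1:
                p -= step_size * q
        p -= 0.5 * step_size * q

        h_end = 0.5 * q * q + 0.5 * p * p
        total += min(1.0, math.exp(h_start - h_end))
    return total / n_trials


for eps in (0.1, 0.5, 1.0, 1.5):
    print(f"step size {eps}: mean acceptance {mean_acceptance(eps):.3f}")
```

Smaller step sizes give acceptance closer to 1, larger ones give lower acceptance. An observed value above the target, such as 0.9035 against 0.8, typically means the tuned step size ended up somewhat smaller than necessary, which costs efficiency; on its own it does not explain a segmentation fault.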
