Fitting a simple Reinforcement Learning model to behavioral data with PyMC (Jupyter NB)

HuangHam · July 30, 2022, 1:51pm

Weird. I tried running with 4 cores yesterday and the computer still crashed because of the memory issue. For my current purpose, I’ll just reduce the number of cores. I’d love to know if this problem gets fixed in the future!

As for the “other issue”, what I meant was not the difference in the expected time, but that the number of chains appeared to have changed. For example, if I use 2 or more cores, I see 29/16000, but if I set cores=1, it’s only 11/4000, which doesn’t look right. And the message is not “Sampling 4 chains” but “Sampling chain 0”. I don’t think the number of chains should depend on the number of cores. I don’t know if it’s only printing the wrong message, or if it’s actually sampling just one chain when I set cores=1 even though the chains parameter is 4.

ricardoV94 · July 31, 2022, 7:18am

I don’t remember what the output with multiple chains and 1 core should be. You can check with a simple fast model to see if there’s a bug indeed. It shouldn’t have anything to do with your complex model.

HuangHam · August 4, 2022, 3:28pm

I tried a simple example and now I know what the “problem” is. The model I used was taken from one of the PyMC online tutorials:

import arviz as az
import numpy as np
import pymc as pm

RANDOM_SEED = 8927
rng = np.random.default_rng(RANDOM_SEED)
az.style.use("arviz-darkgrid")
with pm.Model() as model:
    mu = pm.Normal("mu", mu=0, sigma=1)
    obs = pm.Normal("obs", mu=mu, sigma=1, observed=rng.standard_normal(100))

If I sample 4 chains with 2 cores:
idata = pm.sample(2000, chains=4, cores=2)
it works fine: it samples 4 chains for a total of 12000 samples (8000 draws + 4000 tuning):
[12000/12000 00:06<00:00 Sampling 4 chains, 0 divergences]
However, if I sample 4 chains with 1 core (idata = pm.sample(2000, chains=4, cores=1)), I get:
[3000/3000 00:03<00:00 Sampling chain 0, 0 divergences]
so it’s only sampling one chain (3000 samples: 2000 draws + 1000 tuning). However, this is done 4 times, once per chain, e.g.:
[3000/3000 00:01<00:00 Sampling chain 1, 0 divergences]
In summary, if you use only one core, PyMC starts sampling the next chain, in a separate progress bar, only after it finishes the previous one. But if you use cores > 1, PyMC displays a single progress bar that lumps all chains together. So I think there isn’t a bug after all. It’s just displayed differently.
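
For anyone checking the numbers, the progress bar totals reported in this post can be reproduced with a few lines of arithmetic. This is only a sketch of the behaviour described in this thread: `progress_bar_totals` is a hypothetical helper, not a PyMC function, and it assumes 1000 tuning steps per chain, which is what the 3000-per-chain figure implies.

```python
def progress_bar_totals(draws, tune, chains, cores):
    """Return the total shown by each progress bar, as described in this thread.

    Hypothetical helper for illustration only; it is not part of PyMC.
    """
    per_chain = draws + tune  # each chain runs its tuning steps plus its draws
    if cores > 1:
        # One combined bar that counts iterations across all chains.
        return [per_chain * chains]
    # One bar per chain, shown one after another.
    return [per_chain] * chains


# pm.sample(2000, chains=4, cores=2) with 1000 tuning steps per chain
print(progress_bar_totals(2000, 1000, chains=4, cores=2))  # [12000]
# pm.sample(2000, chains=4, cores=1)
print(progress_bar_totals(2000, 1000, chains=4, cores=1))  # [3000, 3000, 3000, 3000]
```

Either way the sampler does the same amount of work, 12000 iterations in total. Only the way the count is displayed differs.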

HuangHam · August 11, 2022, 3:54pm

I think I now have a good theory of why pm.DensityDist required far more memory than pm.Potential in my case. To fit RL models, there are inputs of different types. For example, the reward may be a float but the action must be an integer. In your notebook example using pm.Potential, you first separated these inputs of different types before converting them into Aesara tensors. However, to use pm.DensityDist, I think it only takes one Aesara object as the data input, which forces me to first convert all inputs of different types into one Aesara tensor, and then later separate them and cast them to the appropriate types. Maybe changing the type of an Aesara object requires additional memory.

ricardoV94 · August 11, 2022, 5:33pm

DensityDist can receive multiple inputs, you don’t need to merge everything.

But yes, that’s a plausible explanation for the differences you found.

ricardoV94 · August 25, 2022, 8:48am

The original notebook is now an official pymc-example thanks to @juanitorduz

HuangHam · October 10, 2022, 9:58pm

How can I have DensityDist take multiple observed inputs? In PyMC3 you pass a dictionary to observed, but this doesn’t seem to work in PyMC v4. I couldn’t find any example online.

ricardoV94 · April 28, 2023, 6:29pm

Sorry for the delay. DensityDist can’t take multiple observed values. Do they have the same length? In that case you could concatenate everything and split it inside the DensityDist.
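
The concatenate-then-split idea can be sketched in plain Python, without any PyMC or Aesara calls. `pack` and `unpack` are hypothetical helper names, and the data are made up for illustration. Inside an actual DensityDist logp function the same slicing would presumably be applied to the observed tensor, followed by a cast of the action half to an integer type.

```python
def pack(rewards, actions):
    """Concatenate two equal-length sequences into one flat list of floats."""
    if len(rewards) != len(actions):
        raise ValueError("rewards and actions must have the same length")
    return [float(r) for r in rewards] + [float(a) for a in actions]


def unpack(packed):
    """Split the flat list in half and restore the integer type of the actions."""
    n = len(packed) // 2
    rewards = packed[:n]
    actions = [int(a) for a in packed[n:]]
    return rewards, actions


rewards = [0.0, 1.0, 0.5]  # made-up float rewards
actions = [1, 0, 1]        # made-up integer actions
packed = pack(rewards, actions)
print(packed)          # [0.0, 1.0, 0.5, 1.0, 0.0, 1.0]
print(unpack(packed))  # ([0.0, 1.0, 0.5], [1, 0, 1])
```

The split point is simply half the packed length, which is why this only works when both inputs have the same length.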
