How to control random seed for RVs in V 4.0?

marius · June 6, 2022, 10:00pm

Congrats on the new release! I’m eager to get my MUSE inference package working with the release. It had been working on 4.0.0b6; the only new thing in the release I can’t figure out is how to handle random seeding.

Essentially what my library needs is to compile some custom sampling functions for the RVs, as a simple example consider:

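import aesara
import pymc as pm

with pm.Model() as model:
    x = pm.Normal("x")
    y = pm.Normal("y")

# compile a function that draws one sample of every RV in the model
sample_x_y = aesara.function([], model.basic_RVs)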

then it needs to call sample_x_y and have control over the random seed (and fwiw the real thing does something slightly more complex so using eg sample_prior_predictive isn’t an option).

The 4.0.0b6 solution, based on model.rng_seq (which seems to have disappeared), looked like this, but it was hacky and slow, so I’m happy to be rid of it, guessing that the new thing is better.

So what’s the appropriate way to do this now? Thanks.

ricardoV94 · June 7, 2022, 6:03am

You can call pymc.aesaraf.compile_pymc, which accepts a random seed and will reseed your variables before compiling an Aesara function, as well as set automatic updates so that values vary between calls.

Something like this (writing from memory):

import numpy as np
from pymc.aesaraf import compile_pymc

# I think seed can be a seedsequence, but also just an integer
seed = np.random.SeedSequence(123)

sample_x_y = compile_pymc([], model.basic_RVs, random_seed=seed)

If you need to reseed between calls you can also do that (there are some utilities for that in the same aesaraf module), but that would slow you down.

marius · June 7, 2022, 6:24am

Thanks, hmm, yea that’s exactly what I need to do, and it looks like I can do that with those utilities, this seems to give me exactly it:

sample_x_y = aesara.function([], model.basic_RVs)
pymc.aesaraf.reseed_rngs(pymc.aesaraf.find_rng_nodes(model.basic_RVs), seed)
sample_x_y()

Now I can control which samples I get with seed, without having to recompile every time. Glancing at the code, I think it’s kind of doing what my hacky thing was doing before anyway, but it seems to work. Thanks!

ricardoV94 · June 7, 2022, 6:49am

You should probably cache the result of find_rng_nodes in that case. We do something like this for model.initial_point here: pymc/initial_point.py at da1f63b95f64d02c958302ddc44ee6d8b838a39d · pymc-devs/pymc · GitHub

ricardoV94 · June 7, 2022, 6:52am

Out of curiosity, why do you need to reseed between calls? The values will still follow a deterministic sequence across calls.

marius · June 7, 2022, 7:13am

The algorithm is basically solving an equation like \langle f(x,\theta) \rangle_{x\sim\mathcal{P}(x\,|\,\theta)} = 0 for \theta where f is some function and the (Monte-Carlo-computed) average is over a bunch of samples of x from a likelihood \mathcal{P}(x\,|\,\theta). It really helps the convergence of the solver if these x are drawn with the same seeds at each iteration of the solver, since it makes the MC average vary smoothly with \theta, rather than having a new random MC error for each test value of \theta.

So basically you seed, generate a bunch of x given the current \theta, use these to compute the next \theta, reseed, generate same-seeded x's but with the new \theta, etc…
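
As a rough illustration of why reusing the seed helps, here is a toy sketch. It is not MUSE's or PyMC's actual code: it assumes x ~ Normal(\theta, 1) and a made-up f(x, \theta) = x - 2, and it uses Python's standard-library random module in place of compiled Aesara functions. The names mc_average and solve are hypothetical.

```python
import random


def mc_average(theta, seed, n=1000):
    """Monte Carlo estimate of <f(x, theta)> for x ~ Normal(theta, 1).

    Toy assumption: f(x, theta) = x - 2, so the exact root is theta = 2.
    """
    # Reseeding on every call means the same underlying standard-normal
    # draws are reused, only shifted by the current theta.
    rng = random.Random(seed)
    total = 0.0
    for _ in range(n):
        x = rng.gauss(theta, 1.0)
        total += x - 2.0
    return total / n


def solve(seed, lo=-10.0, hi=10.0, tol=1e-8):
    """Bisection on the fixed-seed MC average.

    With a fixed seed the average is a deterministic, monotone function of
    theta, so a plain root finder can be applied to it.
    """
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mc_average(mid, seed) > 0.0:
            hi = mid
        else:
            lo = mid
    return 0.5 * (lo + hi)


theta_hat = solve(seed=123)
print(theta_hat)
```

With the seed held fixed, the MC error is the same for every trial \theta, so the root finder sees a smooth function; the answer is still off from 2 by that fixed MC error. Drawing fresh samples on every evaluation would instead give bisection a noisy, non-reproducible sign near the root.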

twiecki · June 7, 2022, 7:41am

I didn’t know muse existed – looks cool! Did you announce it here?

marius · June 7, 2022, 7:43am

Thanks! Probably in the next couple of days, just need to get these last tweaks to get it working on the final V4 release and some final API tweaks.

ricardoV94 · June 7, 2022, 7:45am

Hmm… sounds a bit funny. I am not sure what the properties of the new draws with new theta but same seed will be…

Btw if you are running this algorithm in parallel you will want to actually swap the rng variables so that reseeding them in one process will not reseed them in another.

In the initial point code I sent you, that’s done some lines above.

twiecki · June 7, 2022, 7:49am

Did you try and integrate the JAX implementation of MUSE with a pymc model running on the new JAX backend?

ricardoV94 · June 7, 2022, 7:54am

I don’t think we support RandomVariables (updates) in the JAX backend

marius · June 7, 2022, 8:05am

No, but it’s vaguely on my to-do list to look into it. If you glanced at the docs (which it sounds like you did, since you saw the JAX interface), you saw the PyMC version has some overhead, since the algorithm itself is pure Python; only the calls to various posterior gradients and transformations are compiled with Aesara. Is that what this might help with? Or are posterior gradients themselves just faster with JAX? (Any resources you might point me to about this new backend for a library author would be helpful!)

twiecki · June 7, 2022, 1:21pm

If the algorithm itself is written in Python there won’t be any speed-improvements. Only if you implemented the algorithm in JAX (or better yet: Aesara directly like aehmc does GitHub - aesara-devs/aehmc: An experimental HMC implementation in Aesara) would you get these benefits.

marius · June 7, 2022, 6:17pm

Thanks, good to know. A JAX / JIT-able version is close, although there are no plans for an Aesara version. It’s maybe not the highest priority since MUSE isn’t really for super cheap posteriors anyway. But how transparent is the “JAX backend” to PyMC? Like, can I just put a jax.jit around some function which internally is calling into aesara.function-compiled code and ? Or is it not that simple? (Sorry for the barrage of questions, no hurry.)

twiecki · June 7, 2022, 7:30pm

Pretty much that simple, you can look at pymc/sampling_jax.py at main · pymc-devs/pymc · GitHub for some inspiration.

junpenglao · June 7, 2022, 8:51pm

I’d still recommend upstreaming MUSE to BlackJAX; then PyMC can use MUSE through BlackJAX directly.

ricardoV94 · June 8, 2022, 6:05am

I understand the excitement about JAX, but I would not recommend it for this algorithm, simply because we don’t have a way to sample from the prior of a PyMC model with the jax backend and this algorithm requires it.

marius · June 8, 2022, 6:11pm

Thanks, BlackJAX is definitely on the horizon still. I suppose based on what @ricardoV94 mentions that would mean automatic PyMC integration would then just work, but it could still be used in BlackJAX alone? Or does blackjax not specify an interface for problems to define how to sample from the prior, only to evaluate the posterior? That indeed would be a limitation for using MUSE in BlackJAX. (also sorry this question might be better suited for the BlackJAX repo, I can ask there, at least start looking around)

junpenglao · June 9, 2022, 5:05am

Oh, I didn’t know that forward sampling is not working yet in JAX mode (sorry @ricardoV94, I missed your earlier reply). In that case the automatic PyMC integration won’t just work, as you will hit a bug in the JAX backend when doing prior sampling.

twiecki · June 9, 2022, 7:47am

I also didn’t know that, I thought that whole code was now aesara-fied? Should we open an issue?
