ValueError: Random variables detected in the logp graph - what does it mean?

Dear all,
I have the following model:


import numpy as np
import pymc as pm
import pytensor.tensor as pt

exp_rewards = np.random.rand(300) < 0.75  # simulated binary rewards

T = 10
with pm.Model() as model:
    k = pm.Normal("k", -2.3, 0.1)
    K = pm.Deterministic("K", pt.exp(k))
    vi = pm.Normal.dist(mu=-2.7, sigma=K)        # init dist for the v walk
    ri = pm.Normal.dist(mu=0, sigma=pt.exp(vi))  # init dist for the rw walk
    v = pm.GaussianRandomWalk("v", init_dist=vi, mu=0, sigma=K, steps=T - 1)
    V = pm.Deterministic("V", pt.exp(-v))
    rw = pm.GaussianRandomWalk("rw", init_dist=ri, mu=0, sigma=V[:-1], steps=T - 1, shape=T)
    r = pm.Deterministic("r", pt.math.sigmoid(rw))
    y = pm.Bernoulli("y", r, observed=exp_rewards[:T])

    trace = pm.sample(chains=4)

(The goal is actually to reproduce this model, but it would require a “BetaRandomWalk” for rw)

But it is giving me the following error:

ValueError: Random variables detected in the logp graph: {normal_rv{0, (0, 0), floatX, False}.out}.
This can happen when DensityDist logp or Interval transform functions reference nonlocal variables,
or when not all rvs have a corresponding value variable.

By commenting/uncommenting lines, it seems the rw declaration is introducing the problem, but I don’t understand what the error actually means.
What is the problem?

By reading similar threads, such as:

I guess the problem is something like this: the GaussianRandomWalk for rw requires init_dist (ri) to be an unregistered (unnamed) distribution, but this in turn depends on some named variable (like k, which must be named, as it is a variable we want to keep track of in the model). So in the end there is an unnamed distribution between two named variables, and PyMC doesn’t like it.
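If my guess is right, a hypothetical minimal example should trigger the same error (just to illustrate the guess, not my actual model):

import pymc as pm

with pm.Model() as m:
    a = pm.Normal("a")             # named variable, gets a value variable
    b = pm.Normal.dist(mu=a)       # unnamed dist depending on a named variable
    c = pm.Normal.dist(mu=b)       # unnamed dist depending on another unnamed dist
    # b ends up in the logp graph of x with no value variable -> same ValueError
    x = pm.GaussianRandomWalk("x", init_dist=c, steps=4)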
On the other hand, if I comment out the rw line (the one right after V), the model runs just fine, so the GaussianRandomWalk for v doesn’t trigger the same problem…
I am confused… :slight_smile:

I think the problem is here:

ri = pm.Normal.dist(mu=0, sigma=pt.exp(vi))

ri can’t depend on another unnamed dist. Note that neither vi nor ri is ever used in the model other than to tell PyMC what kind of distribution the initial point has.

You probably want to define ri like this:

ri = pm.Normal.dist(mu=0, sigma=pt.exp(v[0]))
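That is, move the ri definition after v, so it references the drawn v[0] (a registered variable) instead of the unnamed vi. A sketch, with the rest of your model unchanged:

    v = pm.GaussianRandomWalk("v", init_dist=vi, mu=0, sigma=K, steps=T - 1)
    V = pm.Deterministic("V", pt.exp(-v))
    # ri now depends on v[0], which has a value variable, not on the unnamed vi
    ri = pm.Normal.dist(mu=0, sigma=pt.exp(v[0]))
    rw = pm.GaussianRandomWalk("rw", init_dist=ri, mu=0, sigma=V[:-1], steps=T - 1, shape=T)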

There’s a pm.RandomWalk that accepts arbitrary innovation distributions, but I doubt you want a straightforward Beta random walk where the innovations are just Betas, as that will drift outside of [0, 1].
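To illustrate (a sketch using pm.RandomWalk’s init_dist/innovation_dist arguments; the names and priors here are placeholders): every innovation is a positive draw in (0, 1), so the walk can only increase and leaves the unit interval after a few steps.

import pymc as pm

with pm.Model() as naive_beta_rw:
    # Each step adds a Beta draw in (0, 1) to the previous value,
    # so the walk grows monotonically - not useful for a quantity
    # that must stay inside [0, 1].
    init = pm.Beta.dist(alpha=1, beta=1)
    innovation = pm.Beta.dist(alpha=2, beta=2)
    walk = pm.RandomWalk("walk", init_dist=init, innovation_dist=innovation, steps=9)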

There was a model some time ago where each step was a posterior update; if your model is similar, it still isn’t easy to implement in PyMC: Behrens' Bayesian Learner Model : How to share parameters between steps? - #45 by ricardoV94

If on the other hand your model is just a simple sequence of interrelated points generated from Betas (as opposed to interrelated full-blown distributions), you could implement it with a CustomDist with a Scan these days (just make sure to pass a logodds transform): PyMC_timeseries_AR.ipynb · GitHub
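Roughly along these lines (a rough sketch following the pattern in that notebook; the mean/concentration Beta parametrization, the priors, and all names here are placeholders, not something from your model):

import pymc as pm
import pytensor
import pytensor.tensor as pt
from pymc.pytensorf import collect_default_updates
from pymc.distributions.transforms import logodds

T = 10  # number of steps

def beta_rw_dist(r0, nu, size):
    # Generative graph of the walk: each point is a Beta draw whose
    # mean is the previous point (alpha = mu * nu, beta = (1 - mu) * nu).
    def step(r_tm1, nu):
        r_t = pm.Beta.dist(alpha=r_tm1 * nu, beta=(1 - r_tm1) * nu)
        return r_t, collect_default_updates([r_t])

    rw, _ = pytensor.scan(
        fn=step,
        outputs_info=[r0],
        non_sequences=[nu],
        n_steps=T - 1,
        strict=True,
    )
    return rw

with pm.Model() as beta_rw_model:
    nu = pm.Gamma("nu", alpha=2, beta=0.1)  # concentration of each Beta step
    r0 = pm.Beta("r0", alpha=1, beta=1)     # registered starting point
    # the logodds transform keeps the sampler on an unconstrained scale
    r = pm.CustomDist("r", r0, nu, dist=beta_rw_dist, transform=logodds)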


Thanks @ricardoV94 !

After fixing ri using your suggestion, the model now runs correctly.
No divergences, and a few minutes of sampling for T=300 and 1000+1000 samples.

There was a model sometime ago, where each step was a posterior update, and if your model is similar it still isn’t easy to implement in PyMC: Behrens’ Bayesian Learner Model : How to share parameters between steps? - #45 by ricardoV94

Yes, it is exactly the same model!!
I didn’t expect to find it here :slight_smile:
I’ll have a look at the thread and see if I can also implement the “Beta random walk” as reported in the paper.

I had actually implemented the original model in PyMC using a naive approach, with a for loop for both the v and r variables, but the result was too slow. With T=300, creating 300 different v’s and 300 different r’s made the “build time” (the time between the pm.sample invocation and when the sampling process actually starts) grow beyond 48 hours… a bit too long :slight_smile:
It was also complaining a lot about divergences (but I hadn’t done any parameter tuning, so maybe that was the problem).
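For reference, the naive version looked roughly like this (a sketch from memory):

with pm.Model() as slow_model:
    k = pm.Normal("k", -2.3, 0.1)
    K = pm.Deterministic("K", pt.exp(k))
    v = [pm.Normal("v_0", mu=-2.7, sigma=K)]
    rw = [pm.Normal("rw_0", mu=0, sigma=pt.exp(v[0]))]
    for t in range(1, T):
        # one named variable per time step -> a huge graph for T=300
        v.append(pm.Normal(f"v_{t}", mu=v[t - 1], sigma=K))
        rw.append(pm.Normal(f"rw_{t}", mu=rw[t - 1], sigma=pt.exp(-v[t - 1])))
    r = pm.Deterministic("r", pt.math.sigmoid(pt.stack(rw)))
    y = pm.Bernoulli("y", r, observed=exp_rewards[:T])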