The mu of pm.Normal() depends on the output of a RandomVariable from the previous time step. How do I implement this?

I have been trying, and failing, to implement a very simple trial-and-error-learning model in PyMC. The model predicts a distribution N(mu_t,sigma_t) at each time point ‘t’. It then samples from the distribution on that trial. If the sample is close to a pre-defined value ‘r’, that trial is rewarded, else no reward. If a trial is rewarded then mu_t+1 is moved towards the sampled value by some step-size parameter ‘alpha’, such that the predicted distribution in trial t+1 is centred closer to the sample on trial t. Sigma is also a function of the outcome on the previous trial, to encourage exploration after an unrewarded trial.
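For concreteness, the generative process described above can be sketched in plain numpy. The parameter values here (target, step size, noise levels) are illustrative assumptions, not values from the actual task:

```python
import numpy as np

# Plain-numpy sketch of the trial-and-error model described above;
# r, alpha and the noise levels are illustrative assumptions
rng = np.random.default_rng(0)
r, alpha = 30.0, 0.5                    # rewarded value and step-size
exec_noise, explore_noise = 3.0, 10.0   # execution / exploratory noise
mu, sigma = 0.0, exec_noise + explore_noise

samples = []
for t in range(100):
    aim = rng.normal(mu, sigma)         # sample from N(mu_t, sigma_t)
    rewarded = abs(aim - r) <= 3        # reward if the sample is close to r
    if rewarded:
        mu = mu + alpha * (aim - mu)    # move mu_{t+1} towards the sample
    # shrink the noise after a reward, widen it to encourage exploration otherwise
    sigma = exec_noise if rewarded else exec_noise + explore_noise
    samples.append(aim)
```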

The error I get:
“ValueError: Random variables detected in the logp graph: {normal_rv{0, (0, 0), floatX, False}.out}.
This can happen when DensityDist logp or Interval transform functions reference nonlocal variables,
or when not all rvs have a corresponding value variable.”

My current code:

    # assumed imports for this snippet
    import numpy as np
    import pymc as pm
    import pytensor as pt
    import pytensor.tensor as T
    from pytensor.tensor.random.utils import RandomStream

    def lik(self,rot,nSteps,obs):
        with pm.Model() as model:
            nSteps = pm.intX(pm.ConstantData("nSteps", nSteps))
            rot = T.as_tensor_variable(rot)
            executionNoise = pm.Uniform(name='executionNoise', lower=0.0, upper=100.0)
            exploratoryNoise = pm.Uniform(name='exploratoryNoise', lower=0.0, upper=10000.0)
            lr = pm.Uniform(name='lr', lower=0.0, upper=1.0)
            observations = pm.Data("observations", obs)
            #outputs_info = np.array([0.,0.,0.]).astype('float64')  # desiredCurrMu, sigma, usedAim
            desiredCurr = np.array([0.]).astype('float64')[0]
            sigma = np.array([0.]).astype('float64')[0]
            usedAim = np.array([0.]).astype('float64')[0]
            outcome = np.array([0]).astype('int8')[0]
            rng = RandomStream(0)
            #rng = np.random.default_rng(0)

            def stepFunc(desiredCurr, sigma, usedAim, outcome, executionNoise, exploratoryNoise, lr, rot):
                # shrink sigma after a rewarded trial, widen it otherwise
                sig = pm.math.switch(T.eq(outcome, 1), executionNoise, executionNoise + exploratoryNoise)
                # move the predicted mean towards the previous sample if rewarded
                des = pm.math.switch(T.eq(outcome, 1), desiredCurr + ((usedAim - desiredCurr) * lr) % 360, desiredCurr)
                aim = pm.Normal.dist(mu=des, sigma=sig)
                #aim = rng.normal(des, sig)
                # wrap angles into (-180, 180]
                aim = pm.math.switch(, 180), aim - 360, aim)
                # reward if the sample lands within 3 degrees of the target
                rew = pm.math.switch(T.le(pm.math.abs(aim - rot), 3), 1, 0)
                return des, sig, aim, rew

            outputs_info, _ = pt.scan(fn=stepFunc,
                                      outputs_info=[desiredCurr, sigma, usedAim, outcome],
                                      non_sequences=[executionNoise, exploratoryNoise, lr, rot],
                                      n_steps=nSteps)
            desiredCurrMu, sigma, usedAim, outcome = outputs_info
            states = pm.Deterministic("states", usedAim)
            sigmas = pm.Deterministic("sigmas", sigma)
            likelihood = pm.Normal("likelihood", mu=states, sigma=sigmas, observed=observations)
            return model

If it’s useful, here is the original model, which is hopefully fairly self-explanatory:

How do I implement this model correctly in PyMC?

Many thanks in advance.

If you have random variables inside a scan, like pm.Normal.dist(...), you need to help pytensor pass along the random states from step to step. There’s a helper function for this in pymc.pytensorf called collect_default_updates. Here’s an example of how to do this.

Next you should make the whole scanned trajectory a single random variable. One thing that isn’t good about that example is that it does this model.register_rv thing to do that; this is not the recommended method. Use a pm.CustomDist, passing the scan output as the dist argument, as shown here.

One other thing to do is always pass strict=True to your scans. This will prevent scan from looking outside the local namespace of stepFunc when doing its thing, which can be a source of errors. I really recommend that you always explicitly pass everything to scan. It looks like you do that here, but it’s still good to have that check active.


Thanks a lot for your reply, Jesse. I’ve spent a lot of time trying to get this to work, but I can’t get CustomDist to take more than one output from the scan. I want to collect the mus and sigmas, much like in your first example, and use them in the standard log-likelihood function (pm.Normal) to compute the likelihood of the observations. I’ll try doing this using the model.register_rv method and post a solution here if I find one.

Edit: just wanted to emphasise that your response did at least help me to solve the problems that I was facing at the time, so thanks for that. Now I just need to solve the new set of problems…

So I seem to be unable to implement this with either method. I suppose that I must have some big fundamental misunderstanding about what PyMC is doing, as it’s taken me several weeks to try to implement three models that would otherwise take an hour or so.

My problem is this: I don’t want CustomDist or model.register_rv to compute any logp function. I just want them to run the scan, which stores the mu and sigma values in tensors (e.g., T.set_subtensor(sigmaList[idx], sig, inplace=True)), which I can then use after the scan to compute the likelihood of the observations with pm.Normal("likelihood", mu=mus, sigma=sigmas, observed=observations).

If I try doing this using the CustomDist function then I am forced to pass the observations in at that point, such that I am not allowed to pass the observations into the likelihood function. model.register_rv also forces me to pass in observations, leading to the same problem, otherwise I still get the ‘random variable in logp graph’ error.

Is there some other function that will run the scan with a random variable in it, but will allow me to compute the likelihood function after the scan?

You are getting the “random variables in the logp graph” error because mu and sigma are themselves random variables (they depend on aim), but you are trying to treat them like deterministic computations. You’re not actually forced to pass observations into CustomDist, so you could concatenate mu and sigma into a single output and return that combined mu_sigma from your dist function. Then your observed variable could be pm.Normal(..., mu=mu_sigma[:, 0], sigma=mu_sigma[:, 1], observed=data).

That seems a bit needlessly complex, though. Why would observing a sequence of aim (which is a normal parameterized by des and sig at each timestep) be any different from taking the sequences of des and sig, using them to parameterize a sequence of normals, and giving that the data instead? My first instinct is that they should be completely equivalent, but I’m also worried about having to integrate over all the switches, and whether the “manual” approach will do that correctly. If you let the automatic logp inference handle it, you don’t have to worry about this.

The thing is, you can’t create PyMC variables inside a Scan. Your other alternative is to work purely on the logp side of your model: assign Flat priors to all your variables outside of the scan, pass them as sequences, and then at the end define a Potential with the expression that represents the whole model joint probability.

It turns out Jesse was right - I can just use the normal parameterised by des and sig. I tried this ages ago but the posterior distributions indicated that it wasn’t working, and there were lots of divergences, so I assumed that it just wasn’t doing what I expected. It turns out I just needed to sample a lot more than I was.

Thanks to Jesse and Ricardo for your input!

So, for anyone who might find this useful, my working code is:

    # assumed imports for this snippet
    import numpy as np
    import pymc as pm
    import pytensor as pt
    import pytensor.tensor as T
    from pymc.pytensorf import collect_default_updates

    def lik(self,rot,nSteps,obs):
        with pm.Model() as model:
            def step(*args):
                desired, sig, idx, executionNoise, exploratoryNoise, lr, rot, desList = args
                prevAim = pm.Normal.dist(mu=desired, sigma=sig)
                prevRew = T.switch(T.le(T.abs(prevAim - rot), 3), 1, 0)
                des = T.switch(T.eq(prevRew, 1), desired + ((prevAim - desired) * lr) % 360, desired)
                des = T.switch(, 180), des - 360, des)
                sig = T.switch(T.eq(prevRew, 1), executionNoise, executionNoise + exploratoryNoise)
                # thread the RNG state of prevAim through the scan
                aimUpdate = collect_default_updates(inputs=args, outputs=[des, sig, prevAim])
                T.set_subtensor(desList[idx], des, inplace=True)
                return (des, sig, prevAim, idx), aimUpdate

            def generatePredictions(desiredCurr, sigma, idx, executionNoise, exploratoryNoise, lr, rot, desList, nSteps, size):
                outputs_info, _ = pt.scan(fn=step,
                                          outputs_info=[desiredCurr, sigma, None, idx],
                                          non_sequences=[executionNoise, exploratoryNoise, lr, rot, desList],
                                          n_steps=nSteps,
                                          strict=True)
                desired, sig, sampledAims, _ = outputs_info
                return sampledAims

            nSteps = pm.intX(pm.ConstantData("nSteps", nSteps))
            rot = T.as_tensor_variable(rot)
            executionNoise = pm.Uniform(name='executionNoise', lower=0.01, upper=100.0)
            exploratoryNoise = pm.Uniform(name='exploratoryNoise', lower=0.0, upper=10000.0)
            lr = pm.Uniform(name='lr', lower=0.0, upper=1.0)
            observations = pm.Data("observations", obs)
            desiredCurr = np.array([0.]).astype('float64')[0]
            sigma = np.array([0.01]).astype('float64')[0]
            usedAim = np.array([0.]).astype('float64')[0]
            outcome = np.array([0]).astype('int8')[0]
            idx = np.array([0]).astype('int8')[0]
            desList = T.empty(nSteps)
            lik = pm.CustomDist("lik", desiredCurr, sigma, idx, executionNoise, exploratoryNoise, lr, rot, desList, nSteps,
                                dist=generatePredictions, observed=observations)
            states = pm.Deterministic("states", desList)
            return model

So the posterior parameter distributions are now correct:
Notice in the code and the plot how I have attempted to record the state (the desired aim) of the model on each timestep. Clearly I’ve failed to do that correctly, as when I plot the mean against the observations, I get:
or even this, for a different participant, which I do not understand at all:

Nonetheless, the parameter distributions are roughly what is expected, given the data, so I’ll just trust that the model is working as intended and hope that subsequent model comparisons with my other two models will indicate whether or not this is the case.

If anyone reading this does know a simple way to plot the state as I intend to, please let me know. If I just plot the result of my CustomDist (“lik”), then it perfectly matches the observations, so that’s also not what I want.

The states are nonsense because you have desList = T.empty(nSteps). It’s just whatever bytes happened to be at some memory locations when the array was created.
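numpy’s equivalent behaves the same way, which makes the problem easy to see:

```python
import numpy as np

# np.empty, like T.empty, only allocates memory; it does not initialise it,
# so the array's contents are arbitrary leftover values, not zeros
a = np.empty(5)
```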

The states in your model are $\bar{\theta}^{\text{aim}}_t$, right? Since everything in your model is normal, you can take the mean across all your posterior predictive draws of lik to get them back. As an aside, I don’t like the convention of naming the observed variable “likelihood”, because it’s not really the likelihood term. It’s the distribution that generates the observed data, which is then used to compute the likelihood. Naming it “lik” obscures the fact that your predicted states come from it.


Ah, thanks, I understand now, Jesse. Plotting the mean across all posterior predictive draws works well and looks as expected. Re the naming convention: thanks for that, I agree. I just mindlessly copied it from other examples and never spent the time to fully understand what was going on (not that I didn’t try at all).

Thanks for all your help here.