How to use pytensor's scan function

I’m confused about how to loop using pytensor’s scan function. What I am doing is building a package which uses PyMC in the background, but doesn’t have the user interact directly with the PyMC interface. As a result of this, I am left with a loop in my model callout which looks something like the following:

p = {0 : 'pm.Uniform("p0", lower = 0.0, upper = 1.0)',
     1 : 'pm.Uniform("p1", lower = 0.0, upper = 1.0)'}

p_pymc = [0]*2

with pm.Model() as test:
    
    for ii in range(2):
        p_pymc[ii] = eval(p[ii])
        
    p_pymc = pt.as_tensor_variable(p_pymc)

How do I go about replacing a loop of this kind with scan? Is is possible?

The purpose of scan is to loop in cases where 1) you need to update and maintain a state that will be passed between steps of the loop, and 2) you want the gradient of the whole operation.

In this case it looks like you just want to loop over a dictionary of distributions and use them to dynamically construct a model? Here you can just use a normal python loop.

The problem isn’t so much that I can’t, more that PyMC doesn’t like it if the loop gets too big. Say you’re using a few hundred priors instead. That large a loop throws an error when calling pm.Potential further down the line.

Hundreds of priors shouldn’t be a problem at all for PyMC. From what you posted, you’re dynamically constructing a model, which scan can’t help you do.

You could probably get big speedups by fusing those priors together, for example writing a routine that checks that p[0] and p[1] are copies of the same distribution, and creates a single p[0] = 'pm.Uniform("p", lower=0.0, upper=1.0, size=(2,))'. Vectorizing priors like this is much more performant than creating hundreds of individuals.

Maybe this will make the issue I’m having a little clearer. Here is a post I worked on last year for a bit that was throwing the error in questions.

When my loops become too large, I get a bracket nesting level error. Granted, in that old case, I was looping across a bunch of potentials as well, but I think the issue will remain the same.

Noted on the vectorizing as well, will be more aware of that going forwards.

1 Like

I still don’t see why you need a scan, or a loop for that matter. Couldn’t you make a big matrix, like:

mu = pm.Normal('mu_c0', mu = mean_mu, sigma = sd_mu, shape = N_subjects)
b = pm.Uniform("b", lower = 0.1, upper = 1.0)

zeros = pt.zeros((N_subjects, 1))
ones  = pt.ones((N_subjects, 1))

# This is just zero? 
log_ten= pt.log10(ones)

phi_pymc = pt.concatenate([mu[:, None], zeros, zeros, ones, b.repeat(N_subjects)[:, None], zeros, zeros, log_ten, log_ten, log_ten, ones, zeros, zeros, ones], axis=1)

Now phi_pymc is an (N_subjects, 14) matrix where each row is a subject. You can then do whatever you need to do inside your log-likelihood functions all in one shot by passing in the entire matrix?

1 Like