How to use pytensor's scan function

mmurrow95 · February 27, 2023, 6:48pm

I’m confused about how to loop using pytensor’s scan function. What I am doing is building a package which uses PyMC in the background, but doesn’t have the user interact directly with the PyMC interface. As a result of this, I am left with a loop in my model callout which looks something like the following:

p = {0 : 'pm.Uniform("p0", lower = 0.0, upper = 1.0)',
     1 : 'pm.Uniform("p1", lower = 0.0, upper = 1.0)'}

p_pymc = [0]*2

with pm.Model() as test:
    
    for ii in range(2):
        p_pymc[ii] = eval(p[ii])
        
    p_pymc = pt.as_tensor_variable(p_pymc)

How do I go about replacing a loop of this kind with scan? Is is possible?

jessegrabowski · February 27, 2023, 7:06pm

The purpose of scan is to loop in cases where 1) you need to update and maintain a state that will be passed between steps of the loop, and 2) you want the gradient of the whole operation.

In this case it looks like you just want to loop over a dictionary of distributions and use them to dynamically construct a model? Here you can just use a normal python loop.

mmurrow95 · February 27, 2023, 7:10pm

The problem isn’t so much that I can’t, more that PyMC doesn’t like it if the loop gets too big. Say you’re using a few hundred priors instead. That large a loop throws an error when calling pm.Potential further down the line.

jessegrabowski · February 27, 2023, 7:17pm

Hundreds of priors shouldn’t be a problem at all for PyMC. From what you posted, you’re dynamically constructing a model, which scan can’t help you do.

You could probably get big speedups by fusing those priors together, for example writing a routine that checks that p[0] and p[1] are copies of the same distribution, and creates a single p[0] = 'pm.Uniform("p", lower=0.0, upper=1.0, size=(2,))'. Vectorizing priors like this is much more performant than creating hundreds of individuals.

mmurrow95 · February 27, 2023, 7:25pm

Maybe this will make the issue I’m having a little clearer. Here is a post I worked on last year for a bit that was throwing the error in questions.

When my loops become too large, I get a bracket nesting level error. Granted, in that old case, I was looping across a bunch of potentials as well, but I think the issue will remain the same.

Noted on the vectorizing as well, will be more aware of that going forwards.

jessegrabowski · February 27, 2023, 7:49pm

I still don’t see why you need a scan, or a loop for that matter. Couldn’t you make a big matrix, like:

mu = pm.Normal('mu_c0', mu = mean_mu, sigma = sd_mu, shape = N_subjects)
b = pm.Uniform("b", lower = 0.1, upper = 1.0)

zeros = pt.zeros((N_subjects, 1))
ones  = pt.ones((N_subjects, 1))

# This is just zero? 
log_ten= pt.log10(ones)

phi_pymc = pt.concatenate([mu[:, None], zeros, zeros, ones, b.repeat(N_subjects)[:, None], zeros, zeros, log_ten, log_ten, log_ten, ones, zeros, zeros, ones], axis=1)

Now phi_pymc is an (N_subjects, 14) matrix where each row is a subject. You can then do whatever you need to do inside your log-likelihood functions all in one shot by passing in the entire matrix?

Topic		Replies	Views
PyMC Compilation Nesting with pytensor.scan	12	91	September 7, 2024
Nested scan outer loop only iterates over first step v5 scan_ops , modeling	4	200	February 16, 2024
Using PyTensor.scan to loop over multidimensional array v5	13	1349	June 1, 2023
Using theano scan with pymc3 Questions	3	876	December 4, 2020
PyTensor scan of an array with PyMC sampling pytensor	7	429	March 29, 2024

How to use pytensor's scan function

Related topics