Looping in pymc3/theano

tpaixao · November 3, 2021, 3:37pm

Hi All,

I am trying to implement a model that involves looping but am having a hard time implementing this in pymc3.

The basic (unrolled loop) model would look something like this:

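import pymc3 as pm

popsize = 600      # individuals in the population
sample_size = 100  # size of each observed sample

with pm.Model() as model:
    # Initial gene frequency
    p0 = pm.Beta("p0", 1, 1)

    # Two generations: each frequency is a binomial draw from the previous one
    p1 = pm.Binomial("n1", popsize, p0) / popsize
    p2 = pm.Binomial("n2", popsize, p1) / popsize

    # Observed counts in samples from the initial and final populations
    r0 = pm.Binomial("r0", sample_size, p0, observed=[10, 14, 12])
    rT = pm.Binomial("rT", sample_size, p2, observed=[7, 16, 13])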

Now, the question is how to loop the p1, p2, … part so that I don’t have to write out every binomial sampling step explicitly (I would obviously like an arbitrary number of them).

Thanks in advance for any help!

tiago

tpaixao · November 3, 2021, 4:26pm

Doing something like this:

popsize=600
sample_size=100
with pm.Model() as model:
    
    p0 = pm.Beta("p0",1,1,shape=1)
    
    def evolve(prev_p):
        return pm.Binomial.dist(popsize,prev_p).random()/popsize
    
    result,updates = theano.scan(evolve,
                                 sequences=[p0],
                                 n_steps=2)
    
    p2 = result[-1]
    
    r0 = pm.Binomial("r0",sample_size,p0,observed=[10,14,12])
    rT = pm.Binomial("rT",sample_size,p2,observed=[7,16,13])

results in ValueError: Cannot resolve inputs for ['p0[t]'].
Variations of this also result in errors…

jbuddy_13 · November 3, 2021, 4:34pm

Hi Tiago,
Could you add some more context? I might be able to help if I had a tangible use case in mind.

tpaixao · November 3, 2021, 4:58pm

Thanks @jbuddy_13.

The idea is to simulate a finite population evolving under genetic drift for several generations. You have a population of N individuals, of which initially n_0 have a certain gene (so, with frequency p_0 = n_0/N). Every new generation, a binomial sample of the previous generation is drawn, resulting in a new gene frequency p_{t} = n_{t}/N, where n_t \sim Bin(N,p_{t-1}).
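
For readers who want to see the process itself, here is a minimal forward simulation of the drift recursion described above, written in plain Python rather than PyMC3. It is only a sketch of the generative process, not an inference model; the function name `simulate_drift`, the starting frequency 0.12, and the seed are assumptions made for illustration.

```python
import random


def simulate_drift(p0, popsize, generations, rng):
    """Return the gene frequencies [p_0, p_1, ..., p_T] of one drifting population."""
    freqs = [p0]
    for _ in range(generations):
        p_prev = freqs[-1]
        # n_t ~ Binomial(popsize, p_{t-1}), drawn here as a sum of Bernoulli trials
        n_t = sum(rng.random() < p_prev for _ in range(popsize))
        freqs.append(n_t / popsize)
    return freqs


rng = random.Random(0)  # fixed seed (illustrative) so the run is reproducible
trajectory = simulate_drift(p0=0.12, popsize=600, generations=2, rng=rng)
print(trajectory)
```

Each frequency depends on the realised count of the previous generation, which is why the steps cannot be drawn independently of one another.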

The model that I presented in the original post is a simplification to illustrate my difficulty. The actual idea is to use information from sub-samples of the initial and final population to estimate parameters of a slightly more complicated model that I have not included here (namely, selection, which would change the way we calculate p_{t+1}).

Does this help?

tpaixao · November 3, 2021, 5:41pm

I figured out that the problem is this line:

def evolve(prev_p):
   return pm.Binomial.dist(popsize,prev_p).random()/popsize

In particular, theano does not like the use of pm.Binomial.dist()… A deterministic function works fine.

So the question is: how do I get random numbers inside a function that will be used by theano.scan()?

twiecki · November 4, 2021, 6:45pm

I wouldn’t use scan for this, and generating random numbers is also not the right way.

I didn’t look too closely into what you want but I think this idea of generating a Gaussian random-walk might help:

pm.Normal('x', 0, sigma, shape=100).cumsum()

So you can just generate a vector that would be each initialization, and then e.g. sum them up to get a random walk.
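
To make the cumulative-sum idea concrete outside of PyMC3, here is a small plain-Python sketch: draw all the Gaussian steps independently, then take their running total. The values of `sigma`, `n_steps`, and the seed are illustrative assumptions.

```python
import random
from itertools import accumulate

rng = random.Random(1)  # fixed seed (illustrative)
sigma = 0.5             # hypothetical step size
n_steps = 100

# Independent increments, one per step; none depends on an earlier draw.
increments = [rng.gauss(0.0, sigma) for _ in range(n_steps)]

# The running total of the increments is the random walk.
walk = list(accumulate(increments))
print(walk[:5])
```

This works because every Gaussian step has the same distribution regardless of where the walk currently is. In a chain of binomial draws the distribution of each step depends on the previous value, so the same trick does not carry over directly.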

tpaixao · November 4, 2021, 10:50pm

Thanks @twiecki, although I am not sure this would solve it. What I would like is a series of binomial samples for which the p parameter of each draw depends on the previous draw. Because of this, I thought scan() would be the right way of going about it.

How would you go about implementing this?

ricardoV94 · November 5, 2021, 7:25am

This is something that is not well supported in PyMC just yet. In the next major release it will be straightforward to use Scans to specify this kind of model.

tpaixao · November 8, 2021, 10:40am

Thanks @ricardoV94. Looking forward to the new version. In any case, I ended up doing it without scan(), just creating a list of variables in a for loop. It’s probably much slower than the scan solution, as it ends up creating many variables that will be estimated, but it works.

I am leaving it here for reference, as it may help others:

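# Runs inside the model context; p0_ and data come from the rest of the model (not shown)
ps = [pm.Binomial("n0", popsize, p0_, shape=data.shape[0]) / popsize]

for n in range(1, 24):
    prev_p = ps[n - 1]  # frequency from the previous generation
    ps.append(pm.Binomial("n" + str(n), popsize, prev_p, shape=data.shape[0]) / popsize)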

Importantly, you need to specify the shape when you use Binomials like this…

ricardoV94 · November 8, 2021, 11:47am

The issue with that approach is that you can only do a very limited number of steps due to theano recursion limit. You can try to do that loop for 500 steps and you’ll see it (I think the limit is way less than 500).

But if that’s enough for you that’s great!
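
As a rough illustration of why a long unrolled chain can hit a recursion limit, here is a toy sketch in plain Python. It assumes the failure comes from a recursive traversal of a deeply nested expression graph; the `Node` class and the `depth` and `build_chain` helpers are hypothetical stand-ins and not Theano code.

```python
import sys


class Node:
    """Toy expression node: each step wraps the previous one."""

    def __init__(self, parent=None):
        self.parent = parent


def depth(node):
    # Recursive traversal: one Python stack frame per chained step.
    return 1 if node.parent is None else 1 + depth(node.parent)


def build_chain(n_steps):
    node = Node()
    for _ in range(n_steps):
        node = Node(node)
    return node


short_chain = build_chain(24)
long_chain = build_chain(sys.getrecursionlimit() + 100)

print(depth(short_chain))  # 25: the root plus 24 chained steps

try:
    depth(long_chain)
    overflowed = False
except RecursionError:
    overflowed = True
print(overflowed)
```

A chain of a couple dozen steps is traversed without trouble, while one longer than the interpreter's recursion limit fails, which matches the behaviour described above in spirit even though the exact threshold for a real model may differ.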

tpaixao · November 8, 2021, 12:06pm

For now it is enough, but I would certainly like to have an arbitrary number of steps.

Can you think of any other way with the current version of pymc3? Maybe using random streams?
