Sampling with while loop issue - Initialization always the same / resetting variable

I’m trying to implement the simple probabilistic programs from a white paper on probabilistic programming. I ran into an issue with the cases that include a while loop which can run indefinitely; see Examples 3 and 4 on page two (figure two) of the paper above. The code below is for Example 3.

I am not using any underlying dataset; I am solely interested in the kinds of distributions that are produced, and in looking for patterns in the running time (and yes, this particular example could easily be calculated in R).

So here is what I tried:

# both theano functions are probably not needed, but I use them for debugging
@theano.compile.ops.as_op(itypes=[tt.lscalar,tt.lscalar], otypes=[tt.lscalar])
def c_addition(c1,c2):
    print(c1,c2,c1+c2)
    return c1+c2

@theano.compile.ops.as_op(itypes=[tt.lscalar], otypes=[tt.lscalar])
def c_above_0(count):
    return count < 1

with pm.Model() as model:
    c1 = pm.Bernoulli('c1',p=0.5)
    c2 = pm.Bernoulli('c2',p=0.5)
    count = c_addition(c1,c2) # could just run c1+c2, but doing this explicitly for debugging
    z = pm.Deterministic('z', c1+c2)
    iterr = 1
    while c_above_0(count): # should be the same as `while count < 1:`
        count = 0
        c1 = pm.Bernoulli(f'c1{iterr}',p=0.5)
        c2 = pm.Bernoulli(f'c2{iterr}',p=0.5)
        count = c_addition(c1,c2)
        z = pm.Deterministic(f'z{iterr}', c1+c2)
        iterr +=1
    step1 = pm.Metropolis([c1,c2,z])
    trace2 = pm.sample(samples, return_inferencedata=0, step=step1)

Firstly, sorry if this is ugly code. One thing I noticed is that it raised errors when variables shared the same name, so I had to use f-string names. I believe this ties in with the issue below, as I imagine one shouldn’t initialize the same variable several times.

Second (the actual issue): when it comes to sampling, it just runs forever, because the while-loop condition evaluates to true every time. I think this is because both c1 and c2 start at 0 and are reset on every iteration of the while loop; that is at least what it looked like from the print statement.

So the question is: How would I fix this?

You cannot use a Python while loop with PyMC variables. c1 + c2 returns a symbolic expression, and its truth value is always True, i.e. bool(c1 + c2) == True, so the loop condition never becomes false. Theano can express while loops with scan, but unfortunately those won’t work with PyMC variables until at least the next major release (PyMC v4).
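To illustrate why the condition never becomes false: a symbolic expression object records the operation instead of computing it, and since it defines no `__bool__`, Python falls back to default object truthiness, which is always True. A minimal plain-Python stand-in (not actual Theano):

```python
class SymbolicExpr:
    """Minimal stand-in for a symbolic tensor expression.

    Like a Theano variable, it records the operation instead of
    computing it, and defines no __bool__, so Python's default
    object truthiness applies: the object is always truthy.
    """
    def __init__(self, op, args):
        self.op, self.args = op, args

    def __add__(self, other):
        return SymbolicExpr("add", (self, other))

    def __lt__(self, other):
        # The comparison builds *another* symbolic node,
        # it does not return 0 or 1.
        return SymbolicExpr("lt", (self, other))

c1 = SymbolicExpr("bernoulli", ())
c2 = SymbolicExpr("bernoulli", ())
count = c1 + c2
cond = count < 1

print(bool(cond))   # True
print(bool(count))  # also True -- any such object is truthy
```

So `while count < 1:` on symbolic variables loops forever regardless of what values the variables would eventually take.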

You were indeed right that there was something wrong with my condition; I have fixed that now. With the condition fixed, is there no (even hacky) way to handle the rest of the while loop?
Here is the fix for the condition checker (I also gave it a name that matches its functionality):

def c_below_1(count):
    return count.tag.test_value < 1

I’m thinking that maybe one could change the starting value of the Bernoulli inside the loop to a random number, instead of it always being 0. Is that possible? (Just in the sense of changing the start value; I imagine there is no supported way to use while loops, given your first reply.)
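For what it’s worth, the generative process Example 3 describes (redraw two fair coins until at least one is 1) is easy to forward-sample in plain Python, outside of PyMC entirely; a sketch:

```python
import random

def sample_example3(rng=random):
    """Forward-sample the while-loop program: redraw two
    Bernoulli(0.5) variables until their sum is at least 1.
    Returns (z, iterations)."""
    iterations = 0
    while True:
        iterations += 1
        c1 = rng.randint(0, 1)
        c2 = rng.randint(0, 1)
        count = c1 + c2
        if count >= 1:          # loop exits; z = c1 + c2 is 1 or 2
            return count, iterations

draws = [sample_example3()[0] for _ in range(100_000)]
# Conditioned on the loop exiting, the surviving outcomes
# (0,1), (1,0), (1,1) are equally likely, so P(z=1) = 2/3.
print(sum(d == 1 for d in draws) / len(draws))
```

This does not answer how to express the loop inside a PyMC model, but it is a useful reference for what distribution the model should produce.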

There is no way to have PyMC distributions inside scan loops in PyMC3; this will become possible in the next major release, PyMC v4.

I have a prototype notebook here, but as you can see many things are still messy and need to be cleaned up: pymc_aeppl_scan_model.ipynb · GitHub

What you are doing in your example with test values is wrong. That loop will only be evaluated once, when you create the variables, and will not be rerun or updated during sampling.
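The build-time vs. sampling-time distinction can be illustrated in plain Python (a hypothetical sketch, no PyMC involved): any branch taken while constructing the model is baked in, no matter what values are drawn later.

```python
import random

def build_model(test_value):
    # The branch is taken ONCE, here, when the model is built...
    if test_value == 0:
        p = 0.05
    else:
        p = 0.8

    # ...so every later draw uses the branch that was baked in;
    # the condition is never re-evaluated per draw.
    def draw():
        return 1 if random.random() <= p else 0

    draw.p = p  # expose the baked-in probability for inspection
    return draw

model = build_model(test_value=0)
draws = [model() for _ in range(1000)]
print(model.p)  # stays 0.05 regardless of what the draws look like
```

Branching on `.tag.test_value` behaves the same way: it inspects a fixed initial value at graph-construction time, not the sampled values.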

> That loop will only be evaluated once when you create the variables and not be rerun or updated during sampling.

Okay, this disappointed me a lot, because now it looks like I cannot even implement the if-statement sections correctly. Is there no way to fix this either?
Here is an example of regular Python and PyMC3 giving different results:

PyMC3

with pm.Model() as model:
    i = pm.Bernoulli('i',p=0.3)
    d = pm.Bernoulli('d',p=0.4)
    if i.tag.test_value == 0 and d.tag.test_value == 0:
        g = pm.Bernoulli('g',p=0.7)
    elif i.tag.test_value == 0 and d.tag.test_value == 1:
        g = pm.Bernoulli('g',p=0.95)
    elif i.tag.test_value == 1 and d.tag.test_value == 0:
        g = pm.Bernoulli('g',p=0.1)
    else:
        g = pm.Bernoulli('g',p=0.5)

    if i.tag.test_value == 0:
        s = pm.Bernoulli('s',p=0.05)
    else:
        s = pm.Bernoulli('s',p=0.8)

    if g.tag.test_value == 0:
        l = pm.Bernoulli('l',p=0.1)
    else:
        l = pm.Bernoulli('l',p=0.6)

    z = pm.Deterministic('z', l)
    step1 = pm.Metropolis([i,d,g,s,l,z])
    trace2 = pm.sample(samples, return_inferencedata=0, step=step1)

Python 3, random library

from random import uniform

samples = 1000000

si = 0
sd = 0
sg = 0
ss = 0
sl = 0

for _ in range(samples):
    i = 1 if uniform(0,1) <= 0.3 else 0
    d = 1 if uniform(0,1) <= 0.4 else 0

    if (not i) and (not d):
        g = 1 if uniform(0,1) <= 0.7 else 0
    elif (not i) and d:
        g = 1 if uniform(0,1) <= 0.95 else 0
    elif i and (not d):
        g = 1 if uniform(0,1) <= 0.1 else 0
    else:
        g = 1 if uniform(0,1) <= 0.5 else 0

    if not i:
        s = 1 if uniform(0,1) <= 0.05 else 0
    else:
        s = 1 if uniform(0,1) <= 0.8 else 0

    if not g:
        l = 1 if uniform(0,1) <= 0.1 else 0
    else:
        l = 1 if uniform(0,1) <= 0.6 else 0

    si += i
    sd += d
    sg += g
    ss += s
    sl += l

print(si/samples,sd/samples,sg/samples,ss/samples,sl/samples)
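For reference, the marginals the simulation should converge to can be enumerated exactly via the law of total probability, summing the joint over all 2^5 assignments (a pure-Python sketch using the same conditional tables as above):

```python
from itertools import product

# Conditional probability tables from the model above.
p_i, p_d = 0.3, 0.4
p_g = {(0, 0): 0.7, (0, 1): 0.95, (1, 0): 0.1, (1, 1): 0.5}  # P(g=1 | i, d)
p_s = {0: 0.05, 1: 0.8}                                      # P(s=1 | i)
p_l = {0: 0.1, 1: 0.6}                                       # P(l=1 | g)

def bern(p, x):
    """P(X = x) for X ~ Bernoulli(p)."""
    return p if x == 1 else 1 - p

# Accumulate each marginal by summing the joint over all assignments.
marg = {"i": 0.0, "d": 0.0, "g": 0.0, "s": 0.0, "l": 0.0}
for i, d, g, s, l in product((0, 1), repeat=5):
    joint = (bern(p_i, i) * bern(p_d, d) * bern(p_g[(i, d)], g)
             * bern(p_s[i], s) * bern(p_l[g], l))
    for name, val in zip("idgsl", (i, d, g, s, l)):
        marg[name] += val * joint

print(marg)  # e.g. P(s=1) = 0.7*0.05 + 0.3*0.8 = 0.275
```

The Monte Carlo loop above should agree with these values to within sampling noise.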

You can do if/else logic, but you can’t use the usual Python syntax. PyMC builds symbolic Theano graphs that are lazily evaluated (and which can be automatically differentiated).

Theano offers symbolic equivalents of if/else and while/for loops, via theano.tensor.switch (which works like numpy.where) and theano.scan, respectively.
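As a sketch of what the nested branching looks like in that style, here is a plain-Python scalar stand-in for switch (the real theano.tensor.switch / pm.math.switch take tensor conditions and select elementwise, like numpy.where):

```python
def switch(cond, if_true, if_false):
    """Scalar stand-in for theano.tensor.switch: the condition
    selects between two already-available values."""
    return if_true if cond else if_false

def p_g(i, d):
    # Equivalent to the if/elif chain over (i, d): the branching is
    # expressed as data selection, not Python control flow.
    return switch((i == 0) & (d == 0), 0.7,
           switch((i == 0) & (d == 1), 0.95,
           switch((i == 1) & (d == 0), 0.1, 0.5)))

print(p_g(0, 1))  # 0.95
```

In an actual model this would let g be a single variable, something like `pm.Bernoulli('g', p=<nested switch over i and d>)`, whose p is a symbolic expression of i and d, instead of four differently named variables of which only one is ever created.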

This guide might give you a better understanding of this relationship: PyMC3 and Theano — PyMC3 3.11.5 documentation

You can also see some examples in our documentation, as well as Theano’s:

https://theano-pymc.readthedocs.io/en/latest/introduction.html

https://theano-pymc.readthedocs.io/en/latest/tutorial/index.html

You can use PyMC distributions with both of those (switch and scan) but they must be defined outside of the function calls.