Plate notation: shape parameter/for loop

Hi all - thanks for making such a powerful and easy to use tool!

I’m trying to build this model from the following paper: https://sites.google.com/site/falklieder/LiederGoodmanHuys2013.pdf

I’m having difficulty coding up the plate notation for the transition probabilities in particular. There is a separate theta vector (of length s) for each state-action pair drawn from a dirichlet and the transition probabilities are drawn from a categorical distribution (the paper says multinomial, but means multinomial with n=1).

For the transition probabilities, I would ideally have a 3d array with one of the dimensions summing up to one. The documentation for the multinomial states that p can only be a one or two dimensional array, and the documentation for the categorical states that p can be an array, but that the elements are normalised to sum up to one. I’m therefore not sure these would be appropriate.

I have tried to implement in a for loop instead:

import theano as th
import numpy as np
import numpy.random as nr
import pymc3 as pm

nr.seed(0)

N_initstates = 2 # number of initial states
N_actions = 5 
N_futurestates = 2 # number of future states
#Hyperpriors for sigma squared
alphasigma=5
betasigma=10
#Hyperpriors for c
muc=2
varc=4

a=np.array([[1]])
mod = pm.Model()

with mod:
    sigmacroot=pm.InverseGamma('sigmacroot',alphasigma,betasigma)
    sigmasqc=np.square(sigmacroot)
    c=pm.Normal('c',muc,varc)
    alphadraft=pm.Normal('alphadraft',c,sigmasqc,shape=N_initstates)
    alpha=pm.math.exp(-alphadraft)     
    beta=pm.Dirichlet('beta',np.repeat(a,N_initstates),shape=(N_actions,N_initstates))
    dirichletparam=alpha*beta
    
    for currentstate in range(N_initstates):
        theta = [pm.Dirichlet('theta'+str(currentstate), dirichletparam, shape=(N_actions,N_futurestates), testval=None)]
        Transition = [pm.Categorical('transition'+str(currentstate), theta[0], shape=(N_futurestates,N_actions))]

with mod:
    # draw posterior samples
    step=pm.Metropolis()
    numchains=3
    trace = pm.sample(draws=10000,step=step, njobs=numchains)
    
_ = pm.traceplot(trace)
if numchains>1:
    pm.gelman_rubin(trace)
pm.summary(trace)
pm.plot_posterior(trace)

However, when running this, I get a long Theano error stating that IndexError: index 2 is out of bounds for axis 1 with size 2 and have not been able to trace the cause. I think this also relates to a couple of posts on Github (? #535), but after scouring them have not yet managed to solve this.

Any advice on how to vectorise this model with shape or fix the “for loop” would be much appreciated.

Many thanks again.

I have not look into your model carefully, so below is just some quick comments:

  1. I dont think the for-loop is correct here.

Because at the end of the for-loop it will just return the last theta = [] and Transition = []. As a quick fix you can try:

from theano import tensor as tt
with mod:
    ...
    theta = []
    Transition = []
    for i in range(N_initstates):
        theta.append(pm.Dirichlet('theta'+str(i), dirichletparam, shape=(N_actions,N_futurestates), testval=None))
        Transition.append(pm.Categorical('transition'+str(i), theta[0], shape=(N_futurestates,N_actions)))
    theta=tt.stack(theta)
    Transition=tt.stack(Transition)

Also, are you sure it’s theta[0] but not theta[i] above?

You can try to create a flatten two dimensional Transition and reshape it into 3D.

1 Like

Many thanks for the comments - I’ll fix the for loop and try flattening - good idea.
edit: This seems to have removed the error - merci!