Modeling reinforcement learning of human participants using PyMC3

A few quick thoughts:

  1. You should use a softmax instead of using T.repeat to expand the weights to the right shape.
  2. The choice of priors seems a bit odd to me. Is there any reason you are using Uniform(0, 5) for the betas?
  3. I think you want forget and alpha to be in [0, 1], right? The way you currently do it might not give you parameters that satisfy that constraint.
  4. Maybe the theano.scan part could be rewritten into something that doesn't need scan?
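
To make points 1 and 3 a bit more concrete, here is a rough, library-agnostic sketch in plain Python (not your model, and not PyMC3 code). The function names, the two-action setup, and the particular update and forgetting rule are all assumptions on my side, since I am guessing at what your model does. It only shows the two ideas: a softmax turns the action values into choice probabilities directly, and a sigmoid maps an unconstrained number into (0, 1).

```python
import math


def softmax(values, beta=1.0):
    """Turn action values into choice probabilities.

    beta is the inverse temperature (assumed meaning of "betas" here).
    """
    scaled = [beta * v for v in values]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    """Map an unconstrained real number into the open interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def choice_probs(choices, rewards, alpha_raw, forget_raw, beta, n_actions=2):
    """Probability the model assigns to each observed choice.

    alpha_raw and forget_raw are unconstrained; the sigmoid keeps the
    actual learning and forgetting rates inside (0, 1).
    The update below is a hypothetical rule, used only for illustration.
    """
    alpha = sigmoid(alpha_raw)
    forget = sigmoid(forget_raw)
    q = [0.0] * n_actions
    probs = []
    for c, r in zip(choices, rewards):
        probs.append(softmax(q, beta)[c])
        q = [(1.0 - forget) * v for v in q]  # decay all values (assumed rule)
        q[c] += alpha * (r - q[c])           # delta-rule update for the chosen action
    return probs


if __name__ == "__main__":
    print(choice_probs([0, 0, 1], [1, 1, 0], alpha_raw=0.0, forget_raw=-2.0, beta=3.0))
```

In PyMC3 itself you would typically get the same constraint either by putting a Beta prior on alpha and forget, or by sampling an unconstrained variable and passing it through `pm.math.sigmoid`; which one fits better depends on your model.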