Hi,
I’m trying to adapt the reinforcement learning model from the tutorial on the PyMC website so that it can be used for multiple subjects at the same time.
My inputs are an outcomes matrix of size [n_trials x n_subj x 4] (the bandit in this specific setup has 4 arms) and a matrix of the agents' choices of size [n_trials x n_subj].
def update_belief(choices, outcomes, belief, choice_probs,
                  n_subj, alpha, beta):
    # choices [n_trials x n_subj]
    # outcomes [n_trials x n_subj x 4]
    # belief [n_subj x 4]
    # choice_probs [n_subj x 1]
    c = (choices) * 2  # chosen arm index [0,2]
    nc = (1 - choices) * 2  # not chosen arm index [0,2]
    all_subjs = pt.arange(n_subj)
    choice_probs = pt.set_subtensor(
        choice_probs[all_subjs, 0],
        pt.exp(beta[all_subjs] * (belief[all_subjs, c] - belief[all_subjs, c+1]))
        / (pt.exp(beta[all_subjs] * (belief[all_subjs, c] - belief[all_subjs, c+1]))
           + pt.exp(beta[all_subjs] * (belief[all_subjs, nc] - belief[all_subjs, nc+1]))))
    # update beliefs of the chosen arm
    belief = pt.set_subtensor(
        belief[all_subjs, c:c+2],
        # Error here
        # belief[n_subj x 2] + alpha[n_subj] * (outcomes[n_subj x 2] - belief[n_subj x 2])
        belief[all_subjs, c:c+2] + alpha[all_subjs] * (outcomes[all_subjs, c:c+2] - belief[all_subjs, c:c+2]))
    # do not update beliefs of the NOT chosen arm
    belief = pt.set_subtensor(
        belief[all_subjs, nc:nc+2],
        belief[all_subjs, nc:nc+2])
    return belief, choice_probs

choices_ = pt.as_tensor_variable(choices, dtype='int32')  # [n_trials x n_subj]
outcomes_ = pt.as_tensor_variable(outcomes, dtype='int32')  # [n_trials x n_subj x 4]
lr = pt.scalar('lr')
bt = pt.scalar('bt')
alpha = lr * pt.ones(shape=n_subj)
beta = bt * pt.ones(shape=n_subj)
beliefs = 0.5 * pt.ones((n_subj, 4), dtype='float64')  # [n_subj x 4]
choice_probs_ = 0.5 * pt.ones((n_subj, 1), dtype='float64')  # [n_subj x 1]

[beliefs_pymc, choice_probs_pymc], updates = scan(
    fn=update_belief,
    sequences=[choices_, outcomes_],
    non_sequences=[n_subj, alpha, beta],
    outputs_info=[beliefs, choice_probs_]
)

pytensor_llik_td = pt.pytensor.function(
    inputs=[lr, bt], outputs=[beliefs_pymc, choice_probs_pymc], on_unused_input="ignore"
)

When I try to run this code I get the following error:
ValueError: Tensor of type Vector(int32, shape=(10,)) could not be cast to have 0 dimensions
It is raised at the first belief = pt.set_subtensor(...) call in update_belief(..). Here 10 is the n_subj I’m using.
I don’t understand why the shapes are not matching; the subtensor I’m trying to set should have shape [n_subj x 2].
Also, is there a better way to fill a vector of length n_subj with the inputs to pytensor.function?
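
A possible reading of the error, for anyone who hits the same message: on each scan step the `choices` sequence presumably arrives as a vector of length n_subj, so `c` is a vector too, while a slice such as `c:c+2` needs scalar bounds. The sketch below shows the same per-subject update for a single trial in plain NumPy (not PyTensor), replacing the slice with an explicit [n_subj x 2] array of column indices. The names `choices_t`, `outcomes_t`, `rows`, `cols` and the toy values are assumptions made for illustration, and the final note on PyTensor is untested.

```python
import numpy as np

n_subj = 3                                   # toy size, chosen for illustration
belief = np.full((n_subj, 4), 0.5)           # [n_subj x 4]
alpha = np.full(n_subj, 0.1)                 # [n_subj], one learning rate per subject

# Hypothetical data for ONE trial, i.e. what a single scan step would see.
choices_t = np.array([0, 1, 0])              # [n_subj]
outcomes_t = np.array([[1, 0, 0, 0],
                       [0, 0, 1, 1],
                       [0, 1, 0, 0]])        # [n_subj x 4]

c = choices_t * 2                            # first column of the chosen pair, per subject

# Build explicit index arrays instead of the slice c:c+2.
rows = np.arange(n_subj)[:, None]            # [n_subj x 1]
cols = c[:, None] + np.arange(2)             # [n_subj x 2]

chosen = belief[rows, cols]                  # [n_subj x 2]
# alpha needs a trailing axis to broadcast against [n_subj x 2].
updated = chosen + alpha[:, None] * (outcomes_t[rows, cols] - chosen)

new_belief = belief.copy()
new_belief[rows, cols] = updated             # columns of the non-chosen pair stay untouched
print(new_belief)
```

If PyTensor follows NumPy's advanced-indexing rules here, the analogous call would be something like `pt.set_subtensor(belief[rows, cols], updated)`, which would also make the separate "do not update" step unnecessary.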