Using scan to loop across vectors of different lengths

Continuing my latest thread, I thought this could also work as a general question (here's the previous discussion: https://discourse.pymc.io/t/reinforcement-learning-help-building-a-model/8442).

In short, I built an RL model that receives vectors of stimuli and responses and calculates an expected value per trial.
Originally, all subjects had the same number of trials, so I was able to feed a single matrix into the scan loop.
The original scan loop can be found in the linked post.
My problem:
Some subjects now have different trial numbers, so I am unable to build a single trials X subjects matrix to send into the scan loop.

To make the problem as clear as possible:
Let's say I had 10 trials per subject and 5 subjects. I built a 10 X 5 matrix and fed it to the scan loop.
Currently, some subjects have 10 trials, but others have only 9.

How can I solve this issue? Maybe by running two consecutive scan loops?

Is it reasonable to split into multiple scans (assuming you only have a handful of different matrix sizes)? One scan for the 10-trial subjects, and another for the 9-trial subjects.

Otherwise you could perhaps pad the shorter sequences with zeros (or a value that does not affect the computation, given whatever you do afterwards).
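For the padding route, a minimal sketch (plain numpy, done before building the model; pad_to_matrix is just an illustrative helper, and the right pad value depends on what the downstream computation ignores):

import numpy as np

def pad_to_matrix(sequences, pad_value=0.0):
    # stack per-subject 1-D trial arrays into a (max_trials, n_subj) matrix,
    # padding shorter subjects at the end with pad_value
    max_len = max(len(s) for s in sequences)
    out = np.full((max_len, len(sequences)), pad_value)
    for j, s in enumerate(sequences):
        out[:len(s), j] = s
    return out

# e.g. three subjects with 10, 9 and 10 trials
stim_padded = pad_to_matrix([np.ones(10), np.ones(9), np.ones(10)])
print(stim_padded.shape)  # (10, 3)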


Thanks for the reply, Ricardo.
Is it reasonable to use multiple scans within the same hierarchical model?
I was thinking of using NaNs, but wasn't sure it would work properly.

Depends on what you are doing exactly

Eventually, I am trying to estimate the learning rate (a latent variable) based on the stimulus presentations, electric shocks, and the outcome measure (skin conductance).
Let's say I have 3 sets of trial lengths (69, 67, 65). I build a matrix for each, but then how do I loop through them and send each to the appropriate scan loop? Via switch?

Since the set dimensions are known in advance you can just use python if statements, and then concatenate the learning rates from the 3 scans for whatever you need downstream.
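For example, a rough sketch of the one-scan-per-group idea (stim_sets, shock_sets, alphas, and n_subjs here are illustrative Python lists with one entry per trial-length group, and update_Q is the step function from the original post):

# (inside the same `with pm.Model()` block, with theano / tt imported as in the model code)
vecs = []
for stim_g, shock_g, alpha_g, n_subj_g in zip(stim_sets, shock_sets, alphas, n_subjs):
    Qs_g = 0.5 * tt.ones((n_subj_g, 2), dtype='float64')
    vec_g = 0.5 * tt.ones((n_subj_g, 1), dtype='float64')
    [Qs_g, vec_g, pe_g], _ = theano.scan(
        fn=update_Q,
        sequences=[stim_g, shock_g],
        outputs_info=[Qs_g, vec_g, None],
        non_sequences=[alpha_g, n_subj_g])
    vecs.append(vec_g)
# downstream: index / concatenate the entries of vecs as needed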

Thanks for the help.
Let's assume this is the original model:

with pm.Model() as m:

    alpha = pm.Beta('alpha', 1, 1, shape=n_subj)
    beta = pm.Normal('beta', 0, 1, shape=n_subj)
    eps = pm.HalfNormal('eps', 5)

    Qs = 0.5 * tt.ones((n_subj, 2), dtype='float64')   # starting values for both stimuli (CS+, CS-)
    vec = 0.5 * tt.ones((n_subj, 1), dtype='float64')  # vector to save the relevant stimulus's expectation

    [Qs, vec, pe], updates = theano.scan(
        fn=update_Q,
        sequences=[stim, shock],
        outputs_info=[Qs, vec, None],
        non_sequences=[alpha, n_subj])

    vec_ = vec[trials, subj, 0] * beta[subj]

    scrs = pm.Normal('scrs', mu=vec_, sd=eps, observed=scrMat)

    # add matrix of expected values (trials X subjects)
    ev = pm.Deterministic('expected_value', vec_)
    # add PE
    pe = pm.Deterministic('pe', pe)

    tr = pm.sample(target_accept=.9, chains=4, cores=10, return_inferencedata=True)

The function receives matrices of stim and shock (nTrials X nSubjects).
Later on, I use a similar structure for scrMat (the observed variable to fit: SCR, also nTrials X nSubjects).
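(The actual update_Q is defined in the linked post; as a rough illustration of the kind of delta-rule update such a function might perform, with the argument order dictated by the scan call above:)

def update_Q(stim, shock, Qs, vec, alpha, n_subj):
    # stim: stimulus shown to each subject on this trial (index 0 = CS-, 1 = CS+)
    # shock: outcome for each subject on this trial (0 / 1)
    # Qs: (n_subj, 2) expected values; vec: (n_subj, 1) value of the shown stimulus
    rows = tt.arange(n_subj)
    pe = shock - Qs[rows, stim]                       # prediction error
    Qs = tt.set_subtensor(Qs[rows, stim], Qs[rows, stim] + alpha * pe)
    vec = tt.set_subtensor(vec[:, 0], Qs[rows, stim])
    return Qs, vec, pe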

So, as long as the matrices have the same size, I can send each of the three matrices to the same scan, just at different times. Next, I need to fit everything with the likelihood function. Can I also split that into three? Wouldn't that be as if I just ran each set of subjects separately?

It is the same whether you split the likelihood into 3 terms or join the 3 vec_ outputs and then pass them to a single likelihood.
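For instance, the single-likelihood version could look roughly like this (vec1_/vec2_/vec3_ and scrMat1/scrMat2/scrMat3 stand in for the per-group expected values and observed SCR matrices; flattening sidesteps the unequal shapes):

mu_all = tt.concatenate([vec1_.flatten(), vec2_.flatten(), vec3_.flatten()])
obs_all = np.concatenate([scrMat1.flatten(), scrMat2.flatten(), scrMat3.flatten()])
scrs = pm.Normal('scrs', mu=mu_all, sd=eps, observed=obs_all)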

Are you asking if the results will be the same as when you fit the model 3 times, each with a subset of the data?

Yes. I just want to make sure such a model is not equivalent to running three different models.

If there are shared hyperparameters, it's not the same as running them individually. If there aren't shared hyperparameters, it wouldn't make a difference whether you run them together or separately anyway.


OK. I have run the following code:

with pm.Model() as m5:

    # α
    phi = pm.Uniform("phi", lower=0.0, upper=1.0)
    kappa_log = pm.Exponential("kappa_log", lam=1.5)
    kappa = pm.Deterministic("kappa", tt.exp(kappa_log))
    alpha = pm.Beta("alpha", alpha=phi * kappa, beta=(1.0 - phi) * kappa, shape=n_subj)
    alpha2 = pm.Beta("alpha2", alpha=phi * kappa, beta=(1.0 - phi) * kappa, shape=n_subj2)

    # β
    beta_h = pm.Normal('beta_h', 0, 1)
    beta_sd = pm.HalfNormal('beta_sd', 5)
    beta = pm.Normal('beta', beta_h, beta_sd, shape=n_subj)
    beta2 = pm.Normal('beta2', beta_h, beta_sd, shape=n_subj2)

    eps = pm.HalfNormal('eps', 5, shape=2)

    # first dataset (69 trials)
    Qs = 0.5 * tt.ones((n_subj, 2), dtype='float64')   # starting values for both stimuli (CS+, CS-)
    vec = 0.5 * tt.ones((n_subj, 1), dtype='float64')  # vector to save the relevant stimulus's expectation

    [Qs, vec, pe], updates = theano.scan(
        fn=update_Q,
        sequences=[stim, shock],
        outputs_info=[Qs, vec, None],
        non_sequences=[alpha, n_subj])

    vec_ = vec[trials, subj, 0] * beta[subj]
    # add matrix of expected values (trials X subjects)
    ev = pm.Deterministic('expected_value', vec_)
    # add PE
    pe = pm.Deterministic('pe', pe)

    scrs = pm.Normal('scrs', vec_, eps[0], observed=scrMat)

    # second dataset (65 trials)
    Qs2 = 0.5 * tt.ones((n_subj2, 2), dtype='float64')   # starting values for both stimuli (CS+, CS-)
    vec2 = 0.5 * tt.ones((n_subj2, 1), dtype='float64')  # vector to save the relevant stimulus's expectation

    [Qs2, vec2, pe2], updates2 = theano.scan(
        fn=update_Q,
        sequences=[stim2, shock2],
        outputs_info=[Qs2, vec2, None],
        non_sequences=[alpha2, n_subj2])

    vec2_ = vec[trials2, subj2, 0] * beta[subj2]
    # add matrix of expected values (trials X subjects)
    ev2 = pm.Deterministic('expected_value2', vec2_)
    # add PE
    pe2 = pm.Deterministic('pe2', pe2)

    scrs2 = pm.Normal('scrs2', vec2_, eps[1], observed=scrMat2)

    trH_phi = pm.sample(target_accept=.9, chains=4, cores=10, return_inferencedata=True)

Basically, I have created two different matrices and two likelihood functions, all under the same hyperparameters. The resulting trace plots don't look so good. The alpha of the original dataset (89 subjects with 69 trials) looks OK. The one for the second dataset (8 subjects, 65 trials) looks flat (the lower 95% HDI is at zero, the upper is almost 0.7).

Any ideas what I am doing wrong?

Code looks good to me at a glance.

What do you mean by the second alpha looking flat? You have a total of 97 alphas, right? How are you plotting / checking them?

I have 86 subjects under alpha / beta and another 8 under alpha2 / beta2.
Using plot_trace, it looks like this:

So, alpha looks relatively reasonable: some subjects have more data and others less (hence the broader distributions). alpha2, however, doesn't look right.
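(A call along these lines produces that kind of plot; the var_names selection here is illustrative:)

import arviz as az
az.plot_trace(trH_phi, var_names=["alpha", "alpha2"])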

Barring a bug somewhere, perhaps your new data / trials are just not as informative as the old ones?

Have you tried running the model just with the second dataset to confirm the individual fits follow the same pattern?

Oh well, the issue with posting code on the forum is that everyone can see the big errors you've made in it :slight_smile:

I have noticed a "bug".
On the second scan I output vec2, but then I use the following line:

vec2_ = vec[trials2, subj2,0] * beta[subj2]

which explains why the model can't find a reasonable alpha2. After correcting that issue, everything seems to work well.
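(The corrected line presumably reads as below; using beta2 for the second group is an assumption here, since only the vec/vec2 mix-up is mentioned above:)

vec2_ = vec2[trials2, subj2, 0] * beta2[subj2]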
Thank you (again) for the great help you provide. It is almost impossible to do these things without this forum.
