Only now did I realize what the article's model is about.
It’s modelling the change in Bayesian belief from trial to trial, assuming some latent dynamics that are being learned online. It’s not a conventional HMM. That’s why things move slowly (with some bursts of change) in those plots from the optlearner class.
That’s also why k is allowed to vary over trials: what is being modeled is the posterior belief about k at each trial, not k itself, which would otherwise be a fixed parameter over the course of the experiment.
To obtain equivalent results one would need to plot the posterior for the model fit to only the first trial, then to the first two trials, then the first three, and so on up to the last fit, which includes all the trials. Alternatively, one can marginalize over these updates, as is done in the paper and in the optlearner class.
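To make the "first trial, first two trials, ..." idea concrete, here is a minimal sketch. It uses a Beta-Bernoulli model as a stand-in, which is an assumption made purely for illustration: it is not the model from the paper and it does not use the optlearner class. The function names prefix_posteriors and online_posteriors are hypothetical. The point is only that refitting on each growing prefix of the data gives the same sequence of posteriors as updating the belief online, trial by trial.

```python
import numpy as np


def prefix_posteriors(outcomes, a0=1.0, b0=1.0):
    """Refit a Beta-Bernoulli model from scratch on the first t trials, t = 1..T.

    Returns an array of shape (T, 2) holding the Beta(a, b) posterior
    parameters after each prefix. Illustrative stand-in model only.
    """
    outcomes = np.asarray(outcomes)
    params = []
    for t in range(1, len(outcomes) + 1):
        seen = outcomes[:t]                 # data available up to trial t
        successes = seen.sum()
        params.append((a0 + successes, b0 + t - successes))
    return np.array(params, dtype=float)


def online_posteriors(outcomes, a0=1.0, b0=1.0):
    """Update the same Beta posterior one trial at a time."""
    a, b = a0, b0
    params = []
    for y in outcomes:
        a += y                              # count a success
        b += 1 - y                          # count a failure
        params.append((a, b))
    return np.array(params, dtype=float)


# Hypothetical binary outcomes, one per trial.
outcomes = [1, 1, 0, 1, 0, 0, 1, 1]

refit = prefix_posteriors(outcomes)
online = online_posteriors(outcomes)

# Posterior mean of the success probability after each trial:
# this is the trial-by-trial "belief" trajectory one would plot.
posterior_mean = refit[:, 0] / refit.sum(axis=1)
```

In this toy model the parameter is fixed, so the belief trajectory just converges. In a model with latent dynamics, the same prefix-by-prefix construction would presumably produce the slow drifts and occasional bursts described above.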