Hi all,
I’m trying to implement some of the models from Farell and Lewandowsky (2018). I’m up to the last Bayesian hierarchical model example in Chapter 9, which describes a model of temporal discounting given the value and delay of options A and B. However I’m having some difficulties translating the nested for-loops in the JAGS code into PyMC3 code.
The Model
We start with a formula for the subjective discounting function:
V^B = B \times \frac{1}{1+kD}
Where V^B is the ‘present subjective value’ for the delayed reward B (e.g. an amount of money), and D is the delay of this reward (e.g. a time in months). The parameter k determines the steepness of the function.
Once the competing amounts are discounted according to this equation, the participant chooses between V^A and V^B according to:
P(\text{choose }V^B) = \Phi \Bigg( \frac{V^B - V^A}{\alpha} \Bigg)
Where \alpha determines the ‘sharpness’ of the decision, and \Phi is the normal CDF. This equation determines the probability of the observed choice (either option A or B, given the discounting).
We want to estimate \alpha and k parameters that are subject-specific, but are assigned (bounded) normal distribution hyperpriors. The V^A and V^B clearly depend upon both the size and delay of the reward.
The JAGS code
The JAGS code makes this model work by a nested for-loop, where the participant-level parameters are estimated, and then the probability of the choice for each trial (total number of trials denoted as T) has an estimated V^A and V^B.
model{
# Group-level hyperpriors
# k (steepness of hyperbolic discounting)
groupkmu ~ dnorm(0, 1/100)
groupksigma ~ dunif(0, 100)
# comparison acuity (alpha)
groupALPHAmu ~ dunif(0,5)
groupALPHAsigma ~ dunif(0,5)
# Participant-level parameters
for (p in 1:nsubj){
k[p] ~ dnorm(groupkmu, 1/(groupksigma^2)) T(0,)
alpha[p] ~ dnorm(groupALPHAmu, 1/(groupALPHAsigma^2)) T(0,)
for (t in 1:T) {
# calculate present subjective value for each reward
VA[p,t] <- A[p,t] / (1+k[p]*DA[p,t])
VB[p,t] <- B[p,t] / (1+k[p]*DB[p,t])
# Psychometric function yields predicted choice
P[p,t] <- phi( (VB[p,t]-VA[p,t]) / alpha[p] )
# Observed responses
R[p,t] ~ dbern(P[p,t])
}
}
}
My PyMC3 Code
To replace the for-loops, I’ve tried separating the data into n_{\text{participants}} by n_{\text{trials}} matrices, and using broadcasting on the estimated subject-specific a and k parameters to write the equations in the model as-is. The issues is that I’m getting ‘bad initial energy’ warnings and the model won’t run.
import numpy as np
import pymc3 as pm
# Import data
dat = pd.read_csv("https://raw.githubusercontent.com/psy-farrell/computational-modelling/master/codeFromBook/Chapter9/hierarchicalITC.dat", sep = "\t")
# Experiment info
n_subj = len(dat['subj'].unique())
n_trials = int(dat.shape[0]/n_subj)
# Convenience lambda to reshape the data
reshape = lambda var: dat[var].values.reshape((n_subj, n_trials))
# Extract variables in n * t matrices
A, DA = reshape('A'), reshape('DA')
B, DB = reshape('B'), reshape('DB')
R = reshape('R')
def Phi(x):
#'Cumulative distribution function for the standard normal distribution'
# A.k.a the probit transform
return(0.5 + 0.5 * pm.math.erf(x/pm.math.sqrt(2)))
with pm.Model() as Discounting:
# Priors for Hyperbolic Discounting Parameter K
k_mu = pm.Normal('K_mu', mu = 0, sigma = 10)
k_sigma = pm.Uniform('K_sigma', lower = 0, upper = 10)
# Priors for Comparison Acuity (alpha)
alpha_mu = pm.Uniform('Alpha_mu', lower = 0, upper = 5)
alpha_sigma = pm.Uniform('Alpha_sigma', lower = 0, upper = 5)
# Participant Parameters
BoundedNormal = pm.Bound(pm.Normal, lower = 0.0)
k = BoundedNormal('k', mu = k_mu, sigma = k_sigma, shape = n_subj)
alpha = BoundedNormal('alpha', mu = alpha_mu, sigma = alpha_sigma, shape = n_subj)
# Subjective Values
V_A = pm.Deterministic("V_A", A / (1 + k[:, None] * DA))
V_B = pm.Deterministic("V_B", B / (1 + k[:, None] * DB))
# Psychometric function
P = pm.Deterministic('P', Phi((V_B - V_A)/alpha[:, None]))
# Observed
h = pm.Bernoulli('h', p = P, observed = R)
# Sample
Discounting_trace = pm.sample(5000, init = 'adapt_diag', chains = 4)
Any thoughts on how I can properly estimate this?
Thanks!