I did find one mistake in the code, but even with that correction the estimates (especially the correlations) are off: the first rho should be around -0.3, not +0.4. I can get the “correct” estimates from brms or rstanarm, so I must be doing something wrong in the PyMC3 code?
import numpy as np
import pymc3 as pm
import theano.tensor as tt

id_idx = wages.new_id.values  # array of the group IDs
N_ids = len(wages.id.unique())  # number of unique group IDs
dim_corr = 3  # dimension of the correlation matrix (intercept, slope for exper, slope for hgc.9)
with pm.Model() as m_wages:
    # LKJCholeskyCov pulls the needed number of standard deviations from sd_dist
    sd_dist = pm.HalfCauchy.dist(beta=2)
    # eta sets how informative the prior on the correlations is (figure 13.3 of Rethinking);
    # n is the dimension of the covariance matrix (here 3 x 3)
    packed_chol = pm.LKJCholeskyCov('chol_cov', eta=1, n=dim_corr, sd_dist=sd_dist)
    # n is the number of rows of the triangular matrix (the dimension of the corr matrix above);
    # lower=True: treat the packed values as a lower triangular matrix
    chol = pm.expand_packed_triangular(n=dim_corr, packed=packed_chol, lower=True)
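    # if I read the expand_packed_triangular docstring right, the packed vector stores
    # the lower triangle row by row, i.e. [L00, L10, L11, L20, L21, L22] for n=3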
    # the Cholesky decomposition only gives L, where S = L L.T, so this rebuilds the full covariance matrix
    cov = tt.dot(chol, chol.T)
    # --- extract the standard deviations and rhos ---
    # the diagonal of the covariance matrix holds the variances,
    # so their square roots are the standard deviations
    sigma_id = pm.Deterministic('sigma_id', tt.sqrt(tt.diag(cov)))
    # these are the variance components (intercept and the 2 slopes)
    variance_components = pm.Deterministic('variance_components', tt.power(sigma_id, 2))
    # convert the covariance matrix to a correlation matrix: diag(sigma)^-1 . S . diag(sigma)^-1
    corr = tt.diag(sigma_id**-1).dot(cov.dot(tt.diag(sigma_id**-1)))
    # dim_corr says how many indices to pull out of the upper triangle
    r = pm.Deterministic('Rho', corr[np.triu_indices(dim_corr, k=1)])
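    # with dim_corr=3, np.triu_indices(3, k=1) walks the upper triangle as
    # (0,1), (0,2), (1,2), so Rho[0] = corr(intercept, exper),
    # Rho[1] = corr(intercept, hgc.9), Rho[2] = corr(exper, hgc.9)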
    # per the docstring, either chol or the covariance matrix can be passed directly
    # this is the distribution of varying effects: intercepts and slopes for all observations in the data
    ab_id = pm.MvNormal('ab_id', mu=0, chol=chol, shape=(N_ids, dim_corr))  # population of varying effects
    # shape is (N_ids, 3) because each group gets back an intercept and two slopes
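    # one alternative I could try here: a non-centered parameterization
    # (I believe brms uses one by default), sketched as:
    #   z = pm.Normal('z', 0., 1., shape=(dim_corr, N_ids))
    #   ab_id = pm.Deterministic('ab_id', tt.dot(chol, z).T)  # same shape (N_ids, dim_corr)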
    # the grand intercept and the 2 other fixed effects,
    # assumed independent of the varying effects
    fixedEffects = pm.Normal('fixedEffects', 0., 10., shape=3)
    # linear model; the order here dictates the order in the covariance matrix:
    #              intercept   exper   hgc.9
    #   intercept
    #   exper
    #   hgc.9
    A = fixedEffects[0] + ab_id[id_idx, 0]  # this is pi_0i in the Singer/Willett book (fixed intercept + random intercepts)
    BE = fixedEffects[1] + ab_id[id_idx, 1]  # this is pi_1i
    BH = fixedEffects[2] + ab_id[id_idx, 2]  # this is pi_2i
    mu = A + BE * wages['exper'].values + BH * wages['hgc.9'].values  # linear predictor
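    # putting it together: lnw = pi_0i + pi_1i * exper + pi_2i * hgc.9 + epsilon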
    sd = pm.HalfCauchy('sigma', beta=2)  # prior for the within-id stddev (the error term sigma_epsilon in Singer/Willett)
    resid_var = pm.Deterministic('resid_var', tt.power(sd, 2))
    lnw = pm.Normal('lnw', mu=mu, sd=sd, observed=wages['lnw'])  # likelihood
    trace_wages = pm.sample(10000, tune=10000)
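After sampling I check the estimates and the divergences like this (a rough sketch; on older PyMC3 versions summary takes varnames instead of var_names):

print(pm.summary(trace_wages, var_names=['Rho', 'sigma_id', 'fixedEffects']))  # correlations, sds, fixed effects
print('divergences:', trace_wages.get_sampler_stats('diverging').sum())  # divergent transitions can bias the rho estimates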
