Constructing the random effects model matrix Z

I am trying to replicate the sleep deprivation study found in the R package LME4 (for fitting linear mixed effects models) and described in this paper. The dataset sleepstudy contains

the average reaction time per day for subjects in a sleep deprivation study. On day 0 the subjects had their normal amount of sleep. Starting that night they were restricted to 3 hours of sleep per night. The observations represent the average reaction time on a series of tests given each day to each subject.

What I want to to is to fit a random effects model on intercept and slope for each subject. And I would like to use the matrix parametrization. That is, X is the design matrix for the fixed effects given by the formula Reaction ~ 1 + Days, and Z is the design matrix for the random effects. However, I have some doubts on how to design such matrix Z to get intercept’s and slope’s coefficients.

What I did is to simply create a design matrix from the formula Reaction ~ 0 + Subject + Subject:Days. Is this the right way? On one hand the results I get seems reasonable (see notebook). On the other hand, in the paper describing LME4 (link) at page 9 they use a more complex method based on Khatri-Rao and Kronecker product. Am I missing something?

Nice notebook. What you are doing is correct. Sometimes (e.g. in brms which based on Stan), ppl use an indexing method for the random effects. Khatri-Rao with Kronecker product is more for when you have a random intercept and slope model, but you can construct by hand similarly - the aim is to create a sparse matrix Z where there are blocks of values on the diagonal.

In case you have not seen it, I have this repository doing mixed effect model with varies libraries and parameterization: https://github.com/junpenglao/GLMM-in-Python [EDIT] Just realized I already reply to you about this repo on a similar topic before :sweat_smile:

Just to get it straight. As I want to have a random effect on intercept and slope, either I construct the matrix Z as Reaction ~ 0 + Subject + Subject:Days (as I did), or I construct the same matrix with the fancy Khatri-Rao method. The two approaches should be equivalent right? (I will verify this myself later when I am back from work).

No problem :wink: But it seems you do not have an example with random effects on slopes, or do you?

No I do not.

What you are doing in your notebook is correct. Also, I agree with your approach separating the random intercept and slope instead of multiplying a big matrix - you have much more controls on the prior.

Speaking on which. I made a small mistake and I set the same sigma_Z for both intercept and slope of the random effect. Now, instead, I declared sigma_Z_intercept and sigma_Z_slope. The predictions are way better. Before the slope was not allow to vary, hence the prediction for subject 309 (for example) was off. I updated the notebook. I should clean it up and post it somewhere (sometime :slight_smile: )

2 Likes