Nice notebook. What you are doing is correct. Sometimes (e.g. in brms which based on Stan), ppl use an indexing method for the random effects. Khatri-Rao with Kronecker product is more for when you have a random intercept and slope model, but you can construct by hand similarly - the aim is to create a sparse matrix Z where there are blocks of values on the diagonal.
In case you have not seen it, I have this repository doing mixed effect model with varies libraries and parameterization: https://github.com/junpenglao/GLMM-in-Python [EDIT] Just realized I already reply to you about this repo on a similar topic before 