Clustered Error Modeling

That is close, but I’d like all the error terms to come from the same learned distribution. If I apply your model to my example of 10 templates and 100 instances, I think I would end up with 100 draws from 10 different learned distributions for e(j), rather than 10 draws from one learned distribution, with each draw reused every time its template is referenced.
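To make the structure I have in mind concrete, here is a minimal simulation sketch. It assumes the shared distribution is a zero-mean normal with a single scale `sigma` (standing in for whatever the model would actually learn), and the function and variable names are hypothetical, chosen only for illustration. It shows the generative structure only, not how to fit it.

```python
import numpy as np


def simulate_shared_template_errors(n_templates=10, n_instances=100,
                                    sigma=1.0, seed=0):
    """Draw one error per template from a single shared distribution,
    then reuse that draw for every instance of the template.

    Assumption: the shared distribution is Normal(0, sigma); in a real
    model sigma would be learned rather than fixed.
    """
    rng = np.random.default_rng(seed)

    # 10 draws from ONE distribution: e[j] is the error for template j.
    e = rng.normal(loc=0.0, scale=sigma, size=n_templates)

    # Each of the 100 instances references one template (random here).
    template_of = rng.integers(0, n_templates, size=n_instances)

    # The instance-level error is looked up, not redrawn.
    instance_error = e[template_of]
    return e, template_of, instance_error


e, template_of, instance_error = simulate_shared_template_errors()
print(e.shape, instance_error.shape, np.unique(instance_error).size)
```

The key point is that `instance_error` has 100 entries but at most 10 distinct values, because every instance of template j shares the same e(j).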

Does that make sense? Is this possible?