I’ll try to explain one of the reasons by comparing it to what many machine learning algorithms call regularization. There are many situations in ML with a LOT of parameters. To prevent overfitting, and to help find the most relevant parameters while automatically discarding the irrelevant ones, a regularization term is added to the training loss function. Common examples are L1 and L2 regularization, which essentially penalize large parameter values. This lets the fitting procedure push irrelevant parameters toward zero and only move them away from zero if they are truly important during training.
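Just to make the idea concrete, here is a minimal sketch (my own illustrative example, not from any particular library) of a least-squares loss with an added L2 penalty; the function name and the `lam` strength parameter are hypothetical:

```python
import numpy as np

# Minimal sketch: least-squares loss for linear regression with an
# L2 penalty that discourages large weights (lam is the strength).
def l2_regularized_loss(w, X, y, lam=1.0):
    residuals = X @ w - y
    data_loss = 0.5 * np.sum(residuals ** 2)   # ordinary training loss
    penalty = 0.5 * lam * np.sum(w ** 2)       # L2 regularization term
    return data_loss + penalty

# L1 regularization would instead add lam * np.sum(np.abs(w)),
# which tends to push irrelevant weights exactly to zero.
```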
These added regularizations are equivalent to placing a Laplace (L1) or normal (L2) prior distribution on the parameters instead of a flat prior, so non-flat priors regularize automatically. The main difference between regularization and non-flat priors, in my opinion, is that in Bayesian inference you usually don’t look for the single most likely set of parameters (the MAP estimate) but sample from the full posterior distribution, so the results differ from standard ML.
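To see the equivalence numerically, here is a short sketch (assuming a zero-mean normal prior with an illustrative scale `sigma`): minus the log of that prior is exactly a quadratic L2 penalty plus a constant, which is why MAP estimation with a normal prior matches L2-regularized fitting.

```python
import numpy as np
from scipy.stats import norm

sigma = 2.0                       # prior scale, chosen for illustration
w = np.linspace(-3, 3, 7)         # a few example weight values

neg_log_prior = -norm.logpdf(w, loc=0.0, scale=sigma)
l2_penalty = 0.5 * w**2 / sigma**2

# The two differ only by the constant log(sigma * sqrt(2*pi)),
# which does not affect where the minimum is.
print(np.allclose(neg_log_prior - l2_penalty,
                  np.log(sigma * np.sqrt(2 * np.pi))))   # True
```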