I am struggling to understand this distribution well enough to explain to my supervisor both what it is and why it is of value. Here is my understanding; please tell me where my reasoning is off:

-this creates a tensor of normal distributions with mean 0 and standard deviation sigma, except that n of those distributions have a sigma of zero

-what this allows one to do is run a model without having to drop any categorical variables, because the last variable will converge to zero. If this is so, should there be a step where you add an intercept at the end of your covariate matrix?

-what I am scratching my head over is how this works and why this is different from just putting all your binary variables into a model and running that.
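
To make that last point concrete, here is a minimal sketch of what I think happens when you "just put all your binary variables into a model" alongside an intercept. It is plain Python with toy data I made up; the names `levels`, `observations`, `design_row`, and `predict` are mine and not from any library. The reading of "sigma of zero" as "that effect is pinned at zero" is my assumption about the distribution, not something I have confirmed.

```python
# Hypothetical toy data: one categorical variable with three levels.
levels = ["a", "b", "c"]
observations = ["a", "b", "c", "a", "c"]


def design_row(level):
    """Intercept column followed by one dummy column per level (none dropped)."""
    return [1.0] + [1.0 if level == lev else 0.0 for lev in levels]


X = [design_row(obs) for obs in observations]


def predict(X, beta):
    """Linear predictor X @ beta, written out without NumPy."""
    return [sum(x * b for x, b in zip(row, beta)) for row in X]


# In every row the dummy columns add up to the intercept column,
# so the columns are linearly dependent.
dummies_sum_to_intercept = all(sum(row[1:]) == row[0] for row in X)

# Two different coefficient vectors: beta_2 is beta_1 with the intercept
# shifted up by 1 and every level effect shifted down by 1.
beta_1 = [2.0, 0.5, -0.5, 1.0]
beta_2 = [3.0, -0.5, -1.5, 0.0]

# Both give identical predictions, so the data alone cannot tell them apart.
same_predictions = predict(X, beta_1) == predict(X, beta_2)

# ASSUMPTION: a sigma of zero pins that effect at exactly zero.
# beta_2 is the one member of this family whose last level effect is zero.
last_effect_is_zero = beta_2[-1] == 0.0

print(dummies_sum_to_intercept, same_predictions, last_effect_is_zero)
```

If this sketch is right, then including every dummy without any constraint leaves the intercept and the level effects free to trade off against each other, and fixing one effect at zero removes that freedom. What I still cannot see is how that differs in practice from dropping the last category by hand.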