Recovering variance from multinomial softmax models

@ricardoV94 I tried with 3 groups and the situation did not look much different. And since I had other, more urgent things to do, I had to put this aside for a week or so.

Probably good that I did. Coming back fresh to it, I think I can now make sense of it all.

The key thing is that what we should expect to be roughly the same for all these models is the variance of the *centered* `ws` just as it goes into the softmax (since softmax does not care if a constant is added to all the values).
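
To make that concrete, here is a minimal NumPy sketch (made-up values, not my actual model) showing that adding a constant to every category's input leaves the softmax output unchanged:

```python
import numpy as np

def softmax(x, axis=-1):
    # subtract the per-row max for numerical stability; this already uses the
    # shift invariance we rely on below
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

rng = np.random.default_rng(0)
ws = rng.normal(size=(5, 3))   # hypothetical softmax inputs per respondent
shift = 10.0                   # any constant added to all categories

print(np.allclose(softmax(ws), softmax(ws + shift)))  # True
```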

This much was obvious, hence my attempts at centering between `ws` and `ws2`. But I forgot that `w` and `ws_offset` can come out strongly correlated with each other (which is what the pair plots show, and I guess what @ricardoV94 was trying to point me to), and that this also changes the final variances, because V(X+Y) = V(X) + V(Y) + 2 Cov(X,Y).
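
A quick numerical check of that identity, with simulated correlated draws standing in for `w` and `ws_offset` (the correlation value here is made up, just to show the effect):

```python
import numpy as np

rng = np.random.default_rng(1)

# two strongly (negatively) correlated variables, each with variance 1
cov = np.array([[1.0, -0.8],
                [-0.8, 1.0]])
w, ws_offset = rng.multivariate_normal([0.0, 0.0], cov, size=100_000).T

lhs = np.var(w + ws_offset)
rhs = np.var(w) + np.var(ws_offset) + 2 * np.cov(w, ws_offset)[0, 1]
print(lhs, rhs)  # both close to 1 + 1 + 2*(-0.8) = 0.4, not 2
```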

So to answer my own question:

Because the intercept can covary with the effects of the independent variables, if you care about variances you should consider them together, i.e. look at the variance of (intercept + effect), as that factors in their covariance.

I’m guessing the same logic should work with multiple independent variables, i.e. if the model for the softmax input of result category c is y_kc = i_c + v_kc + w_kc (where v_kc and w_kc are the effects of the first and second independent variables on the k-th respondent for result category c), then meaningful variances can be read off (i_c + v_kc) and (i_c + w_kc).
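
For reference, a rough sketch of how I'd compute those combined variances from posterior draws; the array names, shapes, and values are placeholders (e.g. something you'd pull out of an ArviZ `InferenceData`), not my actual model:

```python
import numpy as np

# hypothetical posterior draws with shapes (n_draws, n_respondents, n_categories)
rng = np.random.default_rng(2)
n_draws, n_resp, n_cat = 2000, 50, 3
intercept = rng.normal(size=(n_draws, 1, n_cat))      # i_c, shared across respondents
v_effect = rng.normal(size=(n_draws, n_resp, n_cat))  # v_kc
w_effect = rng.normal(size=(n_draws, n_resp, n_cat))  # w_kc

# posterior variance (over draws) of the combined terms; summing before taking
# the variance automatically accounts for the intercept-effect covariance
var_iv = (intercept + v_effect).var(axis=0)  # shape (n_resp, n_cat)
var_iw = (intercept + w_effect).var(axis=0)
print(var_iv.mean(), var_iw.mean())
```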

When I have some time, I’ll try to implement this logic for my bigger models and see if it gives me what I’m expecting.
