`jitter+adapt_diag` vs `adapt_diag`

Hi all,

I find that using `jitter+adapt_diag` can produce "Mass matrix contains zeros on the diagonal" errors. I have seen it suggested to use `adapt_diag` instead, and indeed this prevents the errors. However, the run time is far slower.

Any general suggestions?

Sounds like you have a model misspecification problem there, since the "Mass matrix contains zeros on the diagonal" error does not happen in the initial step. Maybe try twisting your prior a bit?

Interesting. What do you mean by 'twisting the prior'? Making the prior more informative? It's somewhat tricky because the model is a large hierarchical model, so it would be great if I knew which parameter/prior was causing the issues!

The error message should indicate which variable and which dimension the NaN gradient is coming from, so you can investigate further. And yes, a more informative prior is what I meant.
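To make the "which variable and which dimension" idea concrete, here is a minimal sketch in plain Python. It is not PyMC code and does not reproduce how the sampler builds its mass matrix. It only illustrates the kind of check being described: evaluate the gradient of a log-density at a point and flag the dimensions where it is exactly zero or non-finite. The names `numerical_grad`, `find_problem_dims`, and `toy_logp` are hypothetical, and the toy density is an assumption made up for the example.

```python
import math


def numerical_grad(logp, point, eps=1e-6):
    """Central finite-difference gradient of `logp` at `point` (a list of floats)."""
    grad = []
    for i in range(len(point)):
        up = list(point)
        down = list(point)
        up[i] += eps
        down[i] -= eps
        grad.append((logp(up) - logp(down)) / (2 * eps))
    return grad


def find_problem_dims(logp, point, names):
    """Return (name, reason) pairs for dimensions with a zero or non-finite gradient."""
    problems = []
    for name, g in zip(names, numerical_grad(logp, point)):
        if not math.isfinite(g):
            problems.append((name, "non-finite gradient"))
        elif g == 0.0:
            problems.append((name, "zero gradient"))
    return problems


def toy_logp(x):
    """Hypothetical log-density: `unused` never enters, so it is completely flat."""
    mu, sigma, unused = x
    return -0.5 * mu**2 - 0.5 * sigma**2


print(find_problem_dims(toy_logp, [0.5, 1.0, 0.3], ["mu", "sigma", "unused"]))
# → [('unused', 'zero gradient')]
```

In a real hierarchical model, a parameter flagged this way is a natural candidate for a more informative prior or a reparameterization.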
