I am looking for suggestions on how to deal with correlated variables in a model. I have a physical model of the form y = A*(B^m), where A = a - bx + cx^2. The unknown parameters are (a, b, c, m). I am running into convergence issues with this model because a, b, and c are highly correlated. This causes the posterior to wander into regions that fit the data well but may not be physically reasonable (some of these parameters have a physical meaning). What is the right way to deal with this kind of problem? How does one think about reparameterization in this scenario? The variables are all continuous and I am using the NUTS sampler.
Have you tried setting `init='adapt_full'`? That adapts the entire mass matrix during tuning instead of just the diagonal, and it made a significant improvement to my runs. However, it will take longer, depending on your model and the length of your data.
Something that might take a little more looking into, but looks pretty useful (and will reduce sampling run time) if only a few variables are correlated: pmx-ext has implemented a way to group together parameters that are correlated and perform full adaptation on those, while adapting only the diagonal for the variables that aren’t correlated.
This may be a silly suggestion so I would recommend doing some research on it before trying to implement, but you could use PCA to remove the correlations. In this case, PCA isn’t used as a dimensionality reduction method, but instead rotates the existing axes of your input space to align with the largest variations. I think you would first transform the input data with PCA, and then undo the transformation on the fit parameter values to get interpretable estimates out. Again, this may not be a good/real method, but I think it would remove the correlations between predictors.
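Here is a minimal sketch of one way to read the PCA idea, using only NumPy. It is an illustration under assumptions, not a tested recipe for the full model: it looks only at the polynomial factor A = a - bx + cx^2, uses an uncentered SVD of the design matrix as the rotation, and uses least squares as a stand-in for sampling. The `x` grid, `theta_true`, and the noise level are hypothetical values made up for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
x = np.linspace(5.0, 10.0, 50)            # hypothetical inputs, far from x = 0
theta_true = np.array([2.0, 0.3, 0.05])   # hypothetical (a, b, c)

# Design matrix for A(x) = a - b*x + c*x**2 (note the minus sign on b).
X = np.column_stack([np.ones_like(x), -x, x**2])
y = X @ theta_true + rng.normal(scale=0.01, size=x.size)

# Rotate the columns: with X = U S Vt, the matrix Z = X @ V equals U S,
# so its columns are mutually orthogonal.
_, _, Vt = np.linalg.svd(X, full_matrices=False)
Z = X @ Vt.T

# Fit in the rotated coordinates z, where X @ theta == Z @ z.
z_hat, *_ = np.linalg.lstsq(Z, y, rcond=None)

# Undo the rotation to get interpretable (a, b, c) back.
theta_hat = Vt.T @ z_hat

gram = Z.T @ Z   # diagonal up to round-off, i.e. decorrelated directions
print(theta_hat)
```

In a sampler you would put the parameters on `z` and compute (a, b, c) as `Vt.T @ z`. Keep in mind that any prior or physical constraint stated on a, b, c then has to be expressed through that same transformation.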
Unless you’re using a correlated prior, the posterior distributions of a, b, and c are highly correlated on account of your data. If you have samples near x = 0, this should break the correlation between a and the other variables quite nicely. Re-parameterization may fix some problems with convergence, but it doesn’t significantly alter the posterior, and so should not impact the “unphysical” nature of the sampling – I would recommend addressing that kind of constraint with a potential.
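To make the point about samples near x = 0 concrete, here is a rough NumPy sketch. It rests on simplifying assumptions that are not in the original model: only the polynomial factor A = a - bx + cx^2 is considered, with Gaussian noise and flat priors, so the posterior covariance of (a, b, c) is proportional to the inverse of XᵀX. The two `x` grids and the helper name `coef_correlation` are hypothetical.

```python
import numpy as np

def coef_correlation(x):
    """Correlation matrix of (a, b, c) for A(x) = a - b*x + c*x**2.

    Assumes Gaussian noise and flat priors, so the posterior covariance
    is proportional to inv(X.T @ X) for the design matrix X.
    """
    X = np.column_stack([np.ones_like(x), -x, x**2])
    cov = np.linalg.inv(X.T @ X)
    sd = np.sqrt(np.diag(cov))
    return cov / np.outer(sd, sd)

x_far = np.linspace(5.0, 10.0, 50)    # hypothetical data far from x = 0
x_zero = np.linspace(-2.5, 2.5, 50)   # same spread, but spanning x = 0

corr_far = coef_correlation(x_far)
corr_zero = coef_correlation(x_zero)

print(np.round(corr_far, 3))
print(np.round(corr_zero, 3))
```

On the first grid every pairwise correlation is close to ±1. On the grid that spans zero symmetrically, a decouples from b, while a and c stay partly correlated. If your data never get close to x = 0, writing the polynomial in terms of x minus its mean gives the same effect, and a, b, c can be recovered from the centered coefficients afterwards.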
Also, have you experimented at all in the log domain, $\log y = \log(a - bx + cx^2) + m \log(B)$?