Dealing with correlated variables

Hi,
I am looking for suggestions on how to deal with correlated variables in a model. I have a physical model of the form y = A*(B^m), where A = a - bx + cx^2, so my unknown parameters are (a, b, c, m). I am running into convergence issues with this model because a, b, and c are highly correlated. This causes the posterior to wander into regions that give a good fit to the data but may not be physically reasonable (some of these parameters have a physical meaning). What is the right way to deal with this kind of problem? How does one think about reparameterization in this scenario? The variables are all continuous, and I am using the NUTS sampler.


Have you tried setting init='adapt_full'? That adapts the entire mass matrix during tuning instead of just the diagonal, and it made a significant improvement to my runs. It will take longer, though, depending on your model and the length of your data.
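
For concreteness, a minimal sketch of what that looks like (assuming PyMC; the model body here is a toy placeholder, not your actual model):

```python
import numpy as np
import pymc as pm

# Toy placeholder data; substitute your own model and observations
y_obs = np.random.normal(loc=1.0, scale=0.5, size=100)

with pm.Model():
    mu = pm.Normal("mu", mu=0.0, sigma=10.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y", mu=mu, sigma=sigma, observed=y_obs)

    # init="adapt_full" estimates a dense mass matrix during tuning,
    # which lets NUTS take steps aligned with posterior correlations
    idata = pm.sample(init="adapt_full", tune=2000)
```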

Something that might take a bit more looking into, but looks pretty useful (and will reduce sampling run time) if only a few variables are correlated: pmx-ext has implemented a way to group together the parameters that are correlated and perform full adaptation on those, while only adapting the diagonal entries for the variables that aren't correlated.


This may be a silly suggestion, so I would recommend doing some research before trying to implement it, but you could use PCA to remove the correlations. In this case PCA isn't used for dimensionality reduction; instead it rotates the existing axes of your input space to align with the directions of largest variation. I think you would first transform the input data with PCA, fit in the rotated space, and then undo the transformation on the fitted parameter values to get interpretable estimates out. Again, this may not be a good/real method, but I think it would remove the correlations between predictors (see the sketch below).
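
A rough sketch of that idea (all names here are illustrative): since A = a - bx + cx^2 is linear in (b, c) given the features [-x, x^2], the rotation can be applied to those two columns while a stays as the intercept.

```python
import numpy as np
from sklearn.decomposition import PCA

rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.0, size=200)  # stand-in for the observed input

# Non-constant polynomial features; their coefficients are (b, c)
F = np.column_stack([-x, x**2])

# Rotate onto the principal axes; n_components=2 keeps all directions,
# so this is a pure rotation (plus centering), not a reduction
pca = PCA(n_components=2)
Z = pca.fit_transform(F)

# ...fit the model using Z in place of F, giving coefficients gamma...
gamma = np.array([0.3, -0.1])  # placeholder for the fitted values

# Undo the rotation: F @ beta = Z @ gamma + pca.mean_ @ beta, so the
# centering offset pca.mean_ @ beta just folds into the intercept a
beta = pca.components_.T @ gamma  # (b, c) back in the original basis
```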

Unless you're using a correlated prior, the posterior correlation between a, b, and c comes from your data. If you have samples near x = 0, where A reduces to a, that should break the correlation between a and the other variables quite nicely. Re-parameterization may fix some convergence problems, but it doesn't significantly alter the posterior itself, so it won't address the "unphysical" nature of the sampling on its own. I would recommend handling that kind of constraint with a potential (see the sketch below).
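
For example, a minimal sketch of the whole model with a soft positivity constraint on A via pm.Potential (assuming PyMC; the data, priors, and barrier scale are all placeholders):

```python
import numpy as np
import pymc as pm

# Placeholder data standing in for the real observations
rng = np.random.default_rng(0)
x = rng.uniform(0.5, 2.0, size=100)
B = rng.uniform(1.0, 3.0, size=100)
y_obs = (1.0 - 0.5 * x + 0.2 * x**2) * B**1.5 + rng.normal(0, 0.1, 100)

with pm.Model() as model:
    a = pm.Normal("a", mu=0.0, sigma=10.0)
    b = pm.Normal("b", mu=0.0, sigma=10.0)
    c = pm.Normal("c", mu=0.0, sigma=10.0)
    m = pm.Normal("m", mu=0.0, sigma=5.0)
    sigma = pm.HalfNormal("sigma", sigma=1.0)

    A = a - b * x + c * x**2

    # Smooth log-sigmoid barrier: ~0 when A > 0, strongly negative as
    # A drops below zero; the 0.01 scale controls how sharp the wall is
    pm.Potential("A_positive", pm.math.log(pm.math.sigmoid(A / 0.01)).sum())

    pm.Normal("y", mu=A * B**m, sigma=sigma, observed=y_obs)
    idata = pm.sample(init="adapt_full")
```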

Also, have you experimented at all with working in the log domain, \log y = \log(a - bx + cx^2) + m \log(B)?
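
Continuing the sketch above, that would only change the likelihood (this assumes y and A are strictly positive, and reuses the illustrative names from the previous block):

```python
# Log-domain likelihood: model log(y) instead of y. Note that the
# noise becomes multiplicative on the original scale.
log_mu = pm.math.log(A) + m * pm.math.log(B)
pm.Normal("log_y", mu=log_mu, sigma=sigma, observed=np.log(y_obs))
```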