Rookie Question On Combining PCA and Bayesian Inference

You can have as many factors as you have columns in your data. Here you’re only taking one component from the PCA, so that’s one; if you add a constant, that’s two.

It’s important to realize that these priors are on the beta coefficients, not the data. You are assigning your prior belief about the effect size of a given time series on GDP. Assigning a Laplace prior to the effect sizes biases your model away from large effects, shrinking everything towards zero. The idea is that if you have many time series to include, you don’t need to do PCA at all. Instead, you can bias all your estimates towards zero, and only the truly important ones will make it through the filter, so to speak.
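
To make that concrete, here’s a minimal sketch of what that looks like in PyMC. The data, the variable names `X` and `y`, and the prior scale `b=0.1` are all placeholders I made up for illustration, not a recommendation:

```python
import numpy as np
import pymc as pm

# Placeholder data: 50 candidate time series, only two with real effects.
rng = np.random.default_rng(42)
X = rng.normal(size=(200, 50))
y = 0.8 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(scale=0.5, size=200)

with pm.Model():
    intercept = pm.Normal("intercept", mu=0.0, sigma=1.0)
    # Laplace prior on every effect size: mass is piled up at zero, so only
    # series with strong evidence in the data escape the shrinkage.
    beta = pm.Laplace("beta", mu=0.0, b=0.1, shape=X.shape[1])
    sigma = pm.HalfNormal("sigma", sigma=1.0)
    pm.Normal("y_obs", mu=intercept + pm.math.dot(X, beta), sigma=sigma, observed=y)
    idata = pm.sample()
```

The posterior for `beta` should concentrate near zero for the irrelevant series, which does the selection job that PCA usually gets drafted in to do.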

You can learn more about regularizing priors in this lecture from Statistical Rethinking. If you’re new to Bayesian stuff, I recommend the entire series to help you get familiar with how it all works. These lectures use R and Stan, but it’s all been ported to Python and PyMC here [EDIT: This link is outdated, see below]. You could also check out this StatQuest video about Ridge and Lasso regression if you’re not familiar with the idea of shrinkage for model selection (the framework there isn’t Bayesian, but the ideas are the same).

I’m bringing all this up because it’s important to realize that PCA is just a linear transformation of the design matrix. In principle it doesn’t do anything more than any other linear model like OLS. People use it in frequentist frameworks to get around perfect collinearity, but Bayes offers you alternatives that I would argue are more principled (also, there’s no need to invert matrices, so perfect collinearity isn’t a problem). For example, why did you choose 1 component? Why not 2? For 100 series, why 20 and not 21? How are you doing these weighting adjustments? Why monthly adjustments? Etc etc. Designing a model to do all this is much more defensible.
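
If it helps to see the “just a linear transformation” point, here’s a quick numpy sketch (with made-up data) showing that the first principal component is literally the design matrix multiplied by a fixed weight vector:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 10))  # made-up design matrix

Xc = X - X.mean(axis=0)         # center the columns
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
W = Vt.T[:, :1]                 # loadings of the first component
factor = Xc @ W                 # the "factor" is a linear combination of the columns
```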
