Factor Analysis

If you want to put a prior on the number of factors, a (truncated) Dirichlet process prior is perhaps the easiest to express in PyMC3. You can see the docs for an example that you should be able to adapt to your use case.


@AustinRochford, thanks for the suggestion.

I went through the example, and I’m trying to make sense of it.

As I understand it, a Dirichlet process is a weighted sum of several different models.

If I have 10 stocks, I could have 1 factor, 2 factors…up to 10 factors. So in order to implement a Dirichlet process, I would need to simulate 55 factor returns (1 + 2 + … + 10). This seems expensive.

It seems computationally much cheaper to just manually set the number of factors to 10, and hope the sd of the unnecessary factor returns shrinks toward zero.

Conceptually, a DP mixture can be thought of as a weighted sum of different models, but you don’t have to simulate 55 factor returns. It’s more properly thought of as a prior distribution on the number of mixture components that prefers fewer active components. In your hypothetical example, you should still only have to simulate at most 10 components.
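
To make the "at most 10 components" point concrete, here is a minimal sketch of the truncated stick-breaking construction that usually underlies a truncated DP prior. It is plain Python rather than PyMC3, and it is not the code from the docs example. The function name `stick_breaking_weights`, the concentration `alpha=1.0`, and the choice to let the last component absorb the leftover stick are all assumptions made for illustration.

```python
import random


def stick_breaking_weights(alpha, K, rng):
    """Draw K component weights from a truncated stick-breaking construction.

    Hypothetical helper for illustration; alpha is the DP concentration.
    """
    weights = []
    remaining = 1.0  # length of the stick not yet assigned to a component
    for k in range(K):
        # Assumption: the last component takes whatever is left,
        # so the K weights sum to 1 under truncation.
        frac = 1.0 if k == K - 1 else rng.betavariate(1.0, alpha)
        weights.append(frac * remaining)
        remaining *= 1.0 - frac
    return weights


rng = random.Random(0)
K = 10  # truncation level: one weight per candidate component, not 55
weights = stick_breaking_weights(alpha=1.0, K=K, rng=rng)

for k, w in enumerate(weights, start=1):
    print(f"component {k}: weight {w:.4f}")
```

Only K weights (and K sets of component parameters) are ever drawn. Smaller values of `alpha` tend to put most of the weight on the first few components, which is the sense in which the prior prefers fewer active components.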