This is a more general question about a modeling idea that I had, not strictly related to PyMC3:
I’d like to do outlier (or novelty) detection on a dataset using the posterior distribution of a Bayesian model that I fit to the data.
My idea is essentially this: a conventional outlier detection method needs one vector per sample in my dataset as input. Now, suppose my dataset consists of time series with high intrinsic variability, so I can’t use the raw data points directly as the vectors for the outlier detection.
My idea is therefore to first fit a Bayesian model to the data (e.g. some time series model) and then leverage the resulting posterior distributions to construct a feature vector for each data sample.
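To make this concrete, here is a minimal sketch of what I have in mind, not a definitive implementation. It assumes the posterior draws are already available as a NumPy array of shape `(n_series, n_draws, n_params)` (for example, extracted from a fitted trace). The function names `posterior_features` and `robust_outlier_score` are hypothetical, and the median/MAD score is only a simple stand-in for whatever outlier detection method would actually be used.

```python
import numpy as np


def posterior_features(draws):
    """Summarize posterior draws as one feature vector per series.

    draws: array of shape (n_series, n_draws, n_params) (assumed layout).
    Returns an array of shape (n_series, 2 * n_params) holding the
    posterior means followed by the posterior standard deviations.
    """
    draws = np.asarray(draws, dtype=float)
    return np.concatenate([draws.mean(axis=1), draws.std(axis=1)], axis=1)


def robust_outlier_score(features):
    """Largest robust z-score (median/MAD based) across features, per series."""
    features = np.asarray(features, dtype=float)
    med = np.median(features, axis=0)
    mad = np.median(np.abs(features - med), axis=0)
    scale = 1.4826 * mad + 1e-12  # small constant guards against a zero MAD
    return np.max(np.abs(features - med) / scale, axis=1)


# Synthetic stand-in for real posterior draws: 20 series, 500 draws, 2 parameters.
rng = np.random.default_rng(42)
centers = rng.normal(0.0, 0.1, size=(20, 2))
centers[3] = 2.5  # one deliberately unusual series
demo_draws = centers[:, None, :] + rng.normal(0.0, 0.2, size=(20, 500, 2))

scores = robust_outlier_score(posterior_features(demo_draws))
print("most unusual series:", int(np.argmax(scores)))
```

The feature matrix could equally be passed to any off-the-shelf outlier detector; the point is only that the posterior summaries, rather than the raw time series, serve as the per-sample vectors.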
For instance, I could quantify the “otherness” of a data sample by computing the KL divergence from its posterior distribution to those of the other samples, or to a group-level posterior distribution (in a hierarchical model).
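As a rough sketch of the KL idea, under the strong assumption that each posterior is well approximated by a multivariate Gaussian fitted to its draws (moment matching), the divergence has a closed form. The helper names `gaussian_kl` and `kl_from_draws` are hypothetical, and the draws are assumed to be arrays of shape `(n_draws, n_params)`. Since KL is asymmetric, the direction used here, KL(sample || group), is one choice among several.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """Closed-form KL( N(mu0, cov0) || N(mu1, cov1) ) for multivariate Gaussians."""
    mu0 = np.atleast_1d(np.asarray(mu0, dtype=float))
    mu1 = np.atleast_1d(np.asarray(mu1, dtype=float))
    cov0 = np.atleast_2d(np.asarray(cov0, dtype=float))
    cov1 = np.atleast_2d(np.asarray(cov1, dtype=float))
    k = mu0.shape[0]
    diff = mu1 - mu0
    trace_term = np.trace(np.linalg.solve(cov1, cov0))  # tr(cov1^-1 cov0)
    maha_term = diff @ np.linalg.solve(cov1, diff)      # Mahalanobis part
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (trace_term + maha_term - k + logdet1 - logdet0)


def kl_from_draws(sample_draws, group_draws):
    """KL(sample || group) after fitting a Gaussian to each set of draws.

    Both inputs: arrays of shape (n_draws, n_params) (assumed layout).
    """
    sample_draws = np.asarray(sample_draws, dtype=float)
    group_draws = np.asarray(group_draws, dtype=float)
    return gaussian_kl(
        sample_draws.mean(axis=0), np.cov(sample_draws, rowvar=False),
        group_draws.mean(axis=0), np.cov(group_draws, rowvar=False),
    )


# Synthetic stand-in for real posterior draws.
rng = np.random.default_rng(1)
group = rng.normal(0.0, 1.0, size=(2000, 2))
typical = rng.normal(0.1, 0.5, size=(2000, 2))
unusual = rng.normal(3.0, 0.5, size=(2000, 2))
print("typical:", kl_from_draws(typical, group))
print("unusual:", kl_from_draws(unusual, group))
```

If the posteriors are clearly non-Gaussian (multimodal or heavily skewed), this approximation would be misleading and a sample-based KL estimator would be needed instead.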
Are there any references for such an approach?