I’m trying to replicate a SAS example in Python in which a distribution is fitted from summary statistics. The summary statistics available to me are the total count, min, max, std, p50, p75, p85, p95, p98, p99, and p99.9. The measurements come from a distributed network of machines and describe either latency or size distributions. The goal is to reconstruct the mixture for each machine, then combine those per-machine distributions to estimate the distribution for the entire network, and to do this at a regular interval in a streaming fashion.
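To make the goal concrete, here is a rough, non-Bayesian sketch of the kind of result I’m after. It is not the SAS method and it does not use PyMC: it assumes each machine’s CDF can be approximated by straight lines between the reported quantiles (treating min and max as the 0th and 100th percentiles), and it combines machines by weighting each CDF by its count. The function names and the numbers in `summaries` are made up for illustration, and std is not used at all.

```python
import numpy as np

# Quantile levels matching the available summary statistics
# (assumption: min and max act as the 0th and 100th percentiles).
LEVELS = np.array([0.0, 0.50, 0.75, 0.85, 0.95, 0.98, 0.99, 0.999, 1.0])


def machine_cdf(values, grid):
    """Piecewise-linear CDF through one machine's (value, level) points."""
    return np.interp(grid, values, LEVELS, left=0.0, right=1.0)


def network_quantiles(summaries, probs, grid_size=2000):
    """Count-weighted mixture of per-machine CDFs, inverted at `probs`.

    `summaries` is a list of (count, values) pairs, where `values` holds
    min, p50, p75, p85, p95, p98, p99, p99.9, max in that order.
    """
    counts = np.array([count for count, _ in summaries], dtype=float)
    weights = counts / counts.sum()

    # Evaluate every machine's CDF on one shared grid of values.
    lo = min(values[0] for _, values in summaries)
    hi = max(values[-1] for _, values in summaries)
    grid = np.linspace(lo, hi, grid_size)

    mixture = np.zeros(grid_size)
    for weight, (_, values) in zip(weights, summaries):
        mixture += weight * machine_cdf(np.asarray(values, dtype=float), grid)

    # Invert the non-decreasing mixture CDF by interpolating value on level.
    return np.interp(probs, mixture, grid)


# Hypothetical summaries for two machines (made-up latencies in ms).
summaries = [
    (1000, [1, 10, 15, 20, 40, 60, 80, 150, 300]),
    (3000, [2, 12, 18, 25, 45, 70, 95, 200, 500]),
]
print(network_quantiles(summaries, [0.5, 0.95, 0.99]))
```

This gives network-level quantile estimates but no uncertainty, and the linear interpolation is probably too crude in the tail between p99.9 and max, which is why I’d like a proper parametric or mixture fit instead.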
I’ve been looking through the documentation and get the general gist of mixture models, but I don’t understand how to set up the initial parameters for each component distribution, which distribution to use given the data available to me, or how to shift each distribution to the corresponding quantile to construct the overall distribution.
Is this possible with PyMC?