Would anyone point me in the direction of deploying a pymc3 model to feed new data into?
This is a surprisingly complicated operation, and I still do not know of any comprehensive resource for “production” PyMC models.
I tried writing up why it is surprising and complicated, but got bogged down in notation (let me know if it would be helpful to try again!). In any case, you usually have to retrain on the old data and the new data together.
In my experience, it is very hard to “productionize” a Bayesian model, as you will want to manually check diagnostics to make sure the model fit the data correctly, and that the model is still appropriate for the data (see, for example, https://arxiv.org/abs/1709.01449).
A good idea might be to periodically retrain a model on all available data, and develop a suite of automated checks particular to your model to catch problems. It is good practice to write functions that accept configuration and data and return a pm.Model instance to allow things like this, something like:
import pymc3 as pm

def generate_model(data, **config):
    with pm.Model() as model:
        ...
    return model

# ...sometime later...

with generate_model(X, normalize=True, include_features=8):
    trace = pm.sample()
Note that all of this is true of MCMC in general and is not particular to PyMC3, though the situation can sometimes be improved by exploiting details of your model.
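To make the "automated checks" idea a bit more concrete, here is a minimal sketch of a convergence gate around a retraining step. It is only an illustration under some assumptions: the names gelman_rubin, failed_parameters and retrain are made up for this post, fit is a hypothetical callable that takes the combined data and returns posterior draws as {parameter name: list of chains}, and the R-hat here is the classic (non-split) Gelman-Rubin statistic with a threshold of 1.1 chosen purely as an example. In a real project you would probably lean on the diagnostics your library already reports rather than hand-rolling them, and add checks specific to your model.

```python
import math
import statistics


def gelman_rubin(chains):
    """Classic (non-split) R-hat for one parameter.

    `chains` is a list of equal-length lists of draws, one list per chain.
    """
    n = len(chains[0])
    if len(chains) < 2 or n < 2 or any(len(c) != n for c in chains):
        raise ValueError("need at least two chains of equal length >= 2")
    # Average within-chain variance and scaled variance of the chain means.
    within = statistics.mean([statistics.variance(c) for c in chains])
    between = n * statistics.variance([statistics.mean(c) for c in chains])
    if within == 0:
        # Constant chains: fine if they all agree, hopeless if they do not.
        return 1.0 if between == 0 else math.inf
    pooled = (n - 1) / n * within + between / n
    return math.sqrt(pooled / within)


def failed_parameters(draws, threshold=1.1):
    """Return {name: r_hat} for every parameter whose R-hat exceeds threshold."""
    r_hats = {name: gelman_rubin(chains) for name, chains in draws.items()}
    return {name: r for name, r in r_hats.items() if r > threshold}


def retrain(old_data, new_data, fit, threshold=1.1):
    """Refit on old + new data and refuse to return a fit that fails the check.

    `fit` is a hypothetical callable: data -> {parameter name: list of chains}.
    """
    draws = fit(list(old_data) + list(new_data))
    failures = failed_parameters(draws, threshold)
    if failures:
        raise RuntimeError(f"convergence check failed: {failures}")
    return draws
```

The point of raising instead of returning a flag is that a scheduled job then fails loudly and the previously validated fit stays in place until someone looks at it. A check like this only catches one kind of problem (chains that disagree), so it complements manual inspection of the model rather than replacing it.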