Feeding Posterior back in as Prior (updating a model)

cshenton · August 26, 2017, 3:53am

It is possible to use the MeanField and FullRank approximations as priors for a new fit?

I’m thinking of a use case where a model is trained on some initial data, the posterior approximation is stored, and then the model is updated on new data at a later date. So structurally like mini batching the data, but where the batches are observed over time (and potentially even a fully online use case, but not necessarily).

Here is a working example learning a normal mean with 20 batches of size 15, surprisingly it works (even with really small number of passes per minibatch)! Each batch gets closer to the correct value, and the parameter uncertainty sensibly gets lower and lower, which is super.

import numpy as np
import pymc3 as pm


def simulate_data():
    return np.random.normal(6.5, 1.5, 15)

def mean_model(data, prior=None):
    """Returns a model of the data mean given a prior on the mean."""
    with pm.Model() as model:
        mu = pm.Normal('mu', **prior)
        y = pm.Normal('y', mu=mu, sd=1.5, observed=data) # assume known sd
    return model

def train(model):
    """Fits the model, returns a dict of posterior distributions."""
    with model:
        result = pm.fit(2000, method='advi')
    return {
        'mu': result.mean.eval()[0],
        'sd': result.std.eval()[0],
    }

if __name__ == '__main__':
    prior = {'mu': 0.0, 'sd': 10.0} # initial prior
    for i in range(20):
        y = simulate_data()
        model = mean_model(y, prior)
        posterior = train(model)
        prior = posterior
        print(i, ':', prior)

My main questions are:

Is there a better way to do this in pymc?
- If not, what would be everyone’s dream API to make this really easy for complex models?
Is this statistically correct? (it feels so, since it’s really just applying bayes rule)

cshenton · August 26, 2017, 4:23am

Probably worth noting that this would only work if new priors could be specified using moments from the posterior distributions, which means they should probably be in the same family of distributions.

Would this restrict us to normal priors? Since PyMC3 only implements gaussian variational models?

ferrine · August 26, 2017, 6:46am

I don’t know an easy workaround yet, especially for FullRank approximation as there are interactions between variables. But I see a way to do it changing low level implementation. Good feature idea

ferrine · August 27, 2017, 6:49am

Topic		Replies	Views
How to use the posterior distribution of one model as a prior distribution for another model	19	1462	April 9, 2024
Can traces be used as priors? Questions	5	2367	September 12, 2019
Using MCMC based model in closed loop Questions	14	2485	July 29, 2019
Updating multivariate priors Questions	11	3383	March 12, 2019
Minibatch and NUTS Questions	11	3251	June 22, 2018

Feeding Posterior back in as Prior (updating a model)

Related topics