Can traces be used as priors?


Is it possible to do online learning with PyMC3? One scenario I’m thinking about: when we get one new observation, we’d like to compute the posterior, and then use that posterior as the prior the next time an observation arrives. This is in contrast to collecting all the data at once, fitting a model, and getting a single posterior distribution.



So there’s a notebook on updating priors:

and a couple of discussions on the forum about how to do this in a multivariate way, either approximately:

or using copulas:
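For context, the `from_posterior` helper in that notebook turns the previous trace into a grid suitable for an `Interpolated` prior, roughly like this (a sketch from memory; check the linked notebook for the exact code):

```python
import numpy as np
from scipy import stats


def from_posterior(param, samples):
    """Kernel-density grid over posterior samples, for pm.Interpolated."""
    smin, smax = np.min(samples), np.max(samples)
    width = smax - smin
    x = np.linspace(smin, smax, 100)
    y = stats.gaussian_kde(samples)(x)
    # Pad the support by 3 * width on each side so the next round
    # of sampling can explore beyond the old posterior's range
    x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])
    y = np.concatenate([[0], y, [0]])
    return x, y  # in a model: pm.Interpolated(param, x, y)
```

Inside the next model you would then write `pm.Interpolated(param, x, y)` instead of the original prior.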


Thank you! Exactly what I was looking for. :slight_smile:

@chartl (or anyone else) do you know if the method in the updating-priors notebook can be used to approximate a beta distribution? I see the line:

x = np.concatenate([[x[0] - 3 * width], x, [x[-1] + 3 * width]])

Does it make sense to modify it to something like:

    min_explore_x = x[0] - 3 * width
    max_explore_x = x[-1] + 3 * width
    x = np.concatenate([
        [max(min_explore_x, 0.0001)],
        x,
        [min(max_explore_x, 0.9999)],
    ])
so that support is always between 0 and 1?

Yes, although since this is using an interpolated distribution anyway, my recommendation is to keep everything on an unconstrained space (say, logit(x)) and set the extrapolation limits quite wide. This is consistent with the Automatic Transformation of Constrained Variables part of ADVI.
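For instance, a sketch of building the interpolation grid on the logit scale (function name and defaults here are my own, not from the notebook):

```python
import numpy as np
from scipy import stats
from scipy.special import logit, expit


def from_posterior_logit(samples, n_grid=100, pad=3.0):
    """Density grid on the unconstrained logit scale for a (0, 1) parameter."""
    z = logit(samples)  # map (0, 1) -> (-inf, inf)
    width = z.max() - z.min()
    # Wide extrapolation limits are harmless on the unconstrained scale
    zx = np.linspace(z.min() - pad * width, z.max() + pad * width, n_grid)
    zy = stats.gaussian_kde(z)(zx)
    return zx, zy  # interpolate over z; expit(z) recovers the (0, 1) scale
```

Since `expit` maps the whole real line back into (0, 1), the support constraint is satisfied automatically, with no clamping constants needed.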

Also, since you’ve brought up a parametric distribution (Beta), why not put the interpolated distribution over hyperparameters \alpha and \beta? You could probably get away with treating \log \alpha, \log \beta as multivariate normal, too.
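A sketch of that second suggestion: summarise the posterior samples of the hyperparameters as a multivariate normal over (log α, log β) and use that as the next round's prior (the sample values and variable names below are made up for illustration):

```python
import numpy as np

# Hypothetical posterior samples of Beta hyperparameters alpha and beta
rng = np.random.default_rng(0)
alpha_s = rng.gamma(5.0, 1.0, size=2000)
beta_s = rng.gamma(8.0, 1.0, size=2000)

# Summarise the joint posterior of (log alpha, log beta) as an MVN
logs = np.log(np.column_stack([alpha_s, beta_s]))
mu = logs.mean(axis=0)
cov = np.cov(logs, rowvar=False)
# Next round: pm.MvNormal("log_ab", mu=mu, cov=cov, shape=2),
# then exponentiate to recover alpha and beta; the covariance
# preserves any correlation between the two hyperparameters.
```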

That’s a good point! Thanks. I modified my code to keep the original from_posterior method in place, but I started running into an issue. In the project I’m working on, a number of the variables are unobserved, so their posteriors barely move between updates. Since from_posterior pads the grid by 3 times the width of the old posterior on each side, the new priors for these unobserved variables end up much wider than the original prior after each round. So instead I’m padding by a small constant. This prevents the prior from blowing up, while still allowing some exploration of values it hasn’t sampled in the past. Do you think this makes sense?
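Concretely, my modification looks roughly like this (a sketch; the `eps` value and function name are arbitrary):

```python
import numpy as np
from scipy import stats


def from_posterior_const_pad(samples, eps=0.05, n_grid=100):
    """Pad the interpolation grid by a small constant eps instead of
    3 * posterior width, so priors of weakly-updated variables
    don't balloon from one round to the next."""
    smin, smax = np.min(samples), np.max(samples)
    x = np.linspace(smin, smax, n_grid)
    y = stats.gaussian_kde(samples)(x)
    x = np.concatenate([[smin - eps], x, [smax + eps]])
    y = np.concatenate([[0.0], y, [0.0]])
    return x, y
```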

Another question: would the usual model comparison still work? Say I have two models that both made use of from_posterior. Would they give sensible WAIC values (as if I had run the two models on the whole data set at once, instead of updating my priors one observation at a time)?