PyMC3: How to compute MAP Estimates Repeatedly and Efficiently?

What is the most efficient way to calculate MAP estimates repeatedly in PyMC3 with updated input?

Details: I would like to repeatedly calculate MAP (maximum a posteriori) estimates. I need Bayesian sparse linear regression because I have too few data points for a regular regression to be stable.

However, repeated MAP calculations take too long in PyMC3. One thing I can usually do in such situations is precompile the function, like I can do in Theano, but PyMC3 doesn’t allow precompilation of the function without freezing the input as well, from what I can tell. In my use-case, I’m applying the regression on a gigantic rolling Pandas dataframe with updated input each time, so freezing the input is bad.

Thank you for any ideas/help.

Are you trying to use the MAP estimate as a prior (?) for the next modeling step, where you have a new batch of data?

You can use shared variables from theano:

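import theano
import pymc3 as pm

y = theano.shared(datasets[0])

with pm.Model() as model:
    ...
    pm.Normal('y', ..., observed=y)

for y_ in datasets:
    y.set_value(y_)  # swap in the new data; the model itself is not rebuilt
    with model:
        map_estimate = pm.find_MAP()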

Sadly, you can’t change the shape of unobserved variables this way at the moment. But if you only switch out observed variables you should be fine.
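
To make the idea behind this pattern concrete, here is a minimal plain-Python sketch. It uses neither Theano nor PyMC3: the `Shared` class and `build_map_estimator` are hypothetical names made up for this illustration, and the "model" is just a Normal mean with a Normal prior and a known noise scale, chosen because its MAP has a closed form. The point is only that the model is built once and reads whatever data the shared container currently holds.

```python
class Shared:
    """Hypothetical stand-in for a shared variable: a mutable data container."""

    def __init__(self, value):
        self._value = list(value)

    def get_value(self):
        return self._value

    def set_value(self, value):
        self._value = list(value)


def build_map_estimator(y, prior_mu=0.0, prior_sd=10.0, noise_sd=1.0):
    """Build the estimator once; it looks up the data in `y` at call time."""

    def find_map():
        data = y.get_value()
        prior_prec = 1.0 / prior_sd ** 2      # precision of the Normal prior
        like_prec = len(data) / noise_sd ** 2  # precision contributed by the data
        # Posterior mode of a Normal mean under a Normal prior (assumed toy model).
        weighted_sum = prior_prec * prior_mu + sum(data) / noise_sd ** 2
        return weighted_sum / (prior_prec + like_prec)

    return find_map


series = [1.0, 2.0, 3.0, 4.0, 5.0]
window = 3
windows = [series[i:i + window] for i in range(len(series) - window + 1)]

y = Shared(windows[0])
find_map = build_map_estimator(y)  # built once, outside the loop

estimates = []
for y_ in windows:
    y.set_value(y_)  # only the data changes between iterations
    estimates.append(find_map())
```

In the real PyMC3 version the container is the `theano.shared` variable and the model is the Theano graph, but the loop has the same shape: update the data, then ask for the MAP again.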

On a side note: Are you sure you want to use MAP estimates? NUTS is usually quite performant even with large models, and there is no good reason to expect the MAP to be reasonable in general, especially in high dimensions.
