Hello guys!
I have a model that is basically linear. The thing you need to know is that x is my input data and y is my likelihood. The model can have 2 or more shapes.
I am going to use my model together with scipy.optimize to find optimal parameters that satisfy my constraints. For that purpose I wrote a function that replaces the data in the model and performs out-of-sample predictions, and I use that function as part of my constraint function (there is a rough sketch of it at the end of this post). The problem is that it is SUPER slow: the optimization proceeds step by step, and at every step it replaces the data in the model and runs posterior predictive sampling again.
I want to speed this up. I do not need all of the posterior predictive samples, only their mean, so I should not have to sample 10,000 times; one evaluation using the posterior means (like the mean slope and mean intercept) should be enough.
In real life I have more than just a slope and an intercept, so I can't simply do y = x * slope + intercept.
So, how can I do that? Can I somehow extract the "formula" (the PyTensor graph) from the model and plug in the posterior means and new x values?
The idea is to do the same kind of optimization as pymc_marketing.mmm.budget_optimizer.BudgetOptimizer, but I need more control over it.
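For reference, here is roughly what my current (slow) setup looks like. This is a simplified sketch with made-up data and only a slope and intercept; my real model has more parameters:

```python
import numpy as np
import pymc as pm
from scipy.optimize import minimize

# toy data standing in for my real dataset
x_obs = np.linspace(0, 1, 50)
y_obs = 2.0 * x_obs + 1.0 + np.random.normal(0, 0.1, size=50)

with pm.Model() as model:
    x_data = pm.Data("x_data", x_obs)
    slope = pm.Normal("slope", 0, 1)
    intercept = pm.Normal("intercept", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    mu = slope * x_data + intercept
    # tie the shape of y to x_data so out-of-sample prediction works
    pm.Normal("y", mu=mu, sigma=sigma, observed=y_obs, shape=x_data.shape)
    idata = pm.sample()

def predicted_mean(x_new):
    # replace the data and re-run posterior predictive sampling --
    # this is the expensive step that happens on every optimizer iteration
    with model:
        pm.set_data({"x_data": np.atleast_1d(x_new)})
        ppc = pm.sample_posterior_predictive(
            idata, var_names=["y"], progressbar=False
        )
    return ppc.posterior_predictive["y"].mean().item()

# scipy.optimize calls this function many times (here as an objective,
# in my real code inside the constraints), so it becomes very slow
res = minimize(lambda x: -predicted_mean(x), x0=np.array([0.5]))
```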
I wrote a very minimal workflow some time ago that you may find useful: PyMC_optimization.ipynb · GitHub
If you take the mean of the posterior and then optimize, you will get the wrong answer due to Jensen's inequality: you want \mathbb{E}[f(x)], but you are computing f(\mathbb{E}[x]). You can consider putting the optimization into the model itself, as in here. You would need to add gradients for the optimizer using the implicit function theorem for any non-trivial problem.
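A quick numerical illustration of that point, with a toy quadratic f and fake draws rather than your actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
# pretend these are posterior draws of a parameter
draws = rng.normal(loc=1.0, scale=0.5, size=10_000)

def f(theta):
    # any non-linear function, e.g. a quadratic loss
    return theta ** 2

print(f(draws).mean())   # E[f(theta)] ~= 1.25  (= 1.0**2 + 0.5**2)
print(f(draws.mean()))   # f(E[theta]) ~= 1.00  -- not the same thing
```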
If the model is essentially linear though, I would be looking for an approximate closed-form solution rather than running an optimizer.
I usually just use a thinned version of the posterior and then take the mean of all the losses
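Roughly like this (a toy sketch with a fake posterior and a placeholder loss, just to show the pattern; in practice the draws come from your trace):

```python
import numpy as np
import arviz as az
from scipy.optimize import minimize

# fake posterior standing in for your idata from pm.sample()
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={
    "slope": rng.normal(2.0, 0.1, size=(4, 1000)),
    "intercept": rng.normal(1.0, 0.1, size=(4, 1000)),
})

# thin the posterior, e.g. keep every 10th draw
post = idata.posterior.thin({"draw": 10})
slopes = post["slope"].values.ravel()
intercepts = post["intercept"].values.ravel()

def loss(pred, target=3.0):
    # placeholder loss -- distance of the prediction from some target
    return (pred - target) ** 2

def expected_loss(x_new):
    # evaluate the loss for every thinned draw, then average
    preds = slopes * x_new + intercepts
    return np.mean(loss(preds))

res = minimize(expected_loss, x0=np.array([0.5]))
print(res.x)
```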
One can also use the mean of the observed variable: pymc/pymc/distributions/moments/means.py at main · pymc-devs/pymc · GitHub
We could probably add the variance if interested in considering that in the optimization
One could use the mean if the optimizer was already on the graph informing that mean, but not otherwise
It’s still the same point as above: for a non-linear function f, \mathbb{E}[f(x)] \neq f(\mathbb{E}[x]).
One could use the mean if the optimizer was already on the graph informing that mean, but not otherwise
I have no idea what an optimizer inside a graph informing a mean means.
What I was saying is: if you have, say, a LogNormal observation model and want to optimize the expected value, you can optimize the “analytical mean” (first moment) of that LogNormal, computed from the posterior draws of its parameters and whatever parameters you want to tune.
You don’t need to draw a bunch of posterior predictive samples just to take their mean.
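For example, something like this (a toy sketch with made-up posterior draws; the analytical mean of a LogNormal(mu, sigma) is exp(mu + sigma^2 / 2)):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
# stand-ins for posterior draws of the LogNormal parameters;
# in practice take them from idata.posterior
slope = rng.normal(0.5, 0.05, size=2000)
intercept = rng.normal(-1.0, 0.05, size=2000)
sigma = np.abs(rng.normal(0.3, 0.02, size=2000))

def expected_y(x):
    # analytical mean of the LogNormal, averaged over the posterior draws --
    # no posterior predictive sampling needed
    mu = slope * x + intercept
    return np.mean(np.exp(mu + sigma ** 2 / 2))

# e.g. find the x in [0, 10] that maximizes the expected outcome
res = minimize_scalar(lambda x: -expected_y(x), bounds=(0, 10), method="bounded")
print(res.x, expected_y(res.x))
```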
Thank you for the links, everyone. It’s a bit hard for me to understand everything that’s happening there, but I will spend more time on it this week. It seems like @ricardoV94 explained my idea clearly in his last message (this is exactly what I thought I needed).
@jessegrabowski, could you please share an example of a non-linear function, just so I understand the difference compared to the linear case?