Hello guys!
I have a model that is basically linear. The thing you need to know is that x is my input data and y is my likelihood. The model can have 2 or more shapes.
I am going to use my model together with scipy.optimize to find optimal parameters that satisfy my constraints. For that purpose I wrote a function that replaces the data in the model and performs out-of-sample predictions, and I use that function as part of my constraint function (there is a rough sketch of it at the end of this post). The problem is that it is SUPER slow: the optimization proceeds step by step, and at every step it replaces the data in the model and runs posterior predictive sampling again.
I want to speed this up. I do not need all of the posterior predictive samples, only their mean, so I should not have to sample 10,000 times; one evaluation using the posterior means (like the mean slope and mean intercept) should be enough.
In real life I have more than just a slope and an intercept, so I can't simply do y = x * slope + intercept.
So, how can I do that? Can I somehow extract the "formula" (the PyTensor graph) from the model and plug in the posterior means and new x values?
The idea is to do the same kind of optimization as pymc_marketing.mmm.budget_optimizer.BudgetOptimizer, but I need more control over it.
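For reference, here is roughly what my current (slow) setup looks like. This is a simplified sketch with made-up data and only a slope and intercept; my real model has more parameters:

```python
import numpy as np
import pymc as pm
from scipy.optimize import minimize

# toy data standing in for my real dataset
x_obs = np.linspace(0, 1, 50)
y_obs = 2.0 * x_obs + 1.0 + np.random.normal(0, 0.1, size=50)

with pm.Model() as model:
    x_data = pm.Data("x_data", x_obs)
    slope = pm.Normal("slope", 0, 1)
    intercept = pm.Normal("intercept", 0, 1)
    sigma = pm.HalfNormal("sigma", 1)
    mu = slope * x_data + intercept
    # tie the shape of y to x_data so out-of-sample prediction works
    pm.Normal("y", mu=mu, sigma=sigma, observed=y_obs, shape=x_data.shape)
    idata = pm.sample()

def predicted_mean(x_new):
    # replace the data and re-run posterior predictive sampling --
    # this is the expensive step that happens on every optimizer iteration
    with model:
        pm.set_data({"x_data": np.atleast_1d(x_new)})
        ppc = pm.sample_posterior_predictive(
            idata, var_names=["y"], progressbar=False
        )
    return ppc.posterior_predictive["y"].mean().item()

# scipy.optimize calls this function many times (here as an objective,
# in my real code inside the constraints), so it becomes very slow
res = minimize(lambda x: -predicted_mean(x), x0=np.array([0.5]))
```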
I wrote a very minimal workflow some time ago that you may find useful: PyMC_optimization.ipynb · GitHub
If you take the mean of the posterior and then optimize, you will get the wrong answer due to Jensen's inequality: you want \mathbb{E}[f(x)], but you are computing f(\mathbb{E}[x]). You can consider putting the optimization into the model itself, as in here. You would need to add gradients for the optimizer using the implicit function theorem for any non-trivial problem.
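A quick numerical illustration of that point, with a toy quadratic f and fake draws rather than your actual model:

```python
import numpy as np

rng = np.random.default_rng(0)
# pretend these are posterior draws of a parameter
draws = rng.normal(loc=1.0, scale=0.5, size=10_000)

def f(theta):
    # any non-linear function, e.g. a quadratic loss
    return theta ** 2

print(f(draws).mean())   # E[f(theta)] ~= 1.25  (= 1.0**2 + 0.5**2)
print(f(draws.mean()))   # f(E[theta]) ~= 1.00  -- not the same thing
```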
If the model is essentially linear though, I would be looking for an approximate closed-form solution rather than running an optimizer.
I usually just use a thinned version of the posterior and then take the mean of all the losses
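Roughly like this (a toy sketch with a fake posterior and a placeholder loss, just to show the pattern; in practice the draws come from your trace):

```python
import numpy as np
import arviz as az
from scipy.optimize import minimize

# fake posterior standing in for your idata from pm.sample()
rng = np.random.default_rng(0)
idata = az.from_dict(posterior={
    "slope": rng.normal(2.0, 0.1, size=(4, 1000)),
    "intercept": rng.normal(1.0, 0.1, size=(4, 1000)),
})

# thin the posterior, e.g. keep every 10th draw
post = idata.posterior.thin({"draw": 10})
slopes = post["slope"].values.ravel()
intercepts = post["intercept"].values.ravel()

def loss(pred, target=3.0):
    # placeholder loss -- distance of the prediction from some target
    return (pred - target) ** 2

def expected_loss(x_new):
    # evaluate the loss for every thinned draw, then average
    preds = slopes * x_new + intercepts
    return np.mean(loss(preds))

res = minimize(expected_loss, x0=np.array([0.5]))
print(res.x)
```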
One can also use the mean of the observed variable: pymc/pymc/distributions/moments/means.py at main · pymc-devs/pymc · GitHub
We could probably add the variance if interested in considering that in the optimization
One could use the mean if the optimizer was already on the graph informing that mean, but not otherwise
It’s still the same point as above: for a non-linear function f, \mathbb{E}[f(x)] \neq f(\mathbb{E}[x]).
One could use the mean if the optimizer was already on the graph informing that mean, but not otherwise
I have no idea what an optimizer inside a graph informing a mean means.
What I was saying is: if you have, say, a LogNormal observation model and want to optimize the expected value, you can optimize the “analytical mean” (first moment) of that LogNormal, computed from the posterior draws of its parameters and whatever parameters you want to tune.
You don’t need to draw a bunch of posterior predictive samples just to take their mean.
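For example, something like this (a toy sketch with made-up posterior draws; the analytical mean of a LogNormal(mu, sigma) is exp(mu + sigma^2 / 2)):

```python
import numpy as np
from scipy.optimize import minimize_scalar

rng = np.random.default_rng(0)
# stand-ins for posterior draws of the LogNormal parameters;
# in practice take them from idata.posterior
slope = rng.normal(0.5, 0.05, size=2000)
intercept = rng.normal(-1.0, 0.05, size=2000)
sigma = np.abs(rng.normal(0.3, 0.02, size=2000))

def expected_y(x):
    # analytical mean of the LogNormal, averaged over the posterior draws --
    # no posterior predictive sampling needed
    mu = slope * x + intercept
    return np.mean(np.exp(mu + sigma ** 2 / 2))

# e.g. find the x in [0, 10] that maximizes the expected outcome
res = minimize_scalar(lambda x: -expected_y(x), bounds=(0, 10), method="bounded")
print(res.x, expected_y(res.x))
```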
Thank you for the links, everyone. It’s a bit hard for me to understand everything that’s happening there, but I will spend more time on it this week. It seems like @ricardoV94 explained my idea clearly in his last message (this is exactly what I thought I needed).
@jessegrabowski, could you please share an example of a non-linear function, just so I understand the difference compared to the linear case?