Variational API: meaning of parameters

I’m trying to understand what the different fitting parameters mean, as it’s not completely clear from the API reference. Could anyone explain a bit, please?
Namely, I’m confused by the existence of two different optimizers, obj_optimizer and test_optimizer. Which one should I set to modify, e.g., the optimization learning rate? And overall, the whole concept of the test function is not clear to me: why is the objective alone not enough?
Another thing is obj_n_mc and tf_n_mc: why do we need Monte Carlo to approximate the gradient of the objective function? Since everything is written as Theano functions, it should be automatically differentiable, right?

Hi, I was guided by the OPVI paper (arXiv) when implementing and unifying VI in PyMC3. This theoretical framework assumes an objective function that is minimized and a test function that is maximized. The authors also proposed a novel approach for implicit VI, but I did not implement it, as I did not find it that promising. I believe the framework is very useful, so I decided to keep it as is, leaving free space for new methods in the future. One can pick up OPVI and implement their own approach; I hope there is enough flexibility.

obj_n_mc – number of Monte Carlo samples used to estimate the objective function gradient
tf_n_mc – number of Monte Carlo samples used to estimate the test function gradient
test_optimizer – optimizer for the test function
obj_optimizer – optimizer for the objective function
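In practice, all of these are passed as keyword arguments to fit. A minimal sketch, assuming the pm.fit / pm.adam interface of PyMC3 (the model and the particular values are just for illustration):

```python
import pymc3 as pm

with pm.Model() as model:
    mu = pm.Normal('mu', 0., 1.)
    pm.Normal('obs', mu, 1., observed=[0.1, -0.3, 0.2])

    # ADVI minimizes the negative ELBO; obj_optimizer controls its updates.
    approx = pm.fit(
        n=10000,
        method='advi',
        obj_optimizer=pm.adam(learning_rate=1e-2),  # step rule for the objective
        obj_n_mc=10,  # Monte Carlo samples per gradient estimate
    )
```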

All the KL-based methods we have rely on the objective function itself; for them the test function is the identity and thus is not involved in VI.
By the way, we have one method that I could reformulate as an OPVI special case: the recently proposed Stein variational gradient descent (arXiv), which has a nonparametric test function. Since that test function has an analytical optimum, it does not require any optimization.
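For completeness, a hypothetical call would look something like this (SVGD is selected via the method argument; the particle count is purely illustrative):

```python
with model:
    # SVGD updates a set of particles by an analytical (kernelized) rule,
    # so no test_optimizer is involved.
    approx = pm.fit(n=1000, method='svgd', inf_kwargs=dict(n_particles=100))
```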

Finally, we do not yet need a parametric test function in PyMC3, but this may change in the future.


So this means I only need to change obj_optimizer to adjust, e.g., the learning rate, right?
And as for obj_n_mc: I saw the definition “number of Monte Carlo samples to estimate the objective function gradient” in the reference, but it does not shed much light. I mean, why do we need to estimate the gradient at all? Theano provides the exact expression for it.

Variational inference uses the ELBO as the objective function, which is an expectation over some space. Computing that expectation is difficult because it is a high-dimensional integral. The reparameterization trick, as used in PyMC3 and other mainstream packages, relies on samples to compute the expectation, and on the (sample) expectation of the gradient as a substitute for the gradient of the expectation. You can find good references in https://arxiv.org/pdf/1610.02287.pdf (see equations 1 and 3) and a more recent paper, https://arxiv.org/pdf/1805.08498.pdf (also see eq. 1 and 3).
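To make this concrete, here is a toy sketch in plain NumPy (not PyMC3 internals) of a reparameterized Monte Carlo gradient estimate for a single Gaussian q; the integrand f and all values are illustrative, and n_mc plays the role of obj_n_mc:

```python
import numpy as np

# Estimate d/d(mu, log_sigma) E_{q(z)}[f(z)] with q(z) = N(mu, sigma^2)
# by writing z = mu + sigma * eps, eps ~ N(0, 1), and averaging over samples.

def f(z):
    return z ** 2          # any differentiable integrand


def grad_f(z):
    return 2 * z


def reparam_grad(mu, log_sigma, n_mc=10, seed=0):
    rng = np.random.default_rng(seed)
    sigma = np.exp(log_sigma)
    eps = rng.standard_normal(n_mc)                  # n_mc Monte Carlo samples
    z = mu + sigma * eps                             # reparameterized samples
    d_mu = grad_f(z).mean()                          # df/dz * dz/dmu, averaged
    d_log_sigma = (grad_f(z) * sigma * eps).mean()   # chain rule through z
    return d_mu, d_log_sigma


# For f(z) = z^2, E[f] = mu^2 + sigma^2, so the exact gradients at
# mu=0.5, sigma=1 are (1.0, 2.0); the estimate approaches them as n_mc grows.
print(reparam_grad(0.5, np.log(1.0), n_mc=10000))
```

So the gradient itself is still exact per sample (Theano differentiates the reparameterized expression); what is approximated by Monte Carlo is the expectation of that gradient, and obj_n_mc controls how many samples go into that average.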

Oh, now I see, thank you!