Hello
I am fairly new to PyMC and I want to understand ADVI sampling.
Specifically, I want to understand what changes from one iteration to the next.
Example:
If I run the following line of code:
advi_fit[channel] = pm.fit(n=200000, method='advi', callbacks=[pm.callbacks.CheckParametersConvergence(every=1000, tolerance=0.001, diff='absolute')])
As mentioned, 200,000 iterations are run and an average loss is reported.
I want to understand what exactly changes from iteration to iteration.
I understand that learnings from previous iterations are carried forward, but I don't understand how they are used in the current iteration. Also, is there a default optimizer and learning rate?
I would really appreciate it if anybody could help me here.
Thank you
You are fitting normal distributions, parameterized by a mean and a standard deviation, to the latent variables in the model by minimizing the negative of the so-called “Evidence Lower Bound” (ELBO); that negative ELBO is the “average loss” reported during fitting. Each iteration draws a stochastic estimate of the ELBO gradient and takes an optimizer step that updates those means and standard deviations, so the variational parameters are what change from iteration to iteration. Here is a nice video introducing the key ideas.
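If you want to watch those parameters move, here is a minimal sketch using the Tracker callback (assuming `model` is your already-defined PyMC model; the iteration count is arbitrary):

```python
import pymc as pm

# Record the variational parameters at every iteration.
# ADVI's per-iteration state is one mean and one std per latent variable.
with model:  # `model` is assumed to already exist
    advi = pm.ADVI()
    tracker = pm.callbacks.Tracker(
        mean=advi.approx.mean.eval,  # callable returning the current means
        std=advi.approx.std.eval,    # callable returning the current stds
    )
    approx = advi.fit(20000, callbacks=[tracker])

# tracker["mean"] and tracker["std"] hold one snapshot per iteration.
```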
The default optimizer is pm.adagrad_window. Its default learning rate is 1e-3. These settings are admittedly not easy to find; you have to dig into the source code somewhat.
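You can also pass the optimizer explicitly via the obj_optimizer argument; here is a sketch that just restates the defaults (to my knowledge, learning_rate=1e-3 and window size n_win=10 are the defaults, and `model` is assumed to exist):

```python
import pymc as pm

# Make the default optimizer settings explicit so they can be tuned.
with model:  # `model` is assumed to already exist
    approx = pm.fit(
        n=200000,
        method="advi",
        # adagrad_window is the default; these arguments restate its defaults
        obj_optimizer=pm.adagrad_window(learning_rate=1e-3, n_win=10),
    )
```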
Doc PRs very welcome
Thank you for replying. This did help us, but I have a few doubts.
- We are working with Bayesian sampling, and we want to check whether there are any hyperparameters we can tune. If yes, how can we log and print them (other than the ELBO)?
- Is there a way I can log and print the learning rate and the effective learning rate? AdaGrad's update is w_t = w_{t-1} - (learning_rate / (sqrt(G_t) + epsilon)) * gradient, so can we log and print the effective step size learning_rate / (sqrt(G_t) + epsilon)? (See the standalone sketch after this list for what I mean.)
- I was looking into the source code and came across the line “This function maximizes the evidence lower bound (ELBO) :math:`{\cal L}(\gamma, \nu, \eta)`”. Are these parameters tunable, and can we log and print them? (pymc.variational.inference — PyMC dev documentation)
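For reference, here is a standalone numpy sketch of the vanilla AdaGrad rule from the second bullet. Note that PyMC's adagrad_window accumulates squared gradients over a recent window rather than the full history, and the constants below are purely illustrative:

```python
import numpy as np

# Illustrative vanilla AdaGrad: the per-parameter "effective learning rate"
# is learning_rate / (sqrt(G_t) + epsilon), where G_t sums squared gradients.
learning_rate, epsilon = 1e-3, 1e-8  # illustrative values, not PyMC's defaults
w = np.zeros(3)        # parameters
G = np.zeros_like(w)   # running sum of squared gradients

rng = np.random.default_rng(0)
for t in range(5):
    grad = rng.standard_normal(3)  # stand-in for the ELBO gradient
    G += grad**2
    effective_lr = learning_rate / (np.sqrt(G) + epsilon)
    print(f"step {t}: effective learning rate = {effective_lr}")
    w -= effective_lr * grad
```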
Thank you