I am currently experimenting with ADVI and minibatches. I tried different minibatch sizes and I get very different results. I keep the n parameter in pm.fit fixed at 10000 and the draws argument of sample at 10000 as well:
inference = pm.ADVI()
approx = pm.fit(method=inference, n=10000)
approx.sample(draws=10000)
If I use minibatches of size 100, the fit finishes in 37 s, but when I test on unseen data (with PPC) I get an R-squared of 76%, whereas I usually get 97-98% using normal ADVI without minibatches. Gradually increasing the batch size seems to help, as does increasing the n parameter of pm.fit. Where could I get some intuition for setting those parameters to sensible values?
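To make the trade-off concrete, here is a minimal sketch of why the batch size and the number of iterations interact. It is not PyMC code: it is a toy, standard-library-only illustration, and the Gaussian data, the helper name minibatch_estimate_spread, and the chosen batch sizes are all assumptions made up for this example. It rescales a minibatch log-likelihood by total_size / batch_size (the same kind of rescaling that, as far as I understand, the total_size argument is meant to provide in PyMC3) and measures how noisy that estimate is.

```python
import random
import statistics


def minibatch_estimate_spread(data, batch_size, n_repeats=1000, seed=0):
    """Std. dev. of the rescaled minibatch estimate of the full-data log-likelihood.

    Hypothetical helper for illustration only; not part of any library.
    """
    rng = random.Random(seed)
    total_size = len(data)
    estimates = []
    for _ in range(n_repeats):
        # Draw one minibatch without replacement.
        batch = rng.sample(data, batch_size)
        # Gaussian log-density (up to a constant) at mu=0, sigma=1.
        batch_loglik = sum(-0.5 * x * x for x in batch)
        # Rescale so the batch sum estimates the sum over the full data set.
        estimates.append(total_size / batch_size * batch_loglik)
    return statistics.stdev(estimates)


# Toy data set: 5000 draws from a standard normal (an assumption of this sketch).
data_rng = random.Random(42)
data = [data_rng.gauss(0.0, 1.0) for _ in range(5000)]

spread = {b: minibatch_estimate_spread(data, b) for b in (100, 400, 1600)}
for b, s in spread.items():
    print(f"batch_size={b:5d}  std of estimate={s:.1f}")
```

In this toy setting, quadrupling the batch size roughly halves the noise of the estimate. Under the assumption that the same holds for the ELBO gradient, small minibatches give cheap but noisy steps, so they would need more iterations (a larger n) to reach the same quality of fit.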
Thanks,