Hi! I have a regression model which I fit using ADVI (NUTS sampling is extremely slow), and I would like to try minibatches because the data size is ~10^5–10^6. With plain ADVI the fit reaches reasonable convergence in several hours.
The model looks like this:
```python
with pm.Model() as model:
    G0_amp = pm.Lognormal('G0_amp', 0, 1, shape=(n_1, n_2, n_3, n_4))
    G0_phas = pm.VonMises('G0_phas', mu=0, kappa=1/1**2, shape=(n_1, n_2, n_3, n_4))
    gain_amp = G0_amp[ix1[:, 0], ix2, :, :] * G0_amp[ix1[:, 1], ix2, :, :]
    gain_phas = G0_phas[ix1[:, 0], ix2, :, :] - G0_phas[ix1[:, 1], ix2, :, :]
    # ... computing mu from gain_amp and gain_phas ...
    obs_r = pm.StudentT('obs_r', mu=mu[~mask], sd=sd, nu=2, observed=data[~mask].real)
    obs_i = pm.StudentT('obs_i', mu=mu[~mask], sd=sd, nu=2, observed=data[~mask].imag)
```
So, the inputs to minibatch are ix1, ix2, data, and mask (n*n_3*n_4 for both). The above code works OK, but I'd like to speed it up by minibatching across the first (observation) dimension.
Trying the very intuitive code:
```python
bs = 1000
ix1_mb = pm.Minibatch(ix1, batch_size=bs)
ix2_mb = pm.Minibatch(ix2, batch_size=bs)
data_r_mb = pm.Minibatch(data.real, batch_size=bs)
data_i_mb = pm.Minibatch(data.imag, batch_size=bs)
mask_mb = pm.Minibatch(mask, batch_size=bs)
# replace everywhere in the model <param> with <param>_mb
```
doesn’t seem to work: it gives a much higher loss, and the resulting trace is very different from that of the first model. My guess is that each pm.Minibatch variable draws its own independent random slice, so the minibatched ix1, ix2, data, and mask no longer correspond to the same rows, but this may be wrong — I’m not sure.
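To illustrate what I suspect is happening, here is a toy numpy sketch (stand-in arrays, not my real model) of independent random slicing, where each array gets its own index draw per evaluation:

```python
import numpy as np

# Two independent random streams, mimicking separate per-variable slicing.
rng_a = np.random.default_rng(1)
rng_b = np.random.default_rng(2)

n, bs = 10_000, 1000
ix1 = np.arange(n)        # stand-in for my ix1 column
data = np.arange(n) * 10  # stand-in for data; row k belongs with ix1[k]

idx_a = rng_a.integers(0, n, size=bs)  # slice used for ix1_mb
idx_b = rng_b.integers(0, n, size=bs)  # slice used for data_mb

ix1_mb, data_mb = ix1[idx_a], data[idx_b]

# Aligned minibatching would keep data_mb == ix1_mb * 10 row by row;
# with independent slices almost no rows still correspond.
matches = np.mean(data_mb == ix1_mb * 10)
print(matches)  # close to 0 rather than 1.0
```

If this is the right picture, the likelihood is being evaluated on data rows paired with the wrong gain indices, which would explain the much higher loss.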
What should be the correct way to do this?
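For reference, this numpy sketch shows the behaviour I am after: one shared random index set per step applied to all arrays so the rows stay aligned (the helper name is mine, not a PyMC API):

```python
import numpy as np

rng = np.random.default_rng(0)
n, bs = 10_000, 1000

ix1 = np.arange(n)
data = ix1 * 10  # row k of data corresponds to row k of ix1

def shared_minibatch(rng, bs, *arrays):
    """Draw ONE index set and slice every array with it, keeping rows aligned."""
    idx = rng.integers(0, arrays[0].shape[0], size=bs)
    return tuple(a[idx] for a in arrays)

ix1_mb, data_mb = shared_minibatch(rng, bs, ix1, data)
print(np.all(data_mb == ix1_mb * 10))  # True: rows still correspond
```

Is there an idiomatic way to get this shared-slice behaviour with pm.Minibatch, or should the slicing be done differently?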