I am using a Minibatch of size 50 to speed up ADVI optimization. It works well when the observed variable is defined without the total_size argument. For example, the following works:
obs = pm.Poisson('obs', lambdas, observed=data_out)
But when I add total_size:
obs = pm.Poisson('obs', lambdas, observed=data_out, total_size=data.shape)
I get an error saying that lambda is too large for the Poisson distribution. The full data set has 20,000 observations. What is the right value for total_size in this case?
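For context, my understanding is that total_size only rescales the minibatch log-likelihood by N / batch_size, so that it estimates the log-likelihood of the full data. Below is a pure-Python sketch of that idea, not PyMC code. N and the batch size come from my setup; the counts, the rate `lam`, and the helper names are made up for illustration.

```python
import math
import random


def poisson_logpmf(k, lam):
    # log P(K = k) for a Poisson distribution with rate lam
    return k * math.log(lam) - lam - math.lgamma(k + 1)


random.seed(0)
N, batch_size = 20_000, 50   # sizes from the question
lam = 3.0                    # hypothetical rate, for illustration only

# Made-up counts standing in for the real observations
data = [random.randint(0, 6) for _ in range(N)]

# Log-likelihood of the full data set
full_loglik = sum(poisson_logpmf(k, lam) for k in data)


def scaled_minibatch_loglik(batch):
    # Minibatch log-likelihood rescaled by N / batch_size
    return (N / len(batch)) * sum(poisson_logpmf(k, lam) for k in batch)


# Averaged over many random minibatches, the rescaled value
# should be close to the full-data log-likelihood.
estimates = [
    scaled_minibatch_loglik(random.sample(data, batch_size))
    for _ in range(2000)
]
mean_estimate = sum(estimates) / len(estimates)

print(full_loglik, mean_estimate)
```

If that understanding is right, I would expect total_size to change only the scale of the likelihood term and not the values of lambdas themselves, which is why the error surprises me.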
Also, how can I get posterior predictions for the full data set after fitting with a minibatch? I am using the following commands:
with mod:
    inference = pm.ADVI()
    tracker = pm.callbacks.Tracker(
        mean=inference.approx.mean.eval,  # callable that returns mean
        std=inference.approx.std.eval,  # callable that returns std
    )
    approx = pm.fit(
        n=2000,
        callbacks=[tracker, pm.callbacks.CheckParametersConvergence(tolerance=1e-4)],
        method=inference,
        obj_optimizer=pm.adam(learning_rate=0.05),
    )
    idata = approx.sample(1000)
with mod:
    posterior = pm.sample_posterior_predictive(idata, extend_inferencedata=True)
post_pred = az.extract(posterior, 'posterior_predictive').obs
post_pred has the size of the minibatch rather than the shape of the original data. How can I get posterior predictions for the original data?