The basic state of my model is, for some M by N matrix A:
```python
x = pm.Normal('x', 0, 1, shape=N)
combined = pm.math.dot(A, x)
scaled = pm.math.sigmoid(combined)
outcomes = pm.Bernoulli('outcome', scaled, observed=Y)
pm.fit(n=50000)
```
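For reference, the shapes in the deterministic part work out like this in plain NumPy (a toy sketch; the sizes and simulated data here are made up for illustration):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N = 10000, 50                       # made-up sizes for illustration
A = rng.normal(size=(M, N))
x = rng.normal(size=N)                 # plays the role of the latent vector

combined = A @ x                       # shape (M,): one value per row of A
scaled = 1 / (1 + np.exp(-combined))   # sigmoid, one probability per row
Y = rng.binomial(1, scaled)            # simulated outcomes, shape (M,)
```

So `scaled` and `Y` are both length-M vectors, matched row for row.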
This works well as-is; however, inference is getting extremely slow as my data grows. I saw the `pm.Minibatch` class and some documentation around it, and it looked promising. However, if I change the model to
```python
Y_batch = pm.Minibatch(Y, batch_size=100)
x = pm.Normal('x', 0, 1, shape=N)
combined = pm.math.dot(A, x)
scaled = pm.math.sigmoid(combined)
outcomes = pm.Bernoulli('outcome', scaled, observed=Y_batch, total_size=Y.shape)
pm.fit(n=50000)
```
I get the error
```
Input dimension mis-match. (input.shape = 100, input.shape = 10000)
```
I'm guessing this is because the observed RV's probability, `scaled`, is not a scalar that we've observed many times, but rather a vector with a one-to-one correspondence to the rows of `Y`. Is there any way to iterate through this in minibatches?
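To make the mismatch concrete: with only `Y` minibatched, the likelihood sees a probability vector that is still length M alongside observations of length 100. A NumPy sketch of the two shapes (sizes made up to match the error message):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, batch = 10000, 50, 100           # made-up sizes matching the error
A = rng.normal(size=(M, N))
x = rng.normal(size=N)

scaled = 1 / (1 + np.exp(-(A @ x)))    # still length M -- A was not minibatched
Y = rng.binomial(1, 0.5, size=M)
Y_batch = Y[:batch]                    # roughly what pm.Minibatch hands the likelihood

# These lengths (10000 vs 100) don't broadcast, which looks like
# exactly the "input.shape = 100, input.shape = 10000" error above.
assert scaled.shape[0] != Y_batch.shape[0]
```

My guess is that the rows of `A` need to be sliced together with `Y`, something like `A_batch, Y_batch = pm.Minibatch(A, Y, batch_size=100)` followed by `combined = pm.math.dot(A_batch, x)`, but I haven't been able to confirm that this is the right pattern.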
Edit: when testing with larger datasets, I am also sometimes getting the error
```
The current approximation of RV `x`.ravel() is NaN.
```

at every index of `x`.