# Variational inference over cartesian product of large sets of observations

I have 2 large sets of observations and I would like to do variational inference over the cartesian product of these sets. How do I use `pymc3.Minibatch` to get representative samples?

For example, suppose the observations are vectors and I want to model the distribution of the dot product of samples from the 2 sets.

Something like:

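```python
import pymc3 as pm

with pm.Model() as model:
    # Each Minibatch draws 100 rows per step
    A = pm.Minibatch(a, 100)
    B = pm.Minibatch(b, 100)
    C = pm.Deterministic('C', A.dot(B))
    # Assumption: C is meant to be the observed data for N
    N = pm.Normal('N', 0, 100, observed=C)
    fit = pm.fit()
```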

except I do not think the above will sample fairly from the cartesian product of `a` and `b`.

How do I do something like the above but sampling uniformly over the cartesian product of `a` and `b`?

What do you mean by:

The minibatches are kept in sync across the different inputs, so you should be fine doing this. Also, since `a` and `b` are observed, you can compute `C` first and do the minibatch on `C`.

Thanks! `a` and `b` are too large to precompute `C`, particularly since the function I actually need to compute returns high-dimensional vectors.
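To make the constraint concrete, here is a minimal sketch of drawing pairs uniformly from the cartesian product without ever building it. It uses only the standard library, not PyMC3; `sample_pairs`, `dot`, and the toy data are hypothetical names made up for illustration. The idea is to draw a flat index in `[0, len(a) * len(b))` and split it into a row index for `a` and a row index for `b`, then evaluate the function only on the sampled pairs.

```python
import random

def sample_pairs(n_a, n_b, batch_size, rng):
    """Draw index pairs uniformly from range(n_a) x range(n_b)."""
    pairs = []
    for _ in range(batch_size):
        # One flat index over the whole product, never materialized
        k = rng.randrange(n_a * n_b)
        # divmod maps the flat index back to an (i, j) pair
        pairs.append(divmod(k, n_b))
    return pairs

def dot(u, v):
    """Plain dot product of two equal-length sequences."""
    return sum(x * y for x, y in zip(u, v))

# Toy stand-ins for the two large observation sets
a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [[2.0, 3.0], [4.0, 5.0]]

rng = random.Random(0)
pairs = sample_pairs(len(a), len(b), 4, rng)
# Only the sampled pairs are evaluated
batch = [dot(a[i], b[j]) for i, j in pairs]
```

Drawing the flat index this way is equivalent to drawing `i` and `j` independently and uniformly, so two independent index streams would give the same distribution over pairs.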

Also, do I need to give the 2 Minibatches different seeds? It looks like by default all minibatches initialize their seed to 42. Could that cause problems?

I don't think you would want to set different seeds, as I understand it you would want the minibatches to be in sync.

I do not think I want them in sync. I want it possible for any item in `a` to pair with any item in `b` with equal probability. If they are in sync then I think most pairings can never happen.
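A small standard-library sketch of that concern, under stated assumptions: it is a stand-in for two index streams, not PyMC3's actual `Minibatch` implementation, and it assumes both sets have the same size `n`. `paired_indices` is a hypothetical helper written for this illustration. It collects which `(i, j)` pairs ever occur when the two streams share a seed and when they do not.

```python
import random

def paired_indices(n, batch_size, seed_a, seed_b, steps):
    """Collect the (i, j) pairs produced by two seeded index streams."""
    rng_a = random.Random(seed_a)
    rng_b = random.Random(seed_b)
    pairs = set()
    for _ in range(steps):
        idx_a = [rng_a.randrange(n) for _ in range(batch_size)]
        idx_b = [rng_b.randrange(n) for _ in range(batch_size)]
        # Row k of one batch is paired with row k of the other
        pairs.update(zip(idx_a, idx_b))
    return pairs

n = 20
synced = paired_indices(n, 10, seed_a=42, seed_b=42, steps=200)
independent = paired_indices(n, 10, seed_a=42, seed_b=43, steps=200)

# Same seed: both streams emit identical indices, so only (i, i) occurs
print(all(i == j for i, j in synced))
print(len(synced), "distinct pairs out of", n * n)
print(len(independent), "distinct pairs out of", n * n)
```

In this toy setup the synced streams can reach at most `n` of the `n * n` possible pairs, while the independently seeded streams reach far more of them.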