Minibatch when latent variable size depends on data dimension


#1

Hi, even after reading examples/posts, I’m probably not understanding how Minibatches work, so I’m hoping someone can straighten me out. Here is a simple toy model that is similar to something bigger I’m trying to fit (but where I run out of memory if I try to use the full data set at once). Goals include inferring sig_mu and also predicting mu for each trial (so I want 1000 mu predictions at the end).

import pymc3
import numpy as np
import theano

# Generate data
ntrials = 1000
sig_mu_true = 0.5
mu_true = np.random.normal(0, sig_mu_true, size=ntrials)
Y = mu_true + np.random.normal(0, 0.1, size=ntrials)

# Minibatch
ntrials_mb = 200
Y_mb = pymc3.Minibatch(Y, ntrials_mb)
Y_mb_shared = theano.shared(Y[:ntrials_mb])

# Set up model
model = pymc3.Model()

with model:
    sig_mu = pymc3.HalfNormal('sig_mu', sd=2.)
    mu = pymc3.Normal('mu', 0, sd=sig_mu, shape=ntrials_mb, total_size=ntrials)
    Y_obs = pymc3.Normal('Y_obs', mu=mu, sd=0.1, observed=Y_mb_shared, total_size=ntrials)

# Use ADVI
with model:
    approx = pymc3.fit(20000, more_replacements={Y_mb_shared:Y_mb})

This seems to converge but does not get sig_mu correct at all.
If I try trace = approx.sample(1000) I only get trace['mu'].shape as 200 (not 1000).
I also tried mu_trace = approx.sample_node(approx.model.mu,500,more_replacements={Y_mb_shared: Y}).eval() to no avail.

Is what I’m trying to do here possible? The problem seems to be partly that the shape of mu depends on the size of the data… but maybe it’s more than that, as I’m also not recovering sig_mu.

Thanks.


#2

Yeah, you can do it two ways. In both, mu keeps its full size (shape=ntrials) and is indexed with a minibatch of trial indices:

import numpy as np
import pymc3 as pm

# Generate data
ntrials = 1000
sig_mu_true = 0.5
mu_true = np.random.normal(0, sig_mu_true, size=ntrials)
X = np.arange(0, ntrials)  # trial indices, used to pick the matching entries of mu
Y = mu_true + np.random.normal(0, 0.1, size=ntrials)
# Minibatch
ntrials_mb = 200
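# X_mb and Y_mb must select the same trials on every step; as far as I know this
# relies on both Minibatch objects using the same (default) random_seed
Y_mb = pm.Minibatch(Y, ntrials_mb)
X_mb = pm.Minibatch(X, ntrials_mb)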

# Set up model
with pm.Model() as m:
    sig_mu = pm.HalfNormal('sig_mu', sd=2.)
    mu = pm.Normal('mu', 0, sd=sig_mu, shape=ntrials)
    Y_obs = pm.Normal('Y_obs', 
                      mu=mu[X_mb], 
                      sd=0.1, 
                      observed=Y_mb, 
                      total_size=ntrials)

# Use ADVI
with m:
    approx = pm.fit(40000, obj_n_mc=5)

or

import theano

# Minibatch (reuses X and Y from above)
ntrials_mb = 200
Y_mb = pm.Minibatch(Y, ntrials_mb)
X_mb = pm.Minibatch(X, ntrials_mb)
Y_shared = theano.shared(Y)
X_shared = theano.shared(X)

# Set up model
with pm.Model() as m2:
    sig_mu = pm.HalfNormal('sig_mu', sd=2.)
    mu = pm.Normal('mu', 0, sd=sig_mu, shape=ntrials)
    Y_obs = pm.Normal('Y_obs', 
                      mu=mu[X_shared], 
                      sd=0.1, 
                      observed=Y_shared, 
                      total_size=ntrials)

# Use ADVI
with m2:
    approx = pm.fit(50000, 
                    obj_n_mc=5, 
                    more_replacements={X_shared: X_mb,
                                       Y_shared: Y_mb})
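
To make the role of the index minibatch and of total_size a bit more concrete, here is a rough pure-Python sketch (no PyMC3 involved). It assumes the minibatch log-likelihood is simply rescaled by total_size / batch size, which is my understanding of what total_size does; all function names below are made up for illustration. It compares the full log-likelihood with (a) minibatch estimates where mu and Y are gathered with the same indices, and (b) an estimate where a fixed block of 200 latents is paired with a random batch of trials, as in the first post.

```python
import math
import random

random.seed(0)

# Toy data shaped like the thread's example: one latent mean per trial.
ntrials = 1000
ntrials_mb = 200
mu = [random.gauss(0.0, 0.5) for _ in range(ntrials)]  # stand-in for the latent values
Y = [m + random.gauss(0.0, 0.1) for m in mu]


def normal_logpdf(y, mean, sd):
    """Log density of Normal(mean, sd) at y."""
    return -0.5 * math.log(2.0 * math.pi * sd ** 2) - (y - mean) ** 2 / (2.0 * sd ** 2)


def full_loglik(mu, Y, sd=0.1):
    """Log-likelihood summed over every trial."""
    return sum(normal_logpdf(y, m, sd) for m, y in zip(mu, Y))


def aligned_minibatch_loglik(mu, Y, idx, total_size, sd=0.1):
    """Gather mu and Y with the same indices, then rescale by total_size / batch size."""
    batch = sum(normal_logpdf(Y[i], mu[i], sd) for i in idx)
    return batch * total_size / len(idx)


def misaligned_minibatch_loglik(mu, Y, idx, total_size, sd=0.1):
    """Pair a fixed block of latents mu[0..B-1] with whatever trials the batch contains."""
    batch = sum(normal_logpdf(Y[i], mu[k], sd) for k, i in enumerate(idx))
    return batch * total_size / len(idx)


def draw_indices():
    # Indices drawn with replacement, purely for simplicity in this sketch.
    return [random.randrange(ntrials) for _ in range(ntrials_mb)]


full = full_loglik(mu, Y)
estimates = [aligned_minibatch_loglik(mu, Y, draw_indices(), ntrials) for _ in range(2000)]
mean_estimate = sum(estimates) / len(estimates)
misaligned = misaligned_minibatch_loglik(mu, Y, draw_indices(), ntrials)

print("full log-likelihood:     ", round(full, 1))
print("mean aligned estimate:   ", round(mean_estimate, 1))
print("one misaligned estimate: ", round(misaligned, 1))
```

The aligned estimates should average out close to the full log-likelihood, while the misaligned one lands far below it. That would be consistent with sig_mu not being recovered when mu has shape ntrials_mb, though I have not checked this against PyMC3 internals.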

#3

Ah, thank you! I didn’t think of having a shared variable for the indices. 🙂 Will be trying this now.