PPC with Minibatch ADVI


#1

Hi there,

Following up on Issue 2190, I have another question (thank you very much for your first responses, by the way!):

If I took the example code exactly as it is and wanted to use minibatches, I would first create shared variables for my model like this:

ann_input = theano.shared(X_train[:50, ...].astype(np.float32))
ann_output = theano.shared(Y_train[:50, ...].astype(np.float32))

Next, I would create the minibatch variables like this:

minibatch_x = pm.Minibatch(X_train, batch_size=50, dtype='float32')
minibatch_y = pm.Minibatch(Y_train, batch_size=50, dtype='float32')

and my run_advi method would look like this:

def run_advi(advi_iters=3000):
    # Inference method
    inference = pm.ADVI()

    # Approximation
    approx = pm.fit(advi_iters, method=inference,
                    more_replacements={ann_input: minibatch_x, ann_output: minibatch_y})

This way I can later change my input_var to be whatever I want as @ferrine pointed out.

The problem is that I can’t simply replace ann_input with minibatch_x, because they have different types: ann_input is a Theano variable of type CudaNdarrayType, while minibatch_x is a Minibatch of type TensorType.

As far as I understand minibatches (with the help of this instruction), the value of the minibatch is stored in minibatch_x.minibatch. So one option would be to replace ann_input with minibatch_x.minibatch.tag.test_value, although I highly doubt that this is the right solution.
In fact, the variational inference runs, but the results are much worse than without minibatches.
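
One plausible reading of the worse results (an assumption, not something confirmed in this thread) is that tag.test_value is a fixed NumPy array, so every ADVI step would see the same 50 rows instead of a fresh minibatch. The NumPy-only sketch below illustrates that effect on a toy problem (estimating a mean by gradient descent) rather than on the actual network; `sgd_mean`, `data`, and the other names are hypothetical and not part of PyMC3 or Theano.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(loc=3.0, scale=2.0, size=10_000)  # hypothetical stand-in for X_train
batch_size = 50


def sgd_mean(next_batch, steps=3000, lr=0.01):
    """Estimate the mean by gradient descent on squared error, one batch per step."""
    mu = 0.0
    for _ in range(steps):
        batch = next_batch()
        # Gradient of mean((mu - x)^2) with respect to mu is 2 * (mu - mean(x)).
        mu -= lr * 2.0 * (mu - batch.mean())
    return mu


# Case 1: the same 50 rows at every step (analogous to using a fixed test_value).
fixed_batch = data[:batch_size]
mu_fixed = sgd_mean(lambda: fixed_batch)

# Case 2: a fresh random batch at every step (what a real minibatch provides).
mu_random = sgd_mean(lambda: data[rng.integers(0, data.size, size=batch_size)])

print("full-data mean:      ", data.mean())
print("fixed-batch estimate:", mu_fixed, "(equals the mean of those 50 rows)")
print("random-batch estimate:", mu_random)
```

With the fixed batch, the estimate can only ever recover the statistics of those 50 rows, while the random batches average over the whole data set across steps.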

Can you help me understand this issue?
I think I’m almost there but can’t figure out the last part of the solution.

Thank you very much in advance!


#2

You can try applying some operation to the CudaNdarrayType variable, for example theano.compile.view_op, which will change the type so that you can hopefully do the replacements. I have not tested it yet, but it seems like a valid workaround.


#3

Thank you very much for your answer! Indeed your suggestion does the trick!


#4

@Pflip
What does your final code look like? I am trying to do the same thing, but I am not sure I understand the casting that @ferrine was talking about. Could you post a sample code snippet? Thank you.

Here is where I am:

import lasagne
import pymc3 as pm
from scipy.stats import mode

# Minibatch variables
minibatch_x = pm.Minibatch(X_train, batch_size=500, dtype='float64')
minibatch_y = pm.Minibatch(y_train, batch_size=500, dtype='float64')

def build_ann(init, in_var, out_var):
    with pm.Model() as cnn:
        network = lasagne.layers.InputLayer(shape=(None, 1, 28, 28), input_var=in_var)
        network = lasagne.layers.DenseLayer(network, num_units=25,
                                            nonlinearity=lasagne.nonlinearities.tanh,
                                            b=init, W=init)
        network = lasagne.layers.DenseLayer(network, num_units=25,
                                            nonlinearity=lasagne.nonlinearities.tanh,
                                            b=init, W=init)
        # Final layer of 10 units, softmax across 10 labels
        network = lasagne.layers.DenseLayer(network, num_units=10,
                                            nonlinearity=lasagne.nonlinearities.softmax,
                                            b=init, W=init)
        prediction = lasagne.layers.get_output(network)
        # Categorical likelihood over the 10 labels
        out = pm.Categorical('out', p=prediction, observed=out_var, total_size=y_train.shape[0])
    return cnn
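
A side note on total_size in the likelihood above: as far as I know, it makes PyMC3 rescale the minibatch log-likelihood by total_size / batch_size, so that one batch stands in for the whole data set. The NumPy-only sketch below illustrates that rescaling with made-up per-observation log-likelihoods; every name in it is hypothetical and nothing here calls PyMC3.

```python
import numpy as np

rng = np.random.default_rng(1)
N, batch_size = 10_000, 500

# Hypothetical per-observation log-likelihood values for the full data set.
log_lik = rng.normal(loc=-1.0, scale=0.1, size=N)
full = log_lik.sum()

# One random minibatch of observations.
idx = rng.choice(N, size=batch_size, replace=False)
unscaled = log_lik[idx].sum()           # what the batch alone contributes
scaled = (N / batch_size) * unscaled    # rescaled to the size of the full data set

print("full log-likelihood:   ", full)
print("unscaled batch sum:    ", unscaled)
print("rescaled batch estimate:", scaled)
```

Without the rescaling, the likelihood term would be roughly N / batch_size times too weak relative to the prior.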

# This needs to be corrected! The minibatches should not be passed in here.
cnn = build_ann(GaussianWeights(), minibatch_x, minibatch_y)

with cnn:
    # Needs to be corrected! How do I supply more_replacements?
    approx = pm.fit(50, method='advi',
                    # more_replacements={in_var: minibatch_x, out_var: minibatch_y}
                    )
    trace = approx.sample(draws=620)

    # Predict on test data
    # (input_var and target_var are not defined yet; they are meant to be the shared variables.
    # They have to be set before sampling the PPC, otherwise the prediction uses the training data.)
    input_var.set_value(X_test)
    target_var.set_value(y_test)
    ppc = pm.sample_ppc(trace, samples=100)
    y_pred = mode(ppc['out'], axis=0).mode[0, :]
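
For reference, the last line takes the most frequent sampled label per test point across the posterior predictive draws. A minimal NumPy-only sketch of that majority vote, assuming ppc['out'] has shape (n_draws, n_obs) and holds integer labels; `majority_vote` and the toy arrays are hypothetical names used only for illustration:

```python
import numpy as np


def majority_vote(samples, n_classes):
    """Return the most frequent label in each column of an (n_draws, n_obs) array."""
    # counts[k, j] = number of draws in which observation j was assigned label k
    counts = np.stack([(samples == k).sum(axis=0) for k in range(n_classes)])
    # argmax breaks ties in favour of the smallest label
    return counts.argmax(axis=0)


# Toy stand-in for ppc['out']: 4 draws, 3 observations, labels in {0, 1, 2}.
fake_ppc = np.array([[0, 2, 1],
                     [0, 2, 2],
                     [1, 2, 2],
                     [0, 1, 2]])
y_true_demo = np.array([0, 2, 1])

y_pred_demo = majority_vote(fake_ppc, n_classes=3)
accuracy = (y_pred_demo == y_true_demo).mean()
print(y_pred_demo, accuracy)
```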