Weird thing about approximation.sample_node

The following code, taken directly from the tutorial on the variational inference API, runs fine:

from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import theano.tensor as tt
import pandas as pd
import theano
import pymc3 as pm

X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)

Xt = theano.shared(X_train)
yt = theano.shared(y_train)

with pm.Model() as iris_model:

    # Coefficients for features
    beta = pm.Normal('beta', 0, sd=1e2, shape=(4, 3))
    # Intercepts; the softmax below maps the linear predictor to class probabilities
    a = pm.Flat('a', shape=(3,))
    p = tt.nnet.softmax(Xt.dot(beta) + a)

    observed = pm.Categorical('obs', p=p, observed=yt)



with iris_model:

    # We'll use SVGD
    inference = pm.SVGD(n_particles=500, jitter=1)

    # Local reference to approximation
    approx = inference.approx

    # Here we need `more_replacements` to change train_set to test_set
    test_probs = approx.sample_node(p, more_replacements={Xt: X_test}, size=100)

    # For train set no more replacements needed
    train_probs = approx.sample_node(p)


test_ok = tt.eq(test_probs.argmax(-1), y_test)
train_ok = tt.eq(train_probs.argmax(-1), y_train)
test_accuracy = test_ok.mean(-1)
train_accuracy = train_ok.mean(-1)


eval_tracker = pm.callbacks.Tracker(
    test_accuracy=test_accuracy.eval,
    train_accuracy=train_accuracy.eval
)

inference.fit(100, callbacks=[eval_tracker])

If I understand correctly, the line train_probs = approx.sample_node(p) draws only a single sample for the training-accuracy calculation. However, if I change it to train_probs = approx.sample_node(p, size=100), I get the following error:

Traceback (most recent call last):
 File "tt.py", line 41, in <module>
   test_ok = tt.eq(test_probs.argmax(-1), y_test)
 File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/op.py", line 674, in __call__
   required = thunk()
 File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/op.py", line 862, in rval
   thunk()
 File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/cc.py", line 1735, in __call__
   reraise(exc_type, exc_value, exc_trace)
 File "<string>", line 3, in reraise
ValueError: Input dimension mis-match. (input[0].shape[1] = 112, input[1].shape[1] = 38)

112 and 38 are the sizes of the training and test sets, respectively. This seems to indicate that test_probs is now broken and that the replacement no longer works, even though that line was not touched. Why?

Moreover, if I add another line, dummy = approx.sample_node(p), below the new definition of train_probs, the error goes away. What is going on here?
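
For reference, here is a minimal sketch of the change, using the model and approx defined above (the name dummy is just a throwaway illustration):

with iris_model:
    # Requesting an explicit sample size here triggers the dimension
    # mis-match shown above...
    train_probs = approx.sample_node(p, size=100)

    # ...while adding this extra, otherwise unused node makes the error go away.
    dummy = approx.sample_node(p)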

see: https://github.com/pymc-devs/pymc3/issues/3130#issuecomment-409891412

Thanks, but I don’t think it’s the same issue. In my code, test_ok and train_ok are always defined outside the model context. The code from the tutorial runs fine; it’s only when I set size=100 that I get this error.

@ferrine Do you have any idea about this? It seems like a caching issue.

Hi! I’ll investigate today

The reason is the test value. Theano has a very tricky way of managing test values: it is very hard to track them when you do something like theano.clone, in which case the test value is not updated and you get an error. This issue is known to me, and I had some hard days fixing it internally in the VI module. It seems that not all corner cases are covered (I don’t think that is even possible). I suggest setting theano.config.compute_test_value = 'off' after you have defined the model.
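
A minimal sketch of that workaround, assuming the iris_model, approx, p, Xt, and X_test from the snippet above:

import theano

# Turn test values off only AFTER the model is defined, so PyMC3's shape
# checks still run during model construction; the theano.clone calls inside
# sample_node then no longer trip over stale test-value shapes.
theano.config.compute_test_value = 'off'

with iris_model:
    # With test values disabled, an explicit size works for both nodes.
    test_probs = approx.sample_node(p, more_replacements={Xt: X_test}, size=100)
    train_probs = approx.sample_node(p, size=100)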
