The following code, taken directly from the tutorial on the variational inference API, runs fine:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
import theano.tensor as tt
import pandas as pd
import theano
import pymc3 as pm
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y)
Xt = theano.shared(X_train)
yt = theano.shared(y_train)
with pm.Model() as iris_model:
    # Coefficients for features
    beta = pm.Normal('beta', 0, sd=1e2, shape=(4, 3))
    # Transform to unit interval
    a = pm.Flat('a', shape=(3,))
    p = tt.nnet.softmax(Xt.dot(beta) + a)
    observed = pm.Categorical('obs', p=p, observed=yt)
with iris_model:
    # We'll use SVGD
    inference = pm.SVGD(n_particles=500, jitter=1)
    # Local reference to approximation
    approx = inference.approx
    # Here we need `more_replacements` to change train_set to test_set
    test_probs = approx.sample_node(p, more_replacements={Xt: X_test}, size=100)
    # For train set no more replacements needed
    train_probs = approx.sample_node(p)
test_ok = tt.eq(test_probs.argmax(-1), y_test)
train_ok = tt.eq(train_probs.argmax(-1), y_train)
test_accuracy = test_ok.mean(-1)
train_accuracy = train_ok.mean(-1)
eval_tracker = pm.callbacks.Tracker(
    test_accuracy=test_accuracy.eval,
    train_accuracy=train_accuracy.eval
)
inference.fit(100, callbacks=[eval_tracker]);
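As a rough illustration of the shapes involved in the accuracy computation, here is a plain NumPy sketch. It is not PyMC3 code: the random arrays are hypothetical stand-ins for the sampled probabilities, and it assumes that sample_node(p, size=100) adds a leading sample axis of length 100 while sample_node(p) returns a single draw without that axis. The sizes 112 and 38 are the training and test set sizes from this run.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train, n_test, n_classes, n_samples = 112, 38, 3, 100

# Hypothetical stand-ins for the labels and the sampled class probabilities
y_train = rng.integers(0, n_classes, size=n_train)
y_test = rng.integers(0, n_classes, size=n_test)
train_probs = rng.dirichlet(np.ones(n_classes), size=n_train)              # (112, 3)
test_probs = rng.dirichlet(np.ones(n_classes), size=(n_samples, n_test))   # (100, 38, 3)

# argmax(-1) picks the most probable class; the comparison broadcasts over draws
train_ok = train_probs.argmax(-1) == y_train   # (112,)
test_ok = test_probs.argmax(-1) == y_test      # (100, 38)

# mean(-1) averages over data points, giving one accuracy per draw
train_accuracy = train_ok.mean(-1)   # scalar: a single draw
test_accuracy = test_ok.mean(-1)     # (100,): one value per draw

print(train_ok.shape, test_ok.shape, np.shape(train_accuracy), test_accuracy.shape)
```

Under these assumptions the comparison only works when the last axis of the predictions has the same length as the label vector it is compared against.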
If I understand correctly, the line
train_probs = approx.sample_node(p)
draws only a single sample for the training accuracy calculation. However, if I change it to
train_probs = approx.sample_node(p, size=100)
I get the following error:
Traceback (most recent call last):
  File "tt.py", line 41, in <module>
    test_ok = tt.eq(test_probs.argmax(-1), y_test)
  File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/op.py", line 674, in __call__
    required = thunk()
  File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/op.py", line 862, in rval
    thunk()
  File "/cluster/zeng/code/research/software/miniconda/lib/python2.7/site-packages/theano/gof/cc.py", line 1735, in __call__
    reraise(exc_type, exc_value, exc_trace)
  File "<string>", line 3, in reraise
ValueError: Input dimension mis-match. (input[0].shape[1] = 112, input[1].shape[1] = 38)
112 and 38 are the sizes of the training and test sets, respectively. So the error seems to indicate that test_probs is now broken and the replacement no longer works, even though I didn’t touch that line. Why?
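The shape mismatch in the error message can be reproduced outside Theano. The following NumPy sketch only illustrates the reported shapes and is not a diagnosis: it assumes the left operand of the comparison has shape (100, 112), which is what the message suggests, and the array names and contents are hypothetical.

```python
import numpy as np

n_train, n_test, n_samples = 112, 38, 100

# Hypothetical predicted labels, one row per draw
preds_on_train = np.zeros((n_samples, n_train), dtype=int)  # (100, 112)
preds_on_test = np.zeros((n_samples, n_test), dtype=int)    # (100, 38)
y_test = np.zeros(n_test, dtype=int)                        # (38,)

# Matching trailing axis (38 vs 38): broadcasting works
ok = np.equal(preds_on_test, y_test)
print(ok.shape)

# Mismatched trailing axis (112 vs 38): NumPy raises ValueError
try:
    np.equal(preds_on_train, y_test)
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("broadcast error:", err)
```

If this reading of the message is right, the left operand would have 112 columns only if test_probs were still computed from the training data, which is consistent with the suspicion that the replacement is not being applied at that point.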