Shape mismatch using theano shared variable

Hello,

This is the first time I’ve tried to use theano shared variables in a model. I’m getting the following error and I’m not sure what exactly to change. What’s the correct way to setup the shape argument using shared variables?

ValueError: Input dimension mis-match. (input[0].shape[1] = 161533, input[1].shape[1] = 8)

target = train['target']
x = train[colnames]

X_train, X_test, Y_train, Y_test = train_test_split(x, target, test_size=0.2)

model_input_train = tt.shared(np.array(X_train))
model_output_train = tt.shared(np.array(Y_train))

model_input_test = tt.shared(np.array(X_test.values))
model_output_test = tt.shared(np.array(Y_test.values))

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=100)
    beta = pm.Normal('beta', mu=0, sd=100, shape=(1, len(X_test.columns)))

    s = pm.HalfNormal('s', tau=1)

    mean = alpha + beta * model_input_train

    y = pm.Normal('y', mu=mean, sd=s, observed=model_output_train)

Try:

beta = pm.Normal('beta', mu=0, sd=100, shape=(len(X_test.columns), 1))
mean = alpha + pm.math.dot(model_input_train, beta)
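In case the shape logic is unclear, here is a minimal numpy sketch (toy sizes standing in for the real 40,384 × 8 data) of why the elementwise product and the matrix product behave differently:

```python
import numpy as np

n, k = 5, 8                       # toy sizes standing in for 40384 rows, 8 columns
X = np.random.randn(n, k)         # design matrix, shape (n, k)
beta_row = np.random.randn(1, k)  # original beta, shape (1, k)
beta_col = np.random.randn(k, 1)  # suggested beta, shape (k, 1)

# Elementwise multiply broadcasts to (n, k): one value per feature,
# not one prediction per row.
print((beta_row * X).shape)       # (5, 8)

# A matrix product collapses the feature axis, giving one prediction
# per row, which can line up with an (n, 1) observed variable.
print(np.dot(X, beta_col).shape)  # (5, 1)
```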

Thank you for the response @junpenglao.

I did as you suggested and now get a memory error, which doesn't make sense: I'm running 32 GB of RAM. Below is the traceback.

MemoryError Traceback (most recent call last)
<ipython-input> in <module>
      7 mean = alpha + pm.math.dot(model_input_train, beta)
      8
----> 9 y = pm.Normal('y', mu=mean, sd=s, observed=model_output_train)

~/anaconda3/lib/python3.7/site-packages/pymc3/distributions/distribution.py in __new__(cls, name, *args, **kwargs)
     40 total_size = kwargs.pop('total_size', None)
     41 dist = cls.dist(*args, **kwargs)
---> 42 return model.Var(name, dist, data, total_size)
     43 else:
     44 raise TypeError("Name needs to be a string but got: {}".format(name))

~/anaconda3/lib/python3.7/site-packages/pymc3/model.py in Var(self, name, dist, data, total_size)
    837 var = ObservedRV(name=name, data=data,
    838                  distribution=dist,
--> 839                  total_size=total_size, model=self)
    840 self.observed_RVs.append(var)
    841 if var.missing_values:

~/anaconda3/lib/python3.7/site-packages/pymc3/model.py in __init__(self, type, owner, index, name, data, distribution, total_size, model)
   1322
   1323 self.missing_values = data.missing_values
--> 1324 self.logp_elemwiset = distribution.logp(data)
   1325 # The logp might need scaling in minibatches.
   1326 # This is done in Factor.

~/anaconda3/lib/python3.7/site-packages/pymc3/distributions/continuous.py in logp(self, value)
    478 mu = self.mu
    479
--> 480 return bound((-tau * (value - mu)**2 + tt.log(tau / np.pi / 2.)) / 2.,
    481              sd > 0)
    482

~/anaconda3/lib/python3.7/site-packages/theano/tensor/var.py in __sub__(self, other)
    145 # and the return value in that case
    146 try:
--> 147     return theano.tensor.basic.sub(self, other)
    148 except (NotImplementedError, AsTensorError):
    149     return NotImplemented

~/anaconda3/lib/python3.7/site-packages/theano/gof/op.py in __call__(self, *inputs, **kwargs)
    672 thunk.outputs = [storage_map[v] for v in node.outputs]
    673
--> 674 required = thunk()
    675 assert not required  # We provided all inputs
    676

~/anaconda3/lib/python3.7/site-packages/theano/gof/op.py in rval()
    860
    861 def rval():
--> 862     thunk()
    863 for o in node.outputs:
    864     compute_map[o][0] = True

~/anaconda3/lib/python3.7/site-packages/theano/gof/cc.py in __call__(self)
   1737 print(self.error_storage, file=sys.stderr)
   1738 raise
--> 1739 reraise(exc_type, exc_value, exc_trace)
   1740
   1741

~/anaconda3/lib/python3.7/site-packages/six.py in reraise(tp, value, tb)
    691 if value.__traceback__ is not tb:
    692     raise value.with_traceback(tb)
--> 693 raise value
    694 finally:
    695 value = None

MemoryError: None

Just an update: this runs when the data is not a theano shared variable, so does this have something to do with theano? When I ran it in a Kaggle kernel I got the same memory error, so I don't think it's my computer's memory, but who knows?

Hmmm, is X_train a pandas DataFrame? Try X_train.values
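For anyone following along, a minimal sketch (with a toy DataFrame standing in for X_train) of what the .values suggestion does, and why np.array(X_train) produces the same thing:

```python
import numpy as np
import pandas as pd

# Toy stand-in for X_train: a DataFrame carries index/column metadata
# that theano.shared does not need.
X_train = pd.DataFrame({'a': [1.0, 2.0], 'b': [3.0, 4.0]})

arr = X_train.values              # plain numpy ndarray, shape (2, 2)
print(type(arr).__name__)         # ndarray

# np.array(X_train) yields the same array, so either form works here.
print(np.array_equal(arr, np.array(X_train)))  # True
```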

When setting up the theano variables, I cast X_train to an np.array and rename it model_input_train. See the code below.

target = train['target']
x = train[colnames]

X_train, X_test, Y_train, Y_test = train_test_split(x, target, test_size=0.2)

model_input_train = tt.shared(np.array(X_train))
model_output_train = tt.shared(np.array(Y_train))

When I tried model_input_train.values, it complains that 'values' is not an attribute of the theano variable.

What are the shapes of mean.tag.test_value and model_output_train.tag.test_value?

I apologize @junpenglao but I’m not following what mean.tag.test_value and model_output_train.tag.test_value actually are.

So mean and model_output_train are theano tensors in your model above. I just want to double-check their shapes (you need to use *.tag.test_value to check the shape; otherwise the shape itself is also a tensor).

Yes but when I do

model_input_train.tag.test_value

, I get

AttributeError: ‘scratchpad’ object has no attribute ‘test_value’

And when I do

mean.tag.test_value

I get,

NameError: name ‘mean’ is not defined

What is your current model code?

model_input_train = tt.shared(np.array(X_train))
model_output_train = tt.shared(np.array(Y_train))

with pm.Model() as model:
    alpha = pm.Normal('alpha', mu=0, sd=100)
    beta = pm.Normal('beta', mu=0, sd=100, shape=(len(X_test.columns), 1))

    s = pm.HalfCauchy('s', 5)

    mean = alpha + pm.math.dot(model_input_train, beta)

    y = pm.Normal('y', mu=mean, sd=s, observed=model_output_train)

    trace = pm.sample(draws=5000, init='advi', progressbar=True)
print(model.check_test_point())

Hmm, I really don't see any reason it would run out of memory… and len(X_test.columns) is?

the test set…

I mean the value (just want to get a sense of how large your data input is).

8 columns
40,384 observations

Could you try:

import theano
model_input_train = theano.shared(np.random.randn(40384, 8))
model_output_train = theano.shared(np.random.randn(40384, 1))

with pm.Model() as model:
  alpha = pm.Normal('alpha', mu = 0, sd = 100)
  beta = pm.Normal('beta', mu = 0, sd = 100, shape = (8, 1))
  
  s = pm.HalfCauchy('s', 5)
  mean = alpha + pm.math.dot(model_input_train, beta)
  
  y = pm.Normal('y', mu=mean , sd=s, observed=model_output_train)
  
  trace = pm.sample(draws=5000, init='advi', progressbar=True)

It’s running fine.

Make sure its shape is (n, 1) - can't think of anything else that could explain the error.

Ahhh, it's not (n, 1). sklearn made it (n,). Does that mean Y_train is actually an index? Do you know how to fix that?
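For the record: it's not an index. train_test_split returns a 1-D array of shape (n,) when the target is a pandas Series. A minimal numpy sketch (toy data standing in for Y_train) of turning it into the (n, 1) column vector the model expects:

```python
import numpy as np

# Toy stand-in for Y_train.values: shape (6,), i.e. 1-D, not (6, 1).
Y_train = np.arange(6.0)

# reshape(-1, 1) makes it a column vector; -1 infers the row count.
Y_col = Y_train.reshape(-1, 1)
print(Y_col.shape)                # (6, 1)

# Equivalent: add a trailing axis with None (alias for np.newaxis).
print(Y_train[:, None].shape)     # (6, 1)
```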