Error On Bayesian Neural Network

Hello. I’m getting an error trying to construct a Bayesian neural network. It’s a shape error, but I’m not sure where it is being thrown. The code is below:

ann_input = tt.shared(X_train)
ann_output = tt.shared(y_train)

n_hidden = 5

# Initialize random weights between each layer

init_1 = np.random.randn(X_train.shape[1], n_hidden)
init_2 = np.random.randn(n_hidden, n_hidden)
init_out = np.random.randn(n_hidden)

with pm.Model() as neual_network:
    # Weights from input to hidden layer
    weights_in_1 = pm.Normal('w_in_1', 0, sd=1,
                             shape=(X_train.shape[1], n_hidden),
                             testval=init_1)

    # Weights from 1st to 2nd layer
    weights_1_2 = pm.Normal('w_1_2', 0, sd=1,
                            shape=(n_hidden, n_hidden),
                            testval=init_2)

    # Weights from hidden layer to output
    weights_2_out = pm.Normal('w_2_out', 0, sd=1,
                              shape=(n_hidden,),
                              testval=init_out)

    # Build neural-network using tanh activation function
    act_1 = pm.math.tanh(pm.math.dot(ann_input, weights_in_1))
    act_2 = pm.math.tanh(pm.math.dot(act_1, weights_1_2))
    act_out = pm.math.dot(act_2, weights_2_out)

    out = pm.Normal('out', mu=act_out, observed=ann_output, shape=y_train.shape[0])

The shapes of X_train and y_train are:

(151437, 12)
(151437, 1)

The error call is:

ValueError Traceback (most recent call last)
34 # Binary classification -> Bernoulli likelihood
---> 35 out = pm.Normal('out', mu = act_out, observed=ann_output, shape = y_train.shape[0])

~/anaconda3/lib/python3.7/site-packages/pymc3/distributions/ in __new__(cls, name, *args, **kwargs)
40 total_size = kwargs.pop('total_size', None)
41 dist = cls.dist(*args, **kwargs)
---> 42 return model.Var(name, dist, data, total_size)
43 else:
44 raise TypeError("Name needs to be a string but got: {}".format(name))

~/anaconda3/lib/python3.7/site-packages/pymc3/ in Var(self, name, dist, data, total_size)
837 var = ObservedRV(name=name, data=data,
838 distribution=dist,
--> 839 total_size=total_size, model=self)
840 self.observed_RVs.append(var)
841 if var.missing_values:

~/anaconda3/lib/python3.7/site-packages/pymc3/ in __init__(self, type, owner, index, name, data, distribution, total_size, model)
1323 self.missing_values = data.missing_values
-> 1324 self.logp_elemwiset = distribution.logp(data)
1325 # The logp might need scaling in minibatches.
1326 # This is done in Factor.

~/anaconda3/lib/python3.7/site-packages/pymc3/distributions/ in logp(self, value)
478 mu =
--> 480 return bound((-tau * (value - mu)**2 + tt.log(tau / np.pi / 2.)) / 2.,
481 sd > 0)

~/anaconda3/lib/python3.7/site-packages/theano/tensor/ in __sub__(self, other)
145 # and the return value in that case
146 try:
--> 147 return theano.tensor.basic.sub(self, other)
148 except (NotImplementedError, AsTensorError):
149 return NotImplemented

~/anaconda3/lib/python3.7/site-packages/theano/gof/ in __call__(self, *inputs, **kwargs)
672 thunk.outputs = [storage_map[v] for v in node.outputs]
--> 674 required = thunk()
675 assert not required # We provided all inputs

~/anaconda3/lib/python3.7/site-packages/theano/gof/ in rval()
861 def rval():
--> 862 thunk()
863 for o in node.outputs:
864 compute_map[o][0] = True

~/anaconda3/lib/python3.7/site-packages/theano/gof/ in __call__(self)
1737 print(self.error_storage, file=sys.stderr)
1738 raise
-> 1739 reraise(exc_type, exc_value, exc_trace)

~/anaconda3/lib/python3.7/site-packages/ in reraise(tp, value, tb)
691 if value.__traceback__ is not tb:
692 raise value.with_traceback(tb)
--> 693 raise value
694 finally:
695 value = None

ValueError: Input dimension mis-match. (input[0].shape[1] = 1, input[1].shape[1] = 151437)

I don’t believe specifying the shape does anything here; it is set by the observed matrix.

So looking at the error, the second dimension of act_out has length n (151437 here) while the second dimension of y_train is 1. So ann_output is what you expected; the problem is with act_out. Try taking the transpose of it to see if it’s just on its side. If that doesn’t work, take the transpose of y_train before setting your observed variable, since I’ve had trouble transposing vectors in pymc3 (as I describe below).
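To make the error message concrete, here is a minimal NumPy sketch of the subtraction value - mu from the traceback. It assumes the observed value has shape (n, 1) like y_train and that mu (act_out) is 1-D with n entries, which is what the reported dimensions suggest; the small n and the variable names are illustrative only. NumPy silently broadcasts the pair to (n, n), whereas Theano, as far as I know, will not stretch a length-1 dimension of a shared variable unless it was declared broadcastable, hence the mismatch error.

```python
import numpy as np

n = 6  # small stand-in for the 151437 rows in the thread

value = np.zeros((n, 1))  # laid out like y_train: n rows, one column
mu = np.zeros(n)          # assumed layout of act_out: 1-D, n entries

# Shapes are aligned from the right, so (n,) is treated as (1, n).
# The second dimensions are then 1 (value) versus n (mu), as in the error text.
diff = value - mu
print(diff.shape)  # (6, 6): NumPy broadcasts instead of raising

# Option 1: flatten the observed data so both sides are 1-D.
flat_value = value.ravel()
print((flat_value - mu).shape)  # (6,)

# Option 2: turn mu into a column so both sides are (n, 1).
col_mu = mu[:, None]
print((value - col_mu).shape)  # (6, 1)
```

In the model from the question, option 1 would correspond to passing a flattened y_train as the observed data. Which fix was finally used is not stated in the thread.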

If neither transpose works, it may be a problem farther up the line. I have had trouble with the shapes of random variables in pymc3 in the past. Someone who is more knowledgeable about how these shapes are handled under the hood may be able to give you a more disciplined answer, but in the meantime hopefully this will set you in the right direction.

I’ve had trouble with not being able to change the shape of single-dimensional random variables in particular. So the first place I’d look is weights_2_out. Setting the dimensions in shape didn’t allow me to define it as a column vector instead of a row vector, and doing a transpose didn’t transform it either. That said, if I’m not mistaken about the shapes of the other matrices, then if weights_2_out were wrong, it would have failed during the dot.
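As a point of comparison, NumPy shows the same behavior for 1-D arrays: transposing an array with a single axis returns it unchanged, because there is no second axis to swap. The sketch below uses a hypothetical 5-element array standing in for weights_2_out; turning it into a column needs an explicit reshape or a new axis. Whether PyMC3 behaved this way for the same reason in the case described above is an assumption.

```python
import numpy as np

w = np.arange(5.0)           # 1-D stand-in for weights_2_out, shape (5,)

print(w.shape, w.T.shape)    # (5,) (5,): transpose is a no-op on 1-D arrays

col = w.reshape(-1, 1)       # explicit column vector, shape (5, 1)
row = w[None, :]             # explicit row vector, shape (1, 5)
print(col.shape, row.shape)  # (5, 1) (1, 5)

# Only the 2-D versions change shape under transpose.
print(col.T.shape)           # (1, 5)
```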

So if that transpose doesn’t work, I’d play with a toy model that you build up piece by piece to check the shape of each act matrix at every step.

Something like this might help:

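import numpy as np
import pymc3 as pm

bar = np.random.rand(30)  # stand-in for the observed data
with pm.Model() as test:
    m = pm.Normal('m', 0, sd=10, shape=20)  # 20 entries, bar has 30
    foo = pm.Normal('out', mu=m, observed=bar)  # fails: shapes differ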

Note that this model fails because m is not the same shape as bar. So just make bar the shape you expect the output of each act step to be, and see if the model compiles. If it doesn’t, transpose bar to see if it’s that shape instead, and use that to narrow down where the shape problem arises that leaves act_out with the wrong shape.
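The same piece-by-piece check can be done outside PyMC3 first. This is a rough NumPy sketch of the forward pass, using the weight shapes of init_1, init_2 and init_out from the question and a small made-up row count in place of 151437. It only tracks shapes and is not the PyMC3 model itself.

```python
import numpy as np

rng = np.random.default_rng(0)
n_rows, n_features, n_hidden = 8, 12, 5  # 8 rows stand in for 151437

X = rng.normal(size=(n_rows, n_features))
y = rng.normal(size=(n_rows, 1))                   # same layout as y_train

w_in_1 = rng.normal(size=(n_features, n_hidden))   # like init_1
w_1_2 = rng.normal(size=(n_hidden, n_hidden))      # like init_2
w_2_out = rng.normal(size=n_hidden)                # like init_out (1-D)

act_1 = np.tanh(X @ w_in_1)
act_2 = np.tanh(act_1 @ w_1_2)
act_out = act_2 @ w_2_out

for name, arr in [("act_1", act_1), ("act_2", act_2),
                  ("act_out", act_out), ("y", y)]:
    print(name, arr.shape)
# act_1 (8, 5)
# act_2 (8, 5)
# act_out (8,)
# y (8, 1)
```

With these shapes the last step is where the output drops to one dimension, so act_out (8,) and y (8, 1) no longer line up elementwise.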


Thank you so much @lwahedi. I’m not sure why this works and, to be honest, I’m still trying to figure out the structure of the neural net (compared to a Keras-built CNN). It’s running great now. Thanks again.