Still having trouble with neural net shapes

Hello,

I’ve been working with a Bayesian neural net, and while I thought I had it working, it seems that’s not the case.

This is the error I’m getting:

ValueError: shapes (161533,53) and (128,53) not aligned: 53 (dim 1) != 128 (dim 0)

The shape of my input is (161533, 53).
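In plain NumPy terms, this is the product that fails (shapes copied straight from the error, nothing else assumed):

import numpy as np

X = np.random.randn(161533, 53)   # my input
W = np.random.randn(128, 53)      # my first weight matrix
X.dot(W)                          # raises the ValueError above: 53 (dim 1) != 128 (dim 0)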

My model is as follows:

import numpy as np
import pymc3 as pm
import theano as tt
import theano.tensor as T

ann_input = tt.shared(np.asarray(X_train))
y_train_t = Y_train.transpose()
ann_output = tt.shared(np.asarray(y_train_t))

#n_hidden = 10

# Initialize random weights between each layer

init_1 = np.random.randn(128, X_train.shape[1])
init_2 = np.random.randn(128, 128)
init_3 = np.random.randn(64, 128)
init_4 = np.random.randn(64, 64)
init_5 = np.random.randn(32, 64)
init_6 = np.random.randn(32, 32)
init_7 = np.random.randn(16, 32)
init_8 = np.random.randn(16, 16)
init_9 = np.random.randn(8, 16)
#init_10 = np.random.randn(n_hidden, n_hidden)
init_out = np.random.randn(1, 8)

with pm.Model() as neural_network:
    # Weights from input to 1st hidden layer
    weights_in_1 = pm.Normal('w_in_1', 0, sd=1,
                             shape=(128, X_train.shape[1]),
                             testval=init_1)

    # Weights from 1st to 2nd layer
    weights_1_2 = pm.Normal('w_1_2', 0, sd=1,
                            shape=(128, 128),
                            testval=init_2)

    # Weights from 2nd to 3rd layer
    weights_2_3 = pm.Normal('w_2_3', 0, sd=1,
                            shape=(64, 128),
                            testval=init_3)

    # Weights from 3rd to 4th layer
    weights_3_4 = pm.Normal('w_3_4', 0, sd=1,
                            shape=(64, 64),
                            testval=init_4)

    # Weights from 4th to 5th layer
    weights_4_5 = pm.Normal('w_4_5', 0, sd=1,
                            shape=(32, 64),
                            testval=init_5)

    # Weights from 5th to 6th layer
    weights_5_6 = pm.Normal('w_5_6', 0, sd=1,
                            shape=(32, 32),
                            testval=init_6)

    # Weights from 6th to 7th layer
    weights_6_7 = pm.Normal('w_6_7', 0, sd=1,
                            shape=(16, 32),
                            testval=init_7)

    # Weights from 7th to 8th layer
    weights_7_8 = pm.Normal('w_7_8', 0, sd=1,
                            shape=(16, 16),
                            testval=init_8)

    # Weights from 8th to 9th layer
    weights_8_9 = pm.Normal('w_8_9', 0, sd=1,
                            shape=(8, 16),
                            testval=init_9)

    # Weights from last hidden layer to output
    weights_10_out = pm.Normal('w_10_out', 0, sd=1,
                               shape=(1, 8),
                               testval=init_out)

    # Build the network using the ReLU activation function
    B2 = pm.Normal('bias2', 0., 1.)

    act_1 = T.nnet.relu(T.dot(ann_input, weights_in_1))
    act_2 = T.nnet.relu(T.dot(act_1, weights_1_2))
    act_3 = T.nnet.relu(T.dot(act_2, weights_2_3) + B2)
    act_4 = T.nnet.relu(T.dot(act_3, weights_3_4))
    act_5 = T.nnet.relu(T.dot(act_4, weights_4_5) + B2)
    act_6 = T.nnet.relu(T.dot(act_5, weights_5_6))
    act_7 = T.nnet.relu(T.dot(act_6, weights_6_7) + B2)
    act_8 = T.nnet.relu(T.dot(act_7, weights_7_8))
    act_9 = T.nnet.relu(T.dot(act_8, weights_8_9) + B2)

    act_out = T.dot(act_8, weights_10_out)

    out = pm.Normal('out', mu=act_out, observed=ann_output, shape=y_train_t.shape)

I will say this worked when using the neural net from Thomas Wiecki’s blog. Now I’m trying to directly compare results against a Keras model of the same size and shape. Any idea what I’m doing wrong?

For more information, the Keras model below is what I’m trying to copy.

Layer (type)          Output Shape    Param #
=============================================
dense (Dense)         (None, 128)     6912
dropout (Dropout)     (None, 128)     0
dense_1 (Dense)       (None, 64)      8256
dropout_1 (Dropout)   (None, 64)      0
dense_2 (Dense)       (None, 32)      2080
dropout_2 (Dropout)   (None, 32)      0
dense_3 (Dense)       (None, 16)      528
dropout_3 (Dropout)   (None, 16)      0
dense_4 (Dense)       (None, 8)       136
dense_5 (Dense)       (None, 1)       9
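In code, it’s roughly the following sketch (the layer widths and the input size of 53 are read off the summary above; the dropout rates and activations don’t appear in a summary, so those are placeholders):

from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout

model = Sequential([
    Dense(128, activation='relu', input_shape=(53,)),  # (53 + 1) * 128 = 6912 params, matching the summary
    Dropout(0.2),                                      # rate not shown in the summary; placeholder
    Dense(64, activation='relu'),
    Dropout(0.2),
    Dense(32, activation='relu'),
    Dropout(0.2),
    Dense(16, activation='relu'),
    Dropout(0.2),
    Dense(8, activation='relu'),
    Dense(1),
])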

Hi Jordan-

You just need to decide which dimension holds the samples (in this particular case, it’s dimension 0, the rows). All the multiplications then happen along the other dimension. When observations are along the rows, the weights are right-multiplied:

(i) Y = f3(f2(f1(X W1 + b1) W2 + b2) W3 + b3)   # only the column dimension can change

When observations are along the columns, the weights are left-multiplied:

(ii) Y = f3(W3 f2(W2 f1(W1 X + b1) + b2) + b3)  # only the row dimension can change

Right now your network is set up for convention (ii), but your data shape is for convention (i). So you can either change the network to convention (i), or change the first layer to the equivalent of dot(W1, transpose(input)).
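A quick NumPy sketch of the two conventions, with toy sizes just to show the shapes:

import numpy as np

X = np.random.randn(100, 53)   # 100 samples along the rows (dim 0)

# Convention (i): right-multiply, each W is (n_in, n_out)
W1 = np.random.randn(53, 128)
print((X @ W1).shape)          # (100, 128) -- the sample (row) dimension never changes

# Convention (ii): left-multiply, each W is (n_out, n_in), samples along the columns
W1t = np.random.randn(128, 53)
print((W1t @ X.T).shape)       # (128, 100) -- the sample (column) dimension never changes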

Are you saying it’s as simple as swapping the dimensions in the shapes? So for weights_2_3, instead of the shape being (64, 128), it should be (128, 64)?

Not quite; you may also have to change the order of the multiplication in your network. Right now it actually looks like

A1 = f(XW0)
A2 = f(A1W1)
A3 = f(A2W2)

which is all good, and is convention (i). So it just looks like the weight sizes are a little off. They should be

w_1_2 ~ (53, 128) so that (161533, 53) x (53, 128) -> (161533, 128)
w_2_3 ~ (128, 64) so that (161533, 128) x (128, 64) -> (161533, 64)

and so on.
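You can also sanity-check the whole chain of shapes with plain NumPy before touching the model (layer widths taken from your network):

import numpy as np

widths = [53, 128, 128, 64, 64, 32, 32, 16, 16, 8]
act = np.random.randn(161533, 53)
for n_in, n_out in zip(widths[:-1], widths[1:]):
    act = act @ np.random.randn(n_in, n_out)   # each weight matrix is (n_in, n_out)
print(act.shape)                               # (161533, 8), ready for the (8,) output weights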

Thank you. I’ll work on this. Appreciate the help.

Got it!

ann_input = tt.shared(np.asarray(X_train))
y_train_t = Y_train.transpose()
ann_output = tt.shared(np.asarray(y_train_t))

# Initialize random weights between each layer
init_1 = np.random.randn(X_train.shape[1], 128)
init_2 = np.random.randn(128, 128)
init_3 = np.random.randn(128, 64)
init_4 = np.random.randn(64, 64)
init_5 = np.random.randn(64, 32)
init_6 = np.random.randn(32, 32)
init_7 = np.random.randn(32, 16)
init_8 = np.random.randn(16, 16)
init_9 = np.random.randn(16, 8)
#init_10 = np.random.randn(n_hidden, n_hidden)
init_out = np.random.randn(8)

with pm.Model() as neural_network:
    # Weights from input to 1st hidden layer
    weights_in_1 = pm.Normal('w_in_1', 0, sd=1,
                             shape=(X_train.shape[1], 128),
                             testval=init_1)

    # Weights from 1st to 2nd layer
    weights_1_2 = pm.Normal('w_1_2', 0, sd=1,
                            shape=(128, 128),
                            testval=init_2)

    # Weights from 2nd to 3rd layer
    weights_2_3 = pm.Normal('w_2_3', 0, sd=1,
                            shape=(128, 64),
                            testval=init_3)

    # Weights from 3rd to 4th layer
    weights_3_4 = pm.Normal('w_3_4', 0, sd=1,
                            shape=(64, 64),
                            testval=init_4)

    # Weights from 4th to 5th layer
    weights_4_5 = pm.Normal('w_4_5', 0, sd=1,
                            shape=(64, 32),
                            testval=init_5)

    # Weights from 5th to 6th layer
    weights_5_6 = pm.Normal('w_5_6', 0, sd=1,
                            shape=(32, 32),
                            testval=init_6)

    # Weights from 6th to 7th layer
    weights_6_7 = pm.Normal('w_6_7', 0, sd=1,
                            shape=(32, 16),
                            testval=init_7)

    # Weights from 7th to 8th layer
    weights_7_8 = pm.Normal('w_7_8', 0, sd=1,
                            shape=(16, 16),
                            testval=init_8)

    # Weights from 8th to 9th layer
    weights_8_9 = pm.Normal('w_8_9', 0, sd=1,
                            shape=(16, 8),
                            testval=init_9)

    # Weights from last hidden layer to output
    weights_10_out = pm.Normal('w_10_out', 0, sd=1,
                               shape=(8,),
                               testval=init_out)

    # Build the network using the ReLU activation function
    B2 = pm.Normal('bias2', 0., 1.)

    act_1 = T.nnet.relu(T.dot(ann_input, weights_in_1))
    act_2 = T.nnet.relu(T.dot(act_1, weights_1_2))
    act_3 = T.nnet.relu(T.dot(act_2, weights_2_3) + B2)
    act_4 = T.nnet.relu(T.dot(act_3, weights_3_4))
    act_5 = T.nnet.relu(T.dot(act_4, weights_4_5) + B2)
    act_6 = T.nnet.relu(T.dot(act_5, weights_5_6))
    act_7 = T.nnet.relu(T.dot(act_6, weights_6_7) + B2)
    act_8 = T.nnet.relu(T.dot(act_7, weights_7_8))
    act_9 = T.nnet.relu(T.dot(act_8, weights_8_9) + B2)

    act_out = T.dot(act_9, weights_10_out)

    out = pm.Normal('out', mu=act_out, observed=ann_output, shape=y_train_t.shape)