Hello,
I've been working with a Bayesian neural net, and while I thought I had it working, it seems that's not the case.
This is the error I'm getting:
ValueError: shapes (161533,53) and (128,53) not aligned: 53 (dim 1) != 128 (dim 0)
The shape of my input is (161533, 53), which matches the first operand in the error.
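If I read the traceback right, it's a plain matrix-shape mismatch in the first dot product. Here is a minimal NumPy sketch of what I think is going on (the array names are just illustrative; the shapes come straight from the error message):

```python
import numpy as np

X = np.zeros((161533, 53))      # my input: (n_samples, n_features)
W = np.random.randn(128, 53)    # first-layer weights as I define them below

# np.dot(X, W) raises the same "not aligned" error:
# (161533, 53) x (128, 53) -> inner dimensions 53 and 128 don't match.
# Transposing the weight matrix makes the shapes line up:
hidden = np.dot(X, W.T)         # (161533, 53) x (53, 128) -> (161533, 128)
print(hidden.shape)
```

So I suspect the orientation of my weight matrices is the issue, but I'm not sure of the right way to fix it without breaking the comparison to the Keras model.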
My model is as follows:
ann_input = tt.shared(np.asarray(X_train))
y_train_t = Y_train.transpose()
ann_output = tt.shared(np.asarray(y_train_t))
# n_hidden = 10

# Initialize random weights between each layer
init_1 = np.random.randn(128, X_train.shape[1])
init_2 = np.random.randn(128, 128)
init_3 = np.random.randn(64, 128)
init_4 = np.random.randn(64, 64)
init_5 = np.random.randn(32, 64)
init_6 = np.random.randn(32, 32)
init_7 = np.random.randn(16, 32)
init_8 = np.random.randn(16, 16)
init_9 = np.random.randn(8, 16)
# init_10 = np.random.randn(n_hidden, n_hidden)
init_out = np.random.randn(1, 8)

with pm.Model() as neural_network:
    # Weights from input to 1st hidden layer
    weights_in_1 = pm.Normal('w_in_1', 0, sd=1,
                             shape=(128, X_train.shape[1]),
                             testval=init_1)

    # Weights from 1st to 2nd hidden layer
    weights_1_2 = pm.Normal('w_1_2', 0, sd=1, shape=(128, 128), testval=init_2)

    # Weights from 2nd to 3rd hidden layer
    weights_2_3 = pm.Normal('w_2_3', 0, sd=1, shape=(64, 128), testval=init_3)

    # Weights from 3rd to 4th hidden layer
    weights_3_4 = pm.Normal('w_3_4', 0, sd=1, shape=(64, 64), testval=init_4)

    # Weights from 4th to 5th hidden layer
    weights_4_5 = pm.Normal('w_4_5', 0, sd=1, shape=(32, 64), testval=init_5)

    # Weights from 5th to 6th hidden layer
    weights_5_6 = pm.Normal('w_5_6', 0, sd=1, shape=(32, 32), testval=init_6)

    # Weights from 6th to 7th hidden layer
    weights_6_7 = pm.Normal('w_6_7', 0, sd=1, shape=(16, 32), testval=init_7)

    # Weights from 7th to 8th hidden layer
    weights_7_8 = pm.Normal('w_7_8', 0, sd=1, shape=(16, 16), testval=init_8)

    # Weights from 8th to 9th hidden layer
    weights_8_9 = pm.Normal('w_8_9', 0, sd=1, shape=(8, 16), testval=init_9)

    # Weights from last hidden layer to output
    weights_10_out = pm.Normal('w_10_out', 0, sd=1, shape=(1, 8), testval=init_out)

    # Build the neural network using the ReLU activation function
    B2 = pm.Normal('bias2', 0., 1.)
    act_1 = T.nnet.relu(T.dot(ann_input, weights_in_1))
    act_2 = T.nnet.relu(T.dot(act_1, weights_1_2))
    act_3 = T.nnet.relu(T.dot(act_2, weights_2_3) + B2)
    act_4 = T.nnet.relu(T.dot(act_3, weights_3_4))
    act_5 = T.nnet.relu(T.dot(act_4, weights_4_5) + B2)
    act_6 = T.nnet.relu(T.dot(act_5, weights_5_6))
    act_7 = T.nnet.relu(T.dot(act_6, weights_6_7) + B2)
    act_8 = T.nnet.relu(T.dot(act_7, weights_7_8))
    act_9 = T.nnet.relu(T.dot(act_8, weights_8_9) + B2)
    act_out = T.dot(act_8, weights_10_out)

    out = pm.Normal('out', mu=act_out, observed=ann_output, shape=y_train_t.shape)
I will say this worked when I was using the neural net from Thomas Wiecki's blog; now I'm trying to directly compare results against a Keras model of the same size and shape. Any idea what I'm doing wrong?
For reference, the Keras model below is what I'm trying to replicate:
Layer (type) Output Shape Param #
dense (Dense) (None, 128) 6912
dropout (Dropout) (None, 128) 0
dense_1 (Dense) (None, 64) 8256
dropout_1 (Dropout) (None, 64) 0
dense_2 (Dense) (None, 32) 2080
dropout_2 (Dropout) (None, 32) 0
dense_3 (Dense) (None, 16) 528
dropout_3 (Dropout) (None, 16) 0
dense_4 (Dense) (None, 8) 136
dense_5 (Dense) (None, 1) 9
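In case it's useful, this is roughly how that Keras model is put together (a sketch reconstructed from the summary above; the layer sizes match the param counts, but the dropout rate and activations are assumptions on my part):

```python
from tensorflow import keras
from tensorflow.keras import layers

# Reconstructed from the summary; dropout rate (0.2) and ReLU activations are assumptions.
keras_model = keras.Sequential([
    layers.Dense(128, activation='relu', input_shape=(53,)),  # 53*128 + 128 = 6912 params
    layers.Dropout(0.2),
    layers.Dense(64, activation='relu'),                      # 128*64 + 64 = 8256 params
    layers.Dropout(0.2),
    layers.Dense(32, activation='relu'),                      # 64*32 + 32 = 2080 params
    layers.Dropout(0.2),
    layers.Dense(16, activation='relu'),                      # 32*16 + 16 = 528 params
    layers.Dropout(0.2),
    layers.Dense(8, activation='relu'),                       # 16*8 + 8 = 136 params
    layers.Dense(1),                                          # 8*1 + 1 = 9 params
])
keras_model.summary()
```

(Keras stores each Dense kernel as (input_dim, units), i.e. the transpose of the shapes I'm using in the PyMC3 model above, which may be related to my problem.)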