Hi Jordan-
You just need to decide which dimension holds the samples (in this particular case, it's dimension 0, the rows). All the multiplications then happen along the other dimension. When observations are along the rows, the weights are right-multiplied:
(i) Y = f3(f2(f1(XW1 + b1)W2 + b2)W3 + b3) # only the column dimension can change
When observations are along the columns, the weights are left-multiplied:
(ii) Y = f3(W3f2(W2f1(W1X + b1)+b2)+b3) # only the row dimension can change
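To make the two conventions concrete, here is a minimal NumPy sketch. The layer sizes, the random weights, and the use of tanh for f1, f2 and f3 are all made up for illustration, not taken from your network. It builds the same three-layer network both ways and checks that the outputs are transposes of each other. Note that the biases change shape too: row vectors in (i), column vectors in (ii).

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, n_in, n_h1, n_h2, n_out = 5, 4, 3, 6, 2  # illustrative sizes
f = np.tanh  # stand-in for f1, f2, f3 (assumption)

X = rng.normal(size=(n_samples, n_in))  # samples along rows
W1, b1 = rng.normal(size=(n_in, n_h1)), rng.normal(size=(1, n_h1))
W2, b2 = rng.normal(size=(n_h1, n_h2)), rng.normal(size=(1, n_h2))
W3, b3 = rng.normal(size=(n_h2, n_out)), rng.normal(size=(1, n_out))

def forward_rows(X):
    # Convention (i): weights right-multiplied, biases are row vectors.
    return f(f(f(X @ W1 + b1) @ W2 + b2) @ W3 + b3)

def forward_cols(Xt):
    # Convention (ii): weights left-multiplied, biases are column vectors.
    # Transposing every weight and bias gives the same network.
    return f(W3.T @ f(W2.T @ f(W1.T @ Xt + b1.T) + b2.T) + b3.T)

Y_rows = forward_rows(X)      # shape (n_samples, n_out)
Y_cols = forward_cols(X.T)    # shape (n_out, n_samples)
same = np.allclose(Y_rows, Y_cols.T)
```

In (i) the sample dimension (rows) is carried through every layer unchanged; in (ii) it is the column dimension that stays fixed.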
Right now your network is set up for convention (ii) but your data shape is for convention (i). So you can either change the network to convention (i) or change the first layer to the equivalent of dot(W1, transpose(input)).
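As a rough sketch of the second option, assuming NumPy and made-up shapes (the names `inputs`, `W1` and `b1` are placeholders, and tanh stands in for whatever f1 is in your code): a first layer written for convention (ii) fails on row-shaped data until the input is transposed.

```python
import numpy as np

rng = np.random.default_rng(1)
inputs = rng.normal(size=(8, 4))  # 8 samples along rows, 4 features
W1 = rng.normal(size=(3, 4))      # convention (ii): (n_hidden, n_features)
b1 = rng.normal(size=(3, 1))      # column vector, broadcast over samples

# np.dot(W1, inputs) would fail here: (3, 4) times (8, 4) does not line up.
# Transposing the input puts the samples along the columns instead.
h1 = np.tanh(np.dot(W1, inputs.T) + b1)  # shape (3, 8)
```

Everything downstream then stays in convention (ii), so the final output has one column per sample.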