Hi there and thanks for all your work,
We are trying to build a Bayesian Neural Network with multivariate Gaussians as the distributions over our parameters. We are following the structure of the code presented in the tutorial with univariate Normals, which assumes a factorized posterior.
We are running into problems with incompatible dimensions. Here is our code:
```python
def construct_nn(ann_input, ann_output):
    n_hidden = 5  # number of neurons in each hidden layer

    mu_in_1 = np.zeros((X_train.shape[1] * n_hidden), dtype=floatX)
    cov_in_1 = np.eye(X_train.shape[1] * n_hidden, dtype=floatX)
    mu_1_2 = np.zeros((n_hidden * n_hidden), dtype=floatX)
    cov_1_2 = np.eye(n_hidden * n_hidden, dtype=floatX)
    mu_2_out = np.zeros(n_hidden, dtype=floatX)
    cov_2_out = np.eye(n_hidden, dtype=floatX)

    coords = {
        "hidden_layer_1": np.arange(n_hidden),
        "hidden_layer_2": np.arange(n_hidden),
        "train_cols": np.arange(X_train.shape[1]),
        # "obs_id": np.arange(X_train.shape[0]),
    }
    with pm.Model(coords=coords) as neural_network:
        ann_input = pm.Data("ann_input", X_train, mutable=True, dims=("obs_id", "train_cols"))
        ann_output = pm.Data("ann_output", Y_train, mutable=True, dims="obs_id")

        # Weights from input to first hidden layer: multivariate normal prior
        weights_in_1 = pm.MvNormal("w_in_1", mu=mu_in_1, cov=cov_in_1, dims=("train_cols", "hidden_layer_1"))
        # Weights from first to second hidden layer: multivariate normal prior
        weights_1_2 = pm.MvNormal("w_1_2", mu=mu_1_2, cov=cov_1_2, shape=(n_hidden, n_hidden))
        # Weights from second hidden layer to output: multivariate normal prior
        weights_2_out = pm.MvNormal("w_2_out", mu=mu_2_out, cov=cov_2_out, dims="hidden_layer_2")

        # Build the neural network using tanh activation functions
        act_1 = pm.math.tanh(pm.math.dot(ann_input, weights_in_1))
        act_2 = pm.math.tanh(pm.math.dot(act_1, weights_1_2))
        act_out = pm.math.sigmoid(pm.math.dot(act_2, weights_2_out))

        # Binary classification -> Bernoulli likelihood, used when calling pm.sample() or pm.fit()
        out = pm.Bernoulli(
            "out",
            act_out,  # probability of success for each trial (i.e., of outputting a 1)
            observed=ann_output,  # the observed data
            total_size=Y_train.shape[0],  # IMPORTANT for minibatches
            dims="obs_id",
        )
    return neural_network


neural_network = construct_nn(X_train, Y_train)
```
The neural network gets defined, but when we train it with the following code, we get this error:
```python
%%time
with neural_network:
    approx = pm.fit(n=30_000)  # the variational approximation to the posterior distribution
```

```
ValueError: Incompatible Elemwise input shapes [(None, 5), (None, 25)]
```
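For context on where the shapes come from, here is a minimal sketch in plain NumPy (no PyMC) of what we are expecting dimension-wise: a draw from a multivariate normal with a mean of length `D * n_hidden` is a flat vector, which we assumed would line up with a `(D, n_hidden)` weight matrix; the names and sizes below are only illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
D, H, N = 3, 5, 10  # input features, hidden units, observations

# A multivariate normal over all D*H input-to-hidden weights yields a flat vector...
w_flat = rng.multivariate_normal(np.zeros(D * H), np.eye(D * H))
assert w_flat.shape == (D * H,)

# ...which only multiplies an (N, D) input after an explicit reshape to (D, H).
w_in_1 = w_flat.reshape(D, H)
X = rng.normal(size=(N, D))
act_1 = np.tanh(X @ w_in_1)
assert act_1.shape == (N, H)
```

Is this kind of explicit reshape what we are missing when we pass `dims`/`shape` to `pm.MvNormal`, or is there a supported way to get a matrix-valued weight with a joint Gaussian prior?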