Input dimension mismatch error when modelling biases in the HBNN tutorial

Hi Everyone,

First of all, I am very happy to be part of the community here. A quick bio: I am a 25-year-old PhD student in Astrophysics at UCL in the United Kingdom. I love all things data, and I am really looking forward to learning as much as possible about Bayesian stats and probabilistic programming with and from you. I think the PyMC project is simply brilliant, and I'd have been lost without it in my research.

Speaking of research, I recently tried to adapt Thomas Wiecki's tutorial on Hierarchical Bayesian Neural Networks (HBNN) to a regression problem, and as part of that I also wanted to model the biases, not just the weights. My code looks something like this:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

Xs = theano.shared(Xs_train)
# Xs shape: (2, 1167, 11) -> 2 groups, 1167 data points, 11 features. Ys: 1 target only
Ys = theano.shared(Ys_train)

n_hidden = 5
n_grps = Xs.shape[0].eval()   # number of groups (2)
n_data = Xs.shape[2].eval()   # number of input features (11)
ntargets = 1

# Initialize random weights (floatX_pymc3 casts to theano.config.floatX)
init_1 = floatX_pymc3(np.random.randn(n_data, n_hidden))
init_2 = floatX_pymc3(np.random.randn(n_hidden, n_hidden))  # unused in this one-hidden-layer model
init_out = floatX_pymc3(np.random.randn(n_hidden))

# Initialize random biases between each layer
bias_init_1 = np.random.randn(n_hidden)
bias_out = np.random.randn(ntargets)

with pm.Model() as neural_network:

    # Group mean distribution for input-to-hidden weights
    weights_in_1_grp = pm.Normal('w_in_1_grp', 0, sd=1,
                                 shape=(n_data, n_hidden),
                                 testval=init_1)
    # Group standard deviation
    weights_in_1_grp_sd = pm.HalfNormal('w_in_1_grp_sd', sd=1.)

    # Group mean and standard deviation for the hidden-layer biases
    biases_in_1_grp = pm.Normal('b_in_1_grp', 0, sd=1, shape=(n_hidden,), testval=bias_init_1)
    biases_in_1_grp_sd = pm.HalfNormal('b_in_1_sd', sd=1.)

    # Group mean distribution from hidden layer to output
    weights_2_out_grp = pm.Normal('w_2_out_grp', 0, sd=1,
                                  shape=(n_hidden,),
                                  testval=init_out)
    weights_2_out_grp_sd = pm.HalfNormal('w_2_out_grp_sd', sd=1.)

    biases_2_out_grp = pm.Normal('b_2_out_grp', 0, sd=1, shape=(ntargets,), testval=bias_out)
    biases_2_out_grp_sd = pm.HalfNormal('b_2_out_grp_sd', sd=1.)

    # Separate weights for each group: just add a leading group dimension
    weights_in_1_raw = pm.Normal('w_in_1',
                                 shape=(n_grps, n_data, n_hidden))
    # Non-centered parameterisation: raw ~ N(0, 1), scaled and shifted by the group parameters
    weights_in_1 = weights_in_1_raw * weights_in_1_grp_sd + weights_in_1_grp

    biases_in_1_raw = pm.Normal('b_in_1', shape=(n_grps, n_hidden))
    biases_in_1 = biases_in_1_raw * biases_in_1_grp_sd + biases_in_1_grp

    weights_2_out_raw = pm.Normal('w_2_out',
                                  shape=(n_grps, n_hidden))
    weights_2_out = weights_2_out_raw * weights_2_out_grp_sd + weights_2_out_grp

    biases_2_out_raw = pm.Normal('b_2_out', shape=(n_grps, ntargets))
    biases_2_out = biases_2_out_raw * biases_2_out_grp_sd + biases_2_out_grp

    act_1 = pm.math.tanh(tt.batched_dot(Xs, weights_in_1) + biases_in_1)

    act_out = tt.batched_dot(act_1, weights_2_out) + biases_2_out
    pred = pm.Deterministic('pred', act_out)

    out = pm.Normal('out', pred, observed=Ys)

with neural_network:
    # fit the model
    trace_hier = pm.sample(draws=100, chains=1, progressbar=True)
    # sample from the posterior predictive
    ppc_train = pm.sample_ppc(trace_hier, samples=100, progressbar=True)

However, when I run the code, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-1398ce5046f3> in <module>()
     65 
     66 
---> 67     act_1 = pm.math.tanh(tt.batched_dot(Xs_true, weights_in_1)+biases_in_1)
     68 
     69     act_out = tt.batched_dot(act_1, weights_2_out)
....
ValueError: Input dimension mis-match. (input[0].shape[1] = 1167, input[1].shape[1] = 2)

It may be a silly beginner's mistake somewhere in the shape I gave the biases, but I thought it was pretty straightforward: add a leading dimension for the groups, and make the HBNN non-centered in the biases too.
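In case it helps with the diagnosis, this is how I can inspect the shapes going into act_1, using the Theano test values (a quick sketch, run after building the model above; the commented shapes are what I expect from the data description):

print(Xs.get_value().shape)               # (2, 1167, 11): groups, data, features
print(weights_in_1.tag.test_value.shape)  # (2, 11, 5)
print(biases_in_1.tag.test_value.shape)   # (2, 5)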

Let me know if you see any obvious mistake! Thanks so much, and please forgive me if the formatting of this post is bad; I'll improve that too (one neural net at a time :p).

Cheers,
Jo


What is the operation you are expecting in

tt.batched_dot(Xs, weights_in_1) + biases_in_1

?

Specifically, looking at the shape:

Xs_train.shape, weights_in_1.tag.test_value.shape, biases_in_1.tag.test_value.shape
# ==> ((2, 1167, 11), (2, 11, 5), (2, 5))

are you expecting the output to have a shape of (2, 5) or (2, 1167, 5)?
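To see why those particular shapes clash, you can replay the same arithmetic in plain NumPy (random stand-in arrays; here np.einsum('gnf,gfh->gnh', ...) plays the role of tt.batched_dot):

import numpy as np

X = np.random.randn(2, 1167, 11)       # groups, data, features
W = np.random.randn(2, 11, 5)          # groups, features, hidden
b = np.random.randn(2, 5)              # groups, hidden

act = np.einsum('gnf,gfh->gnh', X, W)  # batched dot -> shape (2, 1167, 5)
act + b                                # ValueError: b right-aligns to (1, 2, 5), and 1167 != 2

which reproduces the 1167-vs-2 mismatch from your traceback.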

Hi Junpeng,

I think I want the output to have a shape of (2, 1167, 5). Will have to think about this a bit more…

Cheers,
Jo

That means you need to make sure biases_in_1 ends up with a shape that broadcasts against the (2, 1167, 5) activations; try:

biases_in_1_raw = pm.Normal('b_in_1', shape=(n_grps, 1, n_hidden))
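With that extra length-1 axis the bias broadcasts over the data dimension; a quick NumPy sanity check of the same shapes (stand-in arrays again):

import numpy as np

act = np.random.randn(2, 1167, 5)  # stand-in for tt.batched_dot(Xs, weights_in_1)
b = np.random.randn(2, 1, 5)       # (n_grps, 1, n_hidden): note the explicit length-1 axis
print((act + b).shape)             # (2, 1167, 5): the bias broadcasts over the 1167 data points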

Hi Junpeng,

It works! This is brilliant - thank you so much! Phew, gotta be careful about these things now!

Jo
