Input dimension mismatch error when modelling biases in the HBNN tutorial

Hi Everyone,

First of all, I am very happy to be part of the community here. A quick bio: I am a 25-year-old PhD student in Astrophysics at UCL in the United Kingdom. I love all things data, and I am really looking forward to learning as much as possible about Bayesian stats and probabilistic programming with and from you. I think the PyMC project is simply brilliant, and I'd have been lost without it in my research.

Speaking of research, I recently tried to adapt Thomas Wiecki's tutorial on Hierarchical Bayesian Neural Networks (HBNN) to a regression problem, and as part of that I also wanted to model the biases, not just the weights. My code looks something like this:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt

Xs = theano.shared(Xs_train)
# Xs shape: (2, 1167, 11) -> 2 groups, 1167 data points, 11 features. Ys: 1 target only
Ys = theano.shared(Ys_train)

n_hidden = 5
n_grps = Xs.shape[0].eval()   # number of groups (2)
n_data = Xs.shape[2].eval()   # number of input features (11)
ntargets = 1

# Initialize random weights (floatX_pymc3 casts to theano.config.floatX)
init_1 = floatX_pymc3(np.random.randn(n_data, n_hidden))
init_2 = floatX_pymc3(np.random.randn(n_hidden, n_hidden))  # unused in this one-hidden-layer model
init_out = floatX_pymc3(np.random.randn(n_hidden))

# Initialize random biases between each layer
bias_init_1 = np.random.randn(n_hidden)
bias_out = np.random.randn(ntargets)

with pm.Model() as neural_network:

    # Group mean distribution for input-to-hidden weights
    weights_in_1_grp = pm.Normal('w_in_1_grp', 0, sd=1,
                                 shape=(n_data, n_hidden),
                                 testval=init_1)
    # Group standard deviation
    weights_in_1_grp_sd = pm.HalfNormal('w_in_1_grp_sd', sd=1.)

    # Group mean and standard deviation for the hidden-layer biases
    biases_in_1_grp = pm.Normal('b_in_1_grp', 0, sd=1, shape=(n_hidden,), testval=bias_init_1)
    biases_in_1_grp_sd = pm.HalfNormal('b_in_1_sd', sd=1.)

    # Group mean distribution from hidden layer to output
    weights_2_out_grp = pm.Normal('w_2_out_grp', 0, sd=1,
                                  shape=(n_hidden,),
                                  testval=init_out)
    weights_2_out_grp_sd = pm.HalfNormal('w_2_out_grp_sd', sd=1.)

    biases_2_out_grp = pm.Normal('b_2_out_grp', 0, sd=1, shape=(ntargets,), testval=bias_out)
    biases_2_out_grp_sd = pm.HalfNormal('b_2_out_grp_sd', sd=1.)

    # Separate weights for each group: just add a leading group dimension
    weights_in_1_raw = pm.Normal('w_in_1',
                                 shape=(n_grps, n_data, n_hidden))
    # Non-centered parameterisation: raw ~ N(0, 1), scaled and shifted by the group parameters
    weights_in_1 = weights_in_1_raw * weights_in_1_grp_sd + weights_in_1_grp

    biases_in_1_raw = pm.Normal('b_in_1', shape=(n_grps, n_hidden))
    biases_in_1 = biases_in_1_raw * biases_in_1_grp_sd + biases_in_1_grp

    weights_2_out_raw = pm.Normal('w_2_out',
                                  shape=(n_grps, n_hidden))
    weights_2_out = weights_2_out_raw * weights_2_out_grp_sd + weights_2_out_grp

    biases_2_out_raw = pm.Normal('b_2_out', shape=(n_grps, ntargets))
    biases_2_out = biases_2_out_raw * biases_2_out_grp_sd + biases_2_out_grp

    act_1 = pm.math.tanh(tt.batched_dot(Xs, weights_in_1) + biases_in_1)

    act_out = tt.batched_dot(act_1, weights_2_out) + biases_2_out
    pred = pm.Deterministic('pred', act_out)

    out = pm.Normal('out', pred, observed=Ys)

with neural_network:
    # fit the model
    trace_hier = pm.sample(draws=100, chains=1, progressbar=True)
    # sample from the posterior predictive
    ppc_train = pm.sample_ppc(trace_hier, samples=100, progressbar=True)

However, when I run the code, I get this error:

---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-17-1398ce5046f3> in <module>()
     65 
     66 
---> 67     act_1 = pm.math.tanh(tt.batched_dot(Xs_true, weights_in_1)+biases_in_1)
     68 
     69     act_out = tt.batched_dot(act_1, weights_2_out)
....
ValueError: Input dimension mis-match. (input[0].shape[1] = 1167, input[1].shape[1] = 2)

It may be a silly beginner's mistake somewhere in the shape I gave the biases, but I thought it was pretty straightforward: add a leading dimension for the groups, and make the HBNN non-centered in the biases too.
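In case it helps with the diagnosis, this is how I can inspect the shapes going into act_1, using the Theano test values (a quick sketch, run after building the model above; the commented shapes are what I expect from the data description):

print(Xs.get_value().shape)               # (2, 1167, 11): groups, data, features
print(weights_in_1.tag.test_value.shape)  # (2, 11, 5)
print(biases_in_1.tag.test_value.shape)   # (2, 5)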

Let me know if you see any obvious mistake! Thanks so much, and please forgive me if the formatting of this post is bad; I'll improve that too (one neural net at a time :p).

Cheers,
Jo


What is the operation you are expecting in

tt.batched_dot(Xs, weights_in_1) + biases_in_1

?

Specifically, looking at the shape:

Xs_train.shape, weights_in_1.tag.test_value.shape, biases_in_1.tag.test_value.shape
# ==> ((2, 1167, 11), (2, 11, 5), (2, 5))

are you expecting the output to have a shape of (2, 5) or (2, 1167, 5)?
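To see why those particular shapes clash, you can replay the same arithmetic in plain NumPy (random stand-in arrays; here np.einsum('gnf,gfh->gnh', ...) plays the role of tt.batched_dot):

import numpy as np

X = np.random.randn(2, 1167, 11)       # groups, data, features
W = np.random.randn(2, 11, 5)          # groups, features, hidden
b = np.random.randn(2, 5)              # groups, hidden

act = np.einsum('gnf,gfh->gnh', X, W)  # batched dot -> shape (2, 1167, 5)
act + b                                # ValueError: b right-aligns to (1, 2, 5), and 1167 != 2

which reproduces the 1167-vs-2 mismatch from your traceback.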

Hi Junpeng,

I think I want the output to have a shape of (2, 1167, 5). Will have to think about this a bit more…

Cheers,
Jo

That means you need to make sure biases_in_1 ends up with a shape that broadcasts against the (2, 1167, 5) activations; try:

biases_in_1_raw = pm.Normal('b_in_1', shape=(n_grps, 1, n_hidden))
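With that extra length-1 axis the bias broadcasts over the data dimension; a quick NumPy sanity check of the same shapes (stand-in arrays again):

import numpy as np

act = np.random.randn(2, 1167, 5)  # stand-in for tt.batched_dot(Xs, weights_in_1)
b = np.random.randn(2, 1, 5)       # (n_grps, 1, n_hidden): note the explicit length-1 axis
print((act + b).shape)             # (2, 1167, 5): the bias broadcasts over the 1167 data points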

Hi Junpeng,

It works! This is brilliant - thank you so much! Phew, gotta be careful about these things now!

Jo
