Dimension mismatch in the Normal likelihood with Theano variables

Hello,

I am carrying out a Bayesian regression and want to predict on out-of-sample data. For this, I’m using pm.Data() to set the input training data.

I notice that I encounter a dimension mismatch in the likelihood when I use Theano shared variables, but not when I use plain NumPy arrays.
Please refer to the code below (I hope it’s clear enough):

    with pm.Model() as model:
        X1 = pm.Data('X1', x1)  # shape = (1, 26)
        X2 = pm.Data('X2', x2)  # shape = (1, 26)

        # shape of y_tr = (26, 26)
        training_data = pm.Data('y_train', y_tr)

        # priors on A, a_, b_, c_ omitted here for brevity;
        # this "should" return a vector of shape (1, 26)
        mu_y = A * X1 + CustomFunction(a_, b_, c_, X2)

        sigma_y = pm.InverseGamma("sigma_y", 1, 1, shape=(1, 26))

        # Likelihood
        Y_obs = pm.Normal('Y_obs', mu_y, sigma_y, observed=training_data)

Error: Input dimension mis-match (input[0].shape[0] = 26, input[1].shape[0] = 1)
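
For reference, the mismatch can be reproduced with a plain Theano sketch of the same shapes (the zeros stand in for my actual data):

    import numpy as np
    import theano
    import theano.tensor as tt

    mu = theano.shared(np.zeros((1, 26)))   # same shape as mu_y
    obs = tt.constant(np.zeros((26, 26)))   # same shape as training_data

    # raises ValueError: Input dimension mis-match.
    # (input[0].shape[0] = 26, input[1].shape[0] = 1)
    print((obs - mu).eval())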

Any leads? Thanks.

Can you provide the full error message?

Do you get the error as soon as you specify the likelihood? Can you print X1.broadcastable and X2.broadcastable?

Hello @cluhmann, here’s the full error message. I have already cross-checked the dimensions of each variable several times, and they are exactly as described above. Thanks for the help.

> File "C:/User/U674666/Project/Metamodelling-and-Optimization/src/PyMC3/AERO/Bayesian_Model.py", line 165, in <module>
>     Y_obs = pm.Normal('Y_obs', mu_y, sigma_y, observed=training_data)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\pymc3\distributions\distribution.py", line 122, in __new__
>     return model.Var(name, dist, data, total_size, dims=dims)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\pymc3\model.py", line 1176, in Var
>     var = ObservedRV(
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\pymc3\model.py", line 1812, in __init__
>     self.logp_elemwiset = distribution.logp(data)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\pymc3\distributions\continuous.py", line 536, in logp
>     return bound((-tau * (value - mu) ** 2 + tt.log(tau / np.pi / 2.0)) / 2.0, sigma > 0)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\theano\tensor\var.py", line 120, in __sub__
>     return theano.tensor.basic.sub(self, other)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\theano\graph\op.py", line 253, in __call__
>     compute_test_value(node)
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\theano\graph\op.py", line 130, in compute_test_value
>     required = thunk()
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\theano\graph\op.py", line 606, in rval
>     thunk()
>   File "C:\Users\U674666\Anaconda3\envs\env_py39_full\lib\site-packages\theano\link\c\basic.py", line 1771, in __call__
>     raise exc_value.with_traceback(exc_trace)
> ValueError: Input dimension mis-match. (input[0].shape[0] = 26, input[1].shape[0] = 1)

Hello @ricardoV94, yes, I get the error where the likelihood is defined. I believe the model itself is fairly simple: there’s a mean mu_y of shape (1, 26), a sigma_y of shape (1, 26) describing the spread about each mean value, and the observed data training_data of shape (26, 26). I want to understand why pm.Data() causes problems when the likelihood is defined.
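
For comparison, plain NumPy broadcasts these shapes without complaint, which is what I expected here:

    import numpy as np

    mu = np.zeros((1, 26))
    obs = np.zeros((26, 26))

    # the length-1 leading axis broadcasts against the 26 rows
    print((obs - mu).shape)  # (26, 26)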

Here’s the output for both variables, X1 and X2:

    print(X1.broadcastable)
    # (False, False)
    print(X2.broadcastable)
    # (False, False)

Thanks for helping out.

The problem is that broadcastable pattern. You have to tell Theano that the first dimension of the data is broadcastable (i.e., has size 1). I don’t know whether your version of PyMC already lets you pass broadcastable=(True, False) when creating X1 and X2. You can in the new beta version of PyMC for sure, and if not in yours, you can use theano.shared directly (pm.Data is just a wrapper around a shared variable).
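
Something like this (an untested sketch, assuming x1 and x2 are the (1, 26) arrays from your first snippet):

    import theano

    # declare the leading length-1 axis as broadcastable
    X1 = theano.shared(x1, broadcastable=(True, False))
    X2 = theano.shared(x2, broadcastable=(True, False))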


Hey @ricardoV94, thanks a lot. That was indeed the problem, and it was resolved (I believe) by using theano.shared() with the broadcastable argument.

For the same model, I’m now stuck on a different problem: pm.sample() fails with

    pymc3.exceptions.SamplingError: Initial evaluation of model at starting point failed!
    Initial evaluation results:
    a_              -19.33
    b__log__          1.02
    mu_c_            -1.35
    c_               -1.35
    A               -23.89
    sigma_y_log__   -33.98
    Y_obs             -inf
    Name: Log-probability of test_point, dtype: float64
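
As a side note, this table is what model.check_test_point() returns for the model below, so the -inf in Y_obs can be reproduced without calling pm.sample() at all:

    # log-probability of each variable at the model's test point;
    # Y_obs shows up as -inf here as well
    print(model.check_test_point())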

Could you please have a look at the code below and maybe suggest what could have gone wrong with this model?

    import pymc3 as pm
    import theano
    import theano.tensor as tt

    # x1, x2 are (1, 26) arrays; y_train_ is a (26, 26) array
    with pm.Model() as model:

        X1 = theano.shared(x1, broadcastable=(True, False))
        X2 = theano.shared(x2, broadcastable=(True, False))

        # Observed data, shape = (26, 26)
        training_data = theano.shared(y_train_)

        numG = 6

        def sumGaussians(a, b, c, f):
            # sum of `numG` Gaussian bumps evaluated at f; result has shape (1, 26)
            total = tt.zeros(shape=(1, 26))
            for g in range(numG):
                total = total + a[g] * pm.math.exp(-1.0 * pm.math.sqr((f - b[g]) / c[g]))
            return total

        # Heights of the Gaussians
        a_ = pm.Normal("a_", mu=50, sigma=10, shape=(numG, 1))

        # Centres of the Gaussians
        mu_b_ = 3.0
        sigma_b_ = 1.0
        alpha_b_ = mu_b_ ** 2.0 / sigma_b_
        beta_b_ = mu_b_ / sigma_b_
        b_ = pm.Gamma("b_", alpha=alpha_b_, beta=beta_b_, shape=(numG, 1))

        # Widths of the Gaussians
        mu_c_ = pm.Normal("mu_c_", mu=3, sigma=0.5, shape=(numG, 1))
        c_ = pm.Normal("c_", mu=mu_c_, sigma=0.5, shape=(numG, 1))

        A = pm.Normal("A", mu=10, sigma=1, shape=(1, 26))

        # The mean of the likelihood is assumed to be a combination of `X1` and `X2`
        temp1 = tt.mul(A, tt.log(tt.power(X1, 6.0) / 0.001))  # shape = (1, 26)
        temp2 = sumGaussians(a_, b_, c_, X2)                  # shape = (1, 26)
        temp3 = tt.stack([temp1, temp2], axis=0)              # shape = (2, 1, 26)

        # shape = (1, 26)
        mu_y = pm.Deterministic("mu_y", tt.mean(temp3, axis=0))

        # Noise in the data
        sigma_y = pm.InverseGamma("sigma_y", 1, 1, shape=(1, 26))

        # Likelihood
        Y_obs = pm.Normal('Y_obs', mu_y, sigma_y, observed=training_data)

        trace = pm.sample(
            draws=1000,
            tune=1000,
            init="adapt_diag",
            chains=2,
            cores=2,
            target_accept=0.95,
        )

Thanks again!


I might suggest opening a new topic, since with this one marked as “solved”, people may be less likely to read/respond.
