Difference between shape = (2,1) & shape = 2

Hi,

Recently, I ran into something very strange. I am working through some toy examples for code verification, and I believed the two models below should return the same results. However, only the second model gives me correct results (the first model gives me posterior distributions centered around 0)… The only difference between the two models is whether I specified shape as a tuple or as an int. I really want to know why this happens. Could someone please explain what is going on?

    import pymc as pm
    import arviz as az

    # df_x: 2-D design matrix, df_y: observed targets (defined elsewhere)
    with pm.Model() as model1:
        n_model_1 = df_x.shape[1]

        sigma_ard = pm.Gamma("sigma_ard", alpha=2, beta=0.01, shape=n_model_1)
        beta = pm.Normal("beta", mu=0, sigma=sigma_ard, shape=n_model_1)
        sigma_ard0 = pm.Gamma("sigma_ard0", alpha=2, beta=0.01)
        beta0 = pm.Normal("beta0", mu=0, sigma=sigma_ard0)
        sigma = pm.Gamma("sigma", alpha=2, beta=0.01)

        y = pm.Normal("y", mu=pm.math.dot(df_x, beta) + beta0, sigma=sigma, observed=df_y)

        sample = pm.sample()

    az.plot_posterior(sample, var_names="beta");


    with pm.Model() as model2:
        n_model_1 = df_x.shape[1]

        sigma_ard = pm.Gamma("sigma_ard", alpha=2, beta=0.01, shape=(n_model_1, 1))
        beta = pm.Normal("beta", mu=0, sigma=sigma_ard, shape=(n_model_1, 1))
        sigma_ard0 = pm.Gamma("sigma_ard0", alpha=2, beta=0.01)
        beta0 = pm.Normal("beta0", mu=0, sigma=sigma_ard0)
        sigma = pm.Gamma("sigma", alpha=2, beta=0.01)

        y = pm.Normal("y", mu=pm.math.dot(df_x, beta) + beta0, sigma=sigma, observed=df_y)

        sample = pm.sample()

    az.plot_posterior(sample, var_names="beta");

Hi, that’s a really good question. It comes down to the broadcasting rules built into systems like numpy and pytensor. Here’s a weird thing about numpy:

np.array([3,4]) + np.array([[7],[6]])

yields:

array([[10, 11],
       [ 9, 10]])

That’s because it adds 7 to [3,4] for the top row and 6 to [3,4] for the bottom row. In regular linear algebra, that’s not a valid operation: you can only add vectors of the same shape. At some point the numpy/MATLAB projects decided to implicitly add dimensions to make operations like this valid.
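
If it helps, you can check what shape broadcasting will produce without building any arrays; np.broadcast_shapes (available in NumPy 1.20+) reports the result:

    import numpy as np

    # A (2,) vector broadcast against a (2, 1) column blows up into a (2, 2) matrix
    print(np.broadcast_shapes((2,), (2, 1)))  # (2, 2)

    # Two vectors of the same shape stay a vector, as ordinary linear algebra expects
    print(np.broadcast_shapes((2,), (2,)))    # (2,)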

The same broadcasting rules give rise to your problem. Suppose my data matrix is 5x2 and my coefficient vector is 2x1:

df_x = np.random.normal(size=(5,2))
beta = np.array([[2],[3]])

Then my output vector is 5x1:

df_y = df_x @ beta

It is important that I preserve these shape requirements when I code the PyMC model. When I use shape = 2, I lose those shape constraints: mu turns out to be a (5,) vector, but df_y is (5,1).
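
A quick numpy check makes the mismatch concrete (reusing df_x, beta, and df_y from above; beta.ravel() mimics what shape = 2 gives you):

    beta_flat = beta.ravel()         # shape (2,), the analogue of shape = 2
    print((df_x @ beta_flat).shape)  # (5,)  -- the column dimension is gone
    print(df_y.shape)                # (5, 1)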

    with pm.Model() as model1:
        beta = pm.Normal("beta", mu=0, sigma=1, shape=2)

        mu = pm.math.dot(df_x, beta)

        y = pm.Normal("y", mu=mu, observed=df_y)

If you try to draw from y, it will throw an error saying that a (5,5) array is not compatible with your observations:

pm.draw(y)

Where did the (5,5) come from? The same funny broadcasting rules: if you take the (5,) mu vector and add a (5,1) column of random noise, you get a (5,5) matrix. Yet we told PyMC that the output of y should look like df_y, a (5,1). So it throws the error.
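
Here is the blow-up in plain numpy (the values are zeros because only the shapes matter):

    import numpy as np

    mu = np.zeros(5)           # shape (5,): what dot(df_x, beta) gives with shape = 2
    noise = np.zeros((5, 1))   # shape (5, 1): one noise draw per observation, as a column

    print((mu + noise).shape)  # (5, 5) -- broadcasting turns the sum into a matrix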

If you flatten df_y into a (5,) vector, the problem goes away. So you’ll just want to stay consistent: either work with column vectors everywhere or with flattened vectors everywhere.
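
For instance, here is a minimal sketch of the column-vector version, assuming df_x is (5,2) and df_y is (5,1) as above (flattening both beta and df_y would work equally well):

    with pm.Model() as model_consistent:
        # beta is an explicit (2, 1) column, so mu keeps its column dimension
        beta = pm.Normal("beta", mu=0, sigma=1, shape=(2, 1))

        mu = pm.math.dot(df_x, beta)  # (5, 2) @ (2, 1) -> (5, 1), matching df_y

        y = pm.Normal("y", mu=mu, observed=df_y)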


That cleared up my problem completely. Thanks for your detailed explanation!

Best,

Jinyoung
