My confusion then becomes: why does this logic not extend to the binary response case? If you look at the first code block,
z_student has shape (1, students) while z_question has shape (questions, 1).
OK, I think I see - if you have shapes (1, m) and (n, 1), then numpy will do the broadcasting for you:
>>> import numpy as np
>>> a = np.random.randn(1, 5)
>>> b = np.random.randn(3, 1)
>>> (a - b).shape
(3, 5)
On the other hand, if you have shapes (k, m) and (n, k), then in general it won’t, since the aligned dimensions (k vs. n, and m vs. k) neither match nor equal 1:
>>> a = np.random.randn(2, 5)
>>> b = np.random.randn(3, 2)
>>> (a - b).shape
ValueError: operands could not be broadcast together with shapes (2,5) (3,2)
So perhaps first figure out what shape you expect a - b to have, then reshape a and b accordingly so that the subtraction broadcasts.
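As a rough sketch of what that reshaping could look like: suppose (this is my assumption, not something stated above) that k is a shared latent dimension, m is the number of students and n the number of questions, and that you want one difference per (question, student, dimension) triple. Then you could insert size-1 axes so the shapes line up as (1, m, k) and (n, 1, k). The names a_expanded, b_expanded and diff are just for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed sizes: k latent dimensions, m students, n questions.
k, m, n = 2, 5, 3
a = rng.standard_normal((k, m))  # shape (2, 5)
b = rng.standard_normal((n, k))  # shape (3, 2)

# Assumed target: diff[i, j, d] = a[d, j] - b[i, d], with shape (n, m, k).
a_expanded = a.T[np.newaxis, :, :]  # (1, m, k)
b_expanded = b[:, np.newaxis, :]    # (n, 1, k)
diff = a_expanded - b_expanded      # broadcasts to (n, m, k)

print(diff.shape)  # (3, 5, 2)
```

If you actually want a different result shape (for example one number per question/student pair), the size-1 axes would go elsewhere, or you would reduce over the k axis afterwards; that depends on the model.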