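Could this be a broadcasting problem? If y_train is 1-D with shape (n,) and the model output has shape (n, 1), an elementwise operation between the two broadcasts to an (n, n) array, which could account for the memory use. What if you set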
y_train = y_train.reshape((y_train.shape[0],1))
See also: "Memory issues with creating simple regression model"
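
A minimal NumPy sketch of the effect I have in mind. `y_pred` here is a hypothetical stand-in for the model output, and I am assuming it has shape (n, 1); your actual shapes may differ, so check them first.

```python
import numpy as np

n = 4
# Hypothetical stand-in for the model output, assumed to have shape (n, 1).
y_pred = np.zeros((n, 1))
# 1-D targets with shape (n,), as they often come out of a loader.
y_train = np.arange(n, dtype=float)

# (n, 1) minus (n,) broadcasts to (n, n): every prediction is paired
# with every target, so memory grows quadratically with n.
diff_bad = y_pred - y_train
print(diff_bad.shape)

# Reshaping the targets to a column keeps the result at (n, 1).
y_train_col = y_train.reshape((y_train.shape[0], 1))
diff_ok = y_pred - y_train_col
print(diff_ok.shape)
```

If the first shape printed for your data is (n, n), the reshape above should bring it back to (n, 1).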