The two examples are different because you didn’t add beta[0] (the constant) to your dot product.
I also have a question about the shape of your beta. You use len(df.columns), and I can’t tell from the post what shape that gives. A priori I would write your model like this:
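import pymc as pm

with pm.Model() as model:
    # Mutable containers so the data can be swapped out later (e.g. for prediction)
    X = pm.MutableData('X', X_train)
    y = pm.MutableData('y', y_train)
    # One coefficient per predictor column; the constant is a separate scalar
    beta = pm.Normal('beta', mu=0, sigma=5, shape=len(X_train.columns))
    constant = pm.Normal('constant', mu=0, sigma=5)
    p = pm.Deterministic('p', pm.math.sigmoid(constant + X @ beta))
    observed = pm.Bernoulli("obs", p, observed=y)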
The shape of beta then matches the thing I’ll be dotting it with, and I handle the constant separately (or add a column of 1’s to X).
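To make the point about the constant concrete, here is a small NumPy sketch (outside of PyMC) of the two equivalent ways of handling it, plus what happens when it is left out. The data, the coefficient values, and the names `X_ones`, `beta_full`, and `sigmoid` are made up for illustration and are not taken from your model:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical data: 5 observations, 3 predictors (a stand-in for X_train)
X = rng.normal(size=(5, 3))
beta = rng.normal(size=3)   # one coefficient per column of X
constant = 0.7              # intercept, kept as a separate scalar

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Option 1: add the constant explicitly to the dot product
p_separate = sigmoid(constant + X @ beta)

# Option 2: prepend a column of ones and fold the constant into the coefficients
X_ones = np.column_stack([np.ones(len(X)), X])
beta_full = np.concatenate([[constant], beta])
p_ones = sigmoid(X_ones @ beta_full)

# Leaving the constant out of the dot product gives different probabilities
p_no_constant = sigmoid(X @ beta)

print(np.allclose(p_separate, p_ones))
print(np.allclose(p_separate, p_no_constant))
```

The first two computations agree, while the third does not, which is the same kind of mismatch you get when beta[0] is sampled but never added to the dot product.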