Did you check the trace? I am getting the same result from Model 1, and it is also much faster with NUTS.
This is the code I am using (a small rewrite to make it more compact):
import numpy as np
import pandas as pd
import pymc3 as pm

df = pd.read_csv('mixture.csv')
X = df['X'].values[:, None]  # shape (n, 1) so it broadcasts against the (1, k) parameters
Y = df['Y'].values
k = 2  # number of mixture components

with pm.Model() as Mixture_regression1:
    # Prior for the mixture weights
    π = pm.Dirichlet('π', np.array([1]*k), testval=np.ones(k)/k)
    # Priors for the unknown model parameters
    α = pm.Normal('α', mu=0, sd=100, shape=(1, k))  # Intercept
    β = pm.Normal('β', mu=0, sd=100, shape=(1, k))  # Slope
    σ = pm.HalfCauchy('σ', 5, shape=k)  # Noise
    mu = α + β*X  # shape (n, k): one regression line per component
    likelihood = pm.NormalMixture('likelihood', π, mu, sd=σ, observed=Y)
    trace = pm.sample(1000, tune=1000)
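
In case it helps to see what the NormalMixture likelihood above is evaluating for each data point, here is a minimal plain-Python sketch of the log density of a mixture of linear regressions. This is only an illustration of the formula log Σ_k π_k · Normal(y | α_k + β_k·x, σ_k), not PyMC3's actual implementation; the function names `normal_logpdf` and `mixture_logp` are made up for this example.

```python
import math


def normal_logpdf(y, mu, sd):
    """Log density of Normal(mu, sd) evaluated at y."""
    return -0.5 * math.log(2 * math.pi) - math.log(sd) - 0.5 * ((y - mu) / sd) ** 2


def mixture_logp(y, x, weights, alpha, beta, sigma):
    """Log density of one observation (x, y) under a k-component
    mixture of linear regressions (hypothetical helper, for illustration)."""
    # One term per component: log weight + log density around that component's line
    terms = [
        math.log(w) + normal_logpdf(y, a + b * x, s)
        for w, a, b, s in zip(weights, alpha, beta, sigma)
    ]
    # log-sum-exp, shifted by the max term for numerical stability
    m = max(terms)
    return m + math.log(sum(math.exp(t - m) for t in terms))
```

Summing `mixture_logp` over all rows of the data gives the total log likelihood for one set of parameter values, which is the quantity the sampler works with (together with the priors) when exploring the posterior.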