Thanks, Alex, it works!
For anyone interested, the above “toy” model has to be changed to something like this:
with pm.Model() as mdl:
sigma = pm.HalfFlat('sigma',shape=1)
beta = pm.Flat('beta',shape=(k,1))
X = pm.Data('X', X)
mu = tt.dot(X, beta)
yobs = pm.Normal('yobs', mu=mu, sigma=sigma, observed=y)
trace = pm.sample(1000, tune=2000, target_accept=0.95, model=mdl)
Then out-of-sample predictions can be generated with
X_new = np.array([0,1,0,0]).reshape(1,4) # does not work without reshaping, X needs to be a row vector!
with mdl:
pm.set_data({'X': X_new }) # key line of code
pp = pm.sample_posterior_predictive(trace)
post_pred = pp['yobs'][:,0]
pp['yobs'] has a shape of (400,41), so the sampling for the one observation X_new seems to be repeated 41 times. I wonder why PyMC3 is doing that.