Understanding the shape of values returned by sample_posterior_predictive

Thanks, Alex, it works!

For anyone interested, the above “toy” model has to be changed to something like this:

import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as mdl:
    sigma = pm.HalfFlat('sigma', shape=1)
    beta = pm.Flat('beta', shape=(k, 1))
    X = pm.Data('X', X)                      # shared data container, so X can be swapped out later
    mu = tt.dot(X, beta)
    yobs = pm.Normal('yobs', mu=mu, sigma=sigma, observed=y)

trace = pm.sample(1000, tune=2000, target_accept=0.95, model=mdl)
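
For reference, here is a minimal sketch of the inputs the snippet above assumes are already defined; the sizes n = 41 and k = 4 are guesses based on the shapes mentioned further down, not something from the original model:

n, k = 41, 4                         # assumed: 41 training observations, 4 predictors
rng = np.random.default_rng(0)
X = rng.normal(size=(n, k))          # design matrix, one row per observation
y = rng.normal(size=n)               # response vector with one entry per row of X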

Then out-of-sample predictions can be generated with

X_new = np.array([0, 1, 0, 0]).reshape(1, 4)  # must be reshaped to a 1 x k row vector; a flat array does not work
with mdl:
    pm.set_data({'X': X_new})                 # key line: swap the shared data container for the new input
    pp = pm.sample_posterior_predictive(trace)
    post_pred = pp['yobs'][:, 0]

pp['yobs'] has shape (400, 41), so the sampling for the single observation X_new seems to be repeated 41 times. I wonder why PyMC3 does that.
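
As a quick sanity check of that interpretation (just a diagnostic sketch using the names from the snippets above, not an explanation of the behaviour): if every column is simply another draw for the same X_new, the column-wise summaries should all be close.

print(pp['yobs'].shape)              # (n_posterior_samples, 41) in this case

# if each of the 41 columns is an independent draw for the same single input X_new,
# the column-wise means and standard deviations should all be roughly equal
print(pp['yobs'].mean(axis=0))
print(pp['yobs'].std(axis=0))

# under that interpretation, a single column (as above) or the flattened array
# can serve as the posterior predictive distribution for the one new observation
post_pred = pp['yobs'][:, 0]
post_pred_all = pp['yobs'].ravel()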