Linear regression: prediction on holdout dataset

Interesting, thank you! I wonder if your approach would also work with a hierarchical model (as described in this thread)?

    import pymc3 as pm

    with pm.Model() as model:
        # std for likelihood
        s = pm.HalfNormal('sigma', 2.)
        # Cholesky factor of the covariance matrix
        packed_L_Omega = pm.LKJCholeskyCov('packed_L_Omega', n=n_predictors, eta=1, sd_dist=pm.HalfNormal.dist(1.))
        L_Omega = pm.expand_packed_triangular(n_predictors, packed_L_Omega)
        # group means
        mu = pm.Normal('mu', 0., 5., shape=(n_predictors, n_groups))
        # group slopes (non-centered parameterization)
        beta_base = pm.Normal('beta_base', 0., 1., shape=(n_predictors, n_groups))
        beta = pm.Deterministic('beta', mu + pm.math.dot(L_Omega, beta_base))
        # slopes on the original X scale (undoes the QR rotation)
        t_beta = pm.Deterministic('t_beta', pm.math.dot(R_train_inv, beta))
        # group intercepts (one per group, so that alpha[group_idx] works below)
        alpha = pm.Normal('alpha', 0., 5., shape=n_groups)
        # likelihood:
        # group_idx is a `(n_samples,)` array of integer group indices
        y_hat = alpha[group_idx] + (Q_train * beta[:, group_idx].T).sum(axis=-1)
        y = pm.StudentT('y', nu=2, mu=y_hat, sd=s, observed=y_train)

`n_groups` doesn't change between train and test time. `group_idx` does change, though, in accordance with the new `X_test`. Thoughts?
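For what it's worth, here is a minimal numpy sketch of how the test-time mean prediction could look once you have posterior draws. Since `t_beta = R_train_inv @ beta` already undoes the QR rotation, a draw of `alpha` and `t_beta` combines with the raw `X_test` and its new `group_idx` using the same indexing as the likelihood above. The arrays here are random stand-ins for actual trace draws, and `group_idx_test` is a hypothetical name for the holdout group assignments:

```python
import numpy as np

rng = np.random.default_rng(0)
n_predictors, n_groups, n_test = 3, 4, 6

# Stand-ins for a single posterior draw (in practice, e.g. trace['alpha'][i]
# and trace['t_beta'][i] from the fitted model)
alpha = rng.normal(size=n_groups)                    # one intercept per group
t_beta = rng.normal(size=(n_predictors, n_groups))   # slopes on the original X scale

# Holdout design matrix with its own group assignments
X_test = rng.normal(size=(n_test, n_predictors))
group_idx_test = rng.integers(0, n_groups, size=n_test)

# Same indexing pattern as the model's likelihood, but with the test-time
# group_idx; t_beta lets us use raw X_test instead of a Q_test
y_hat = alpha[group_idx_test] + (X_test * t_beta[:, group_idx_test].T).sum(axis=-1)
print(y_hat.shape)  # (6,)
```

Looping over all posterior draws (or vectorizing over them) would then give the full posterior predictive mean for the holdout set.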