Advice on dimensionality reduction?

How can I quickly sample from my models' priors and then evaluate those samples in the context of the model and data?

Use a data-less model to generate parameters, and a data-ful one to evaluate their likelihood:

import numpy as np
import pymc3 as pm

with pm.Model() as priors_only:
    # k = dimensionality of the data
    μ = pm.Normal('mu', 0, 1, shape=k)
    L_Σ_vec = pm.LKJCholeskyCov('L_vec', n=k, eta=0.5, sd_dist=pm.HalfNormal.dist(1))
    sim_params = pm.sample(500)


with pm.Model() as data_model:
    μ = pm.Normal('mu', 0, 1, shape=k)
    L_Σ_vec = pm.LKJCholeskyCov('L_vec', n=k, eta=0.5, sd_dist=pm.HalfNormal.dist(1))
    L_Σ = pm.Deterministic('L', pm.expand_packed_triangular(k, L_Σ_vec))
    X = pm.MvNormal('obs', μ, chol=L_Σ, observed=data)  # data: (n, k) array of observations
    llik = data_model.logp  # compiled log joint density -- this step is super important
    # trace.point(i) returns the free-variable values (in transformed space) that logp expects
    dliks = [llik(sim_params.point(i)) for i in range(500)]
    # these are log[P(X|theta) P(theta)]
    print('mean log joint density: %f' % np.mean(dliks))

Would this just be pm.sample_prior_predictive()?

I think that would generate a matrix of simulated observations that "should look like" the data.
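
For what it's worth, here is a minimal sketch of what I'd expect pm.sample_prior_predictive to return here (assuming a recent-enough PyMC3; it returns a dict holding prior draws of the named variables, including simulated 'obs'):

with data_model:
    # forward-samples every named variable in the model from the prior
    prior_draws = pm.sample_prior_predictive(samples=500)

prior_draws['mu'].shape   # (500, k) -- prior draws of the mean vector
prior_draws['obs'].shape  # (500,) + data.shape -- simulated datasets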

As an aside, the mean log density calculated above amounts to a naive first-order Monte Carlo integration; the whole point of NUTS is to do something smarter. Really, all this gives you is a (hand-wavy) sense of the KL divergence between the un-fit prior and the posterior of the model. From that perspective it's difficult to distinguish between a "bad model" (NO settings of the parameters are particularly good) and "bad parameters" (the model is right, but your initial values stink).
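
To make that concrete: a first-order Monte Carlo estimate of the evidence averages the likelihood itself (not the log joint) over prior draws, i.e. p(X) ≈ (1/N) Σ_i p(X|θ_i) with θ_i ~ p(θ). A rough sketch of what that would look like, reusing the prior trace sim_params and the observed variable X from data_model above, and working in log space via logsumexp:

from scipy.special import logsumexp

# log-likelihood log P(X | theta_i) at each prior draw (prior term excluded)
logliks = np.array([X.logp(sim_params.point(i)) for i in range(500)])

# naive Monte Carlo estimate of the log evidence:
# log p(X) ≈ log( (1/N) * sum_i exp(loglik_i) ),  theta_i ~ p(theta)
log_evidence = logsumexp(logliks) - np.log(len(logliks))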

A more robust alternative might be to use ADVI (which should be far faster than NUTS) to get parameter draws that better approximate the posterior:

with data_model:
    sim_params = pm.ADVI().fit().sample(500)
    # same trick as above: trace.point(i) supplies the values logp expects
    dliks = [llik(sim_params.point(i)) for i in range(500)]

That should at least help distinguish between poor models and poor initial prior parameters.
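
As a rough heuristic (hypothetical names: say you keep the two sets of log joint densities around as dliks_prior and dliks_advi instead of overwriting dliks):

print('mean log joint under prior draws: %f' % np.mean(dliks_prior))
print('mean log joint under ADVI draws:  %f' % np.mean(dliks_advi))
# If both are poor, suspect the model itself; if only the prior-based number
# is poor, the model is probably fine and the priors are just badly placed.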