"Empirical" posterior for dependent variable

I am trying to find drugs that inhibit the migration of cancer cells from microscopy images. Using approximate bayesian computation plus a stochastic simulation (using migration as a simulation parameter), I have found the approximate empirical posterior for the migration parameter. A separate posterior is computed for each image (observation). Each observation is coupled with a drug, dose as well as other covariates.

My goal is to build a model using pymc that explain the changes in the migration parameter depending on the covariates. If the migration parameter was simply an observation this would be simple. It would just be a linear model. The issue here is that the dependent variable is not an observation, it is an empirical known posterior. I could find the MAP of the migration and just plug it in as an observation in the pymc model. But then I lose the uncertainty of the migration posterior.

Is there any way I can find the posterior of the covariates using the already known empirical posterior of the migration using pymc3? Perhaps by utilizing pymc potential in a clever way?

I am not an expert in bayesian computation by any means. I apologize if this is a ludicrous question. :slight_smile:


Why not use pm.Interpolated? You’ll need one of these for each sample – something like

def gen_interp(name, vals, h=0.2, hw=1.5, k=25, plot=True, shape=1):
    xmx, xmn = np.max(vals), np.min(vals)
    xrange = xmx - xmn
    x_interp = np.linspace(xmn-h*xrange, xmx+h*xrange, k)
    y_interp = np.concatenate([[0], sp.stats.gaussian_kde(vals)(x_interp), [0]])
    w = hw*xrange
    x_interp = np.concatenate([[xmn-w], x_interp, [xmx+w]])
    if plot:
        plt.plot(x_interp, y_interp)
    return pm.Interpolated(name, x_interp, y_interp, shape=shape)

with pm.Model() as mod:
    randeff = pm.Normal('random', 0., 1.)
    migration_rnd = tt.zeros(n_sample, dtype=theano.config.floatX)
    for j in range(n_sample):
        migration_rnd = tt.set_subtensor(migration_rnd[j], 
                                         gen_interp('sample_%d' % j, sample_densities[j]))
    intercept = pm.Normal('intercept', 0., 1.)
    err_sd = pm.HalfNormal('err_sd', 1.)
    y_pred = pm.Deterministic('y', migration_rnd * randeff + intercept)
    y_obs = pm.Normal('lik', mu=y_pred, sd=err_sd, observed=Y)
    trace = pm.sample(500, tune=750, chains=9, cores=3, nuts_kwargs={'target_accept': 0.9, 'max_treedepth': 20})

or full gist:

And if the samples all share the same marginal density (i.e. it’s a population estimate) you can use

migration_rnd = gen_interp('sample_effs', x_migration_density, shape=(n_sample,))

as a shortcut