I’ve been playing around with the framework here quite a bit. I love SVGD but it has some performance issues – first, it can be brutally slow with a moderate number of particles, and second the temperature parameter really impacts the posterior estimates. See
https://ferrine.github.io/blog/2017/06/04/asvgd-sanity-check/
What SVGD does really well is track particles to posterior modes – and as far as I’m concerned this is the most substantial challenge in Bayesian modeling. If you know where the modes are and have just an inkling of the local density, one or two iterations of importance resampling can recover the posterior to a high degree of accuracy.
I’ve thrown together some code to tack on KDE-based resampling to an initial model sample:
def sample_unnormalized(p_prop, size):
p_cum = np.cumsum(p_prop)
unis = np.random.uniform(size=size) * p_cum[-1]
return np.greater.outer(unis, p_cum).sum(axis=1)
def sample_kde(dataset, n):
bw = dataset.shape[0] ** (-1./2.)
# deal with numerics a little bit for high-dimensional settings
kde_cov = bw * (np.cov(dataset.T) + np.diag((bw ** 2) * np.ones(dataset.shape[1])))
kde_L = np.linalg.cholesky(kde_cov)
idx = np.random.choice(dataset.shape[0], n, replace=True)
return dataset[idx,:] + np.dot(np.random.normal(size=(n, dataset.shape[1])), kde_L.T)
def kde_isrs_(X, logp, N, n):
Snew = sample_kde(X, N)
liks = np.array([logp(Snew[i,:]) for i in xrange(N)])
probs = np.exp(liks - np.max(liks))
return Snew[sample_unnormalized(probs, n), :]
def kde_isrs(X, logp, N, n, iters=2):
for j in xrange(iters):
X = kde_isrs_(X, logp, N, n)
return X
def vec2vardict(fit, vec, vars_):
vd, offset = dict(), 0
for vname in vars_:
vcols = fit[vname].shape[1] if len(fit[vname].shape) > 1 else 1
vd[vname] = vec[offset:(offset+vcols)]
if len(vd[vname]) == 1:
vd[vname] = vd[vname][0]
offset += vcols
return vd
def kde_isrs_model(model, model_fit, required_vars, N, n, iters=2):
X = np.hstack([pd.DataFrame(model_fit[v]) for v in required_vars])
logp = model.logp
def model_lik(Xvec):
vd = vec2vardict(model_fit, Xvec, required_vars)
return logp(**vd)
Z = kde_isrs(X, model_lik, N, n, iters)
vardict, offset = dict(), 0
for vname in required_vars:
vcols = svgd_fit[vname].shape[1] if len(model_fit[vname].shape) > 1 else 1
vardict[vname] = Z[:, offset:(offset+vcols)]
offset += vcols
return vardict
And its application to the Variational API Quickstart
isrs_res = kde_isrs_model(model, svgd_approx.sample(50), ['x'], 5000, 1000, 1)
ax = sns.kdeplot(trace['x'], label='NUTS');
sns.kdeplot(approx.sample(10000)['x'], label='ADVI');
sns.kdeplot(svgd_approx.sample(2000)['x'], label='SVGD');
sns.kdeplot(isrs_res['x'][:,0], label='SVGD+KDE-ISRS')
My issue is that this is entirely manual (outside of pymc3). The Interpolation trick used here is not particularly happy for higher-dimensions. Is there a pymc3 approach to this?