Make predictions with the particles of SVGD

robyz · October 9, 2018, 11:06am

Hi,

In pymc3, how can I make predictions by averaging the particles (models) if I use Stein variational gradient descent for Bayesian neural nets?

In the tutorial, I found the code for GMM:

approx = pm.fit(method=pm.SVGD(n_particles=200, jitter=1.))
trace = approx.sample(10000)

I also wonder what distribution the second line is sampling from. I am confused because, unlike ADVI, “approx” in SVGD should be a set of particles rather than a distribution.

best,
Rob

junpenglao · November 3, 2018, 7:43am

I think it works like using a trace to make predictions. @ferrine?

ferrine · November 3, 2018, 11:09am

Yes, @junpenglao is right. We form a kind of trace (an empirical distribution) from the particles and then sample from it independently. Thank you for pinging me in this thread. I’ve prepared a gist answering your question.

https://gist.github.com/ferrine/57c12e2cd99eb21fa18568e3befa15fb (pymc3-svgd-averaging.ipynb)

This is a bit of a toy example (I had little time), but the code is mostly model-agnostic and shows potential problems with two approaches to “model averaging”.

ferrine · November 3, 2018, 11:12am

If you still have any questions (maybe on internals), I’m happy to answer.

robyz · November 3, 2018, 12:44pm

Thanks for the examples. I still have a few questions:

We set the number of particles to 100 in
approx = pm.fit(method='svgd', inf_kwargs={'n_particles': 100})
For model averaging:
Would trace = approx.sample(100) return exactly the values of the particles that I have trained?

But what is mout1 = approx.sample_node(out, 10000).mean() doing?
Is it drawing 10000 samples from a distribution and then averaging them? If so, what distribution are we sampling from here?

Furthermore, in a Bayesian neural net, I want to average the predictions made by the particles (models) that I initialized in training, rather than use the particles to create another distribution from which I draw MC samples.

ferrine · November 5, 2018, 9:06pm

Would trace = approx.sample(100) return exactly the values of those particles that I have trained?

No, it will sample uniformly from the particles.
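
To illustrate why uniform sampling does not simply hand back the trained particles, here is a minimal plain-Python sketch. It does not use PyMC3: the list `particles` is a hypothetical stand-in for trained particle values, and the sketch assumes that "uniformly" means uniform sampling with replacement.

```python
import random

random.seed(0)

n_particles = 100
# Hypothetical stand-in for the trained particles (one scalar parameter each).
particles = [random.gauss(0.0, 1.0) for _ in range(n_particles)]

# Draw as many samples as there are particles, uniformly and with replacement
# (assumption: this mirrors what sampling from an empirical distribution does).
draws = [random.choice(particles) for _ in range(n_particles)]

# Some particles are drawn several times and others not at all.
n_unique = len(set(draws))
print(n_unique, "of", n_particles, "particles appear in the draw")
```

Under that assumption, a draw of 100 samples from 100 particles repeats some particles and misses others, so it is not the same as iterating over every particle once.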

But what is mout1 = approx.sample_node(out, 10000).mean() doing?

It samples from the empirical distribution and makes the appropriate replacements in the graph. So every Distribution node is replaced with its posterior distribution (symbolically).

I want to average the prediction made by the particles(models)

This is not implemented. Now I see that it might be useful to integrate over the posterior. One way to do it looks like this:

import pymc3 as pm
import theano

# based mostly on https://github.com/pymc-devs/pymc3/blob/master/pymc3/variational/opvi.py#L1098
# or https://github.com/pymc-devs/pymc3/blob/master/pymc3/variational/opvi.py#L1470
def integrate_over_histogram(approx, node):
    # only an empirical approximation stores its particles in a histogram
    if not isinstance(approx, pm.Empirical):
        raise ValueError('You need empirical distribution here, got {}'.format(type(approx)))
    # express the node in terms of the flat input of the approximation
    node = approx.to_flat_input(node)
    def sample(post):
        # evaluate the node with one stored particle substituted for the input
        return theano.clone(node, {approx.input: post})
    # loop over every row (particle) of the histogram
    nodes, _ = theano.scan(sample, approx.histogram)
    return nodes
# given the notebook above
Esin = integrate_over_histogram(approx, out).mean()
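
For comparison, here is a minimal plain-Python sketch of the two averaging strategies discussed in this thread. It does not use PyMC3 or Theano: `particles` and `predict` are hypothetical stand-ins (a sine is used only because the name `Esin` suggests one), and resampling is assumed to be uniform with replacement.

```python
import math
import random

random.seed(1)

# Hypothetical stand-in for the trained particles (one scalar parameter each).
particles = [random.gauss(0.0, 1.0) for _ in range(100)]

def predict(theta):
    # Hypothetical prediction function of a single particle.
    return math.sin(theta)

# Approach 1: evaluate the prediction once per particle and average.
# This corresponds to integrating over the stored particles directly.
exact = sum(predict(t) for t in particles) / len(particles)

# Approach 2: resample particles uniformly (with replacement) and average.
# This is a Monte Carlo estimate of the same quantity.
n_draws = 10000
resampled = sum(predict(random.choice(particles)) for _ in range(n_draws)) / n_draws

print("average over particles:", exact)
print("Monte Carlo estimate:  ", resampled)
```

The first approach is deterministic and needs one evaluation per particle. The second adds Monte Carlo noise on top of the same empirical distribution, so it only approaches the first value as the number of draws grows.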

ferrine · November 5, 2018, 9:11pm

I think that integration of an expression should be implemented in the future. What matters here is efficiency: given a chain of 10000 samples, it is weird to average over them all. Some smart way of choosing important samples is needed.

UPDATE: added initial implementation here https://github.com/pymc-devs/pymc3/pull/3244
