Sampling in an n-ball with HMC

I want to sample uniformly in an n-ball (a 5-ball, for example) and obtain the samples. I want to compare this with rejection sampling, and show how with the latter you have to discard a lot of samples in high dimensions because they fall outside the ball.
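
For reference, the rejection-sampling baseline I have in mind is something like this quick numpy sketch (proposals drawn from the enclosing cube, keeping only the points inside the ball):

import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 100_000
u = rng.uniform(-1, 1, size=(n, d))      # proposals from the enclosing cube [-1, 1]^d
inside = np.sum(u ** 2, axis=1) <= 1.0   # keep points inside the unit d-ball
samples = u[inside]
print(inside.mean())  # acceptance rate = pi^(d/2) / (Gamma(d/2 + 1) * 2^d), about 0.16 for d = 5

Already for d = 5 the acceptance rate is only about 16%, and it shrinks rapidly with dimension.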

(I'm still using PyMC3.)

So, a uniform pdf in a 5-ball of radius r (with r^2=x_1^2+x_2^2+x_3^2+x_4^2+x_5^2 on the boundary) would be

f(\boldsymbol{x})=\frac{\Gamma(\frac{5}{2}+1)}{\pi^{5/2}}(x_1^2+x_2^2+x_3^2+x_4^2+x_5^2)^{-5/2}

if the point is in the ball, and 0 otherwise. I want to obtain the x values with HMC.

So, I define this pdf

def logp(x_1, x_2, x_3, x_4, x_5):
    # log of the normalizing constant minus (5/2) * log of the squared radius
    return (pm.math.log(scipy.special.gamma(7 / 2) / np.pi ** (5 / 2))
            - (5 / 2) * pm.math.log(x_1 ** 2 + x_2 ** 2 + x_3 ** 2
                                    + x_4 ** 2 + x_5 ** 2)).sum()

The next step would be to call DensityDist, but here I'm confused and need some guidance. My objective is to get values of x_1, x_2, ... obeying the constraint r^2\geq x_1^2+x_2^2+x_3^2+x_4^2+x_5^2. How can I do that?

@lucianopaz

At PyMC Labs we actually had to write a probability distribution on an n-ball. The project will become open source later this year, but I can't share the actual code until then.

We based our distribution on the methods described in this paper. Basically, a uniform distribution in the unit 5-ball has a piecewise pdf:

f(x) = \begin{cases} \frac{\Gamma(\frac{5}{2} + 1)}{\pi^{5/2}} & \text{if }||x||\leq 1\\ 0 & \text{if }||x||> 1\\ \end{cases}

Sampling from this will be difficult for HMC because the gradient is always 0 until a point reaches the boundary of the ball. If you want to actually use HMC to draw samples in this space, you will need to implement a transform that maps points from \mathbb{R}^{5}\rightarrow \mathbb{B}^{5}_{2} (where \mathbb{B}^{5}_{2} stands for the unit ball in 5 dimensions under the L_2 norm).

The transform g(x), \mathbb{R}^{5}\rightarrow \mathbb{B}^{5}_{2}, that we implemented was this:

g(x) = \frac{2x}{\pi||x||} \arctan(||x||)
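
To get a feel for what g does, here is a quick standalone numpy sketch (just an illustration, not our actual implementation):

import numpy as np

def g(x):
    # Shrink x radially: ||g(x)|| = (2/pi) * arctan(||x||), which is always < 1
    r = np.linalg.norm(x, axis=-1, keepdims=True)
    return 2 * np.arctan(r) * x / (np.pi * r)

x = np.random.default_rng(1).normal(scale=10.0, size=(1000, 5))
print(np.linalg.norm(g(x), axis=-1).max())  # < 1; approaches 1 as ||x|| grows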

We had a bit of trouble writing down the Jacobian of the transform (you need its log-determinant to get the pdf in the unconstrained space, which is what enables sampling). The Jacobian matrix works out to this expression (we then compute the determinant with aesara.tensor.linalg.det and take its log):

J(g(x)) = \frac{2\,\vec{x}\vec{x}^{T}}{\pi ||x||^{2}\left(1 + ||x||^{2}\right)} - \frac{2\,\vec{x}\vec{x}^{T}\arctan(||x||)}{\pi ||x||^{3}} + \frac{2\arctan(||x||)}{\pi ||x||}\,\mathbb{I}
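
You can sanity-check that expression against finite differences with a throwaway numpy sketch like this:

import numpy as np

def g(x):
    r = np.linalg.norm(x)
    return 2 * np.arctan(r) * x / (np.pi * r)

def jac(x):
    # The closed-form Jacobian matrix from above
    r = np.linalg.norm(x)
    xxT = np.outer(x, x)
    I = np.eye(x.size)
    return (2 / np.pi) * (
        xxT / (r ** 2 * (1 + r ** 2))
        - xxT * np.arctan(r) / r ** 3
        + np.arctan(r) * I / r
    )

x = np.random.default_rng(0).normal(size=5)
eps = 1e-6
# Column j of the Jacobian is dg/dx_j, estimated here by central differences
J_fd = np.stack(
    [(g(x + eps * e) - g(x - eps * e)) / (2 * eps) for e in np.eye(5)], axis=1
)
print(np.allclose(jac(x), J_fd, atol=1e-6))  # True

# The log|det| is what actually enters the change of variables
print(np.log(np.abs(np.linalg.det(jac(x)))))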

With that transform in hand, we could implement a new distribution that used the transform during sampling. That way, HMC becomes very efficient, as the unconstrained space and the Jacobian information allow it to navigate the uniform ball rather well. Our probability distribution actually also had a concentration parameter that allowed it to concentrate points at the center or on the shell.

If you still want to use a DensityDist without a custom transform, you can. But keep in mind that HMC might still struggle because of the sharp discontinuity in the pdf. To get a DensityDist working, you would do this:

import numpy as np
import pymc as pm
from aesara import tensor as at
from matplotlib import pyplot as plt


def logp(value):
    # Log-density of a uniform distribution on the unit n-ball:
    # log(Gamma(n/2 + 1) / pi^(n/2)) inside the ball, -inf outside
    n = value.shape[-1]
    r = at.sqrt(at.sum(value ** 2, axis=-1))
    return at.switch(
        r <= 1, at.log(at.gamma(n / 2 + 1) / np.pi ** (n / 2)), -np.inf
    )


with pm.Model():
    pm.DensityDist("x", logp=logp, shape=2)  # 2-D here so the samples are easy to plot
    step = pm.NUTS(target_accept=0.99, max_treedepth=4)
    idata = pm.sample(step=step)

x = idata.posterior.x.stack(samples=["chain", "draw"]).values
plt.plot(x[0], x[1], "o", alpha=0.6)

You’ll see that the trace is plagued by divergences because HMC has trouble when it hits the edge of the ball. Nevertheless, the posterior kind of looks like what you want to get:

[image: scatter plot of the posterior samples filling the unit disk]
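
You can quantify how badly the sampler struggles by counting the divergent transitions in the sampling stats:

print(int(idata.sample_stats.diverging.sum()))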


Thank you for this great answer! The paper you linked looks like heavy material for me, but I am interested in the details of the Jacobian calculation (if they can be shared; otherwise, no problem at all).

One last question: for the case of PyMC3, it would be at -> tt (import theano.tensor as tt) and

x = idata.x  # shape (4000, 2)
plt.plot(x[:, 0], x[:, 1], "o", alpha=0.6)

Would this be correct? Thanks again.

Yes, for pymc3 you only need to change the import from aesara import tensor as at to from theano import tensor as at. The way you did the plot is ok, because it relies on the result being a MultiTrace instance instead of an InferenceData instance. If you don't want to change the plotting code from my answer, you can simply add return_inferencedata=True to the pm.sample call.
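
Putting it together, the pymc3 version would look roughly like this (an untested sketch; note that in pymc3 DensityDist takes logp as a positional argument):

import numpy as np
import pymc3 as pm
from theano import tensor as tt
from matplotlib import pyplot as plt


def logp(value):
    # Same uniform-ball log-density as before, written with theano.tensor
    n = value.shape[-1]
    r = tt.sqrt(tt.sum(value ** 2, axis=-1))
    return tt.switch(
        r <= 1, tt.log(tt.gamma(n / 2 + 1) / np.pi ** (n / 2)), -np.inf
    )


with pm.Model():
    pm.DensityDist("x", logp, shape=2)
    step = pm.NUTS(target_accept=0.99, max_treedepth=4)
    idata = pm.sample(step=step, return_inferencedata=True)

x = idata.posterior.x.stack(samples=["chain", "draw"]).values
plt.plot(x[0], x[1], "o", alpha=0.6)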


What calculations are you interested in regarding the determinant of the Jacobian? I actually just realized that the math I pasted was the Jacobian matrix itself, not its determinant. We compute the determinant using aesara.tensor.linalg.det and then take the log of that. I have edited my original reply to reflect this.


Ah, ok, I see: the Jacobian is the derivative of g(x). No worries. Thanks again.

@lucianopaz, just thinking out loud here, but why not directly use either uniform sampling on the n-sphere via Gaussians, or the von Mises-Fisher distribution, which is more general?

Sorry if I’m asking something you can’t answer because of your work.

It’s no problem. There are three reasons.
The first is that we needed to sample points inside a d-ball, not on its surface.
The second is that we did use a similar technique involving draws from Gaussian distributions to generate forward samples, but that isn’t enough when you need to do inference. To do inference you have to be able to compute the probability density, and the pdf of a ratio of Gaussians isn’t easy to compute.
The last is that we needed the distribution to be rotationally invariant a priori, so we didn’t want something like a von Mises-Fisher distribution, which sets a preferred direction.


@ccaprani, if you’re interested in generating forward samples on the surface of or inside spheres, I recommend that you take a look here. We are using method 17 there to generate forward samples for our d-ball distribution. The pdf expression is the one written in Corollary 3 of this paper.
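
For a quick flavour of the Gaussian-based approach, the textbook recipe for uniform draws inside a d-ball looks like this (a generic sketch, not our exact implementation; see the link for the method we actually use):

import numpy as np

rng = np.random.default_rng(0)
d, n = 5, 10_000
gauss = rng.normal(size=(n, d))
direction = gauss / np.linalg.norm(gauss, axis=1, keepdims=True)  # uniform on the sphere
radius = rng.uniform(size=(n, 1)) ** (1 / d)  # U^(1/d) makes the radii fill the ball uniformly
samples = direction * radius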


Ah, understood @lucianopaz, thank you for the explanation and the links too!