Scale in potential seems backwards


#1

In step_methods.hmc.quadpotential.QuadPotentialDiag

The constructor sets

        self.s = s
        self.inv_s = 1. / s
        self.v = v

where v is the “Diagonal of covariance matrix for the potential vector”

But then random and the energy are defined as

    def random(self):
        """Draw random value from QuadPotential."""
        return floatX(normal(size=self.s.shape)) * self.inv_s

    def energy(self, x, velocity=None):
        """Compute kinetic energy at a position in parameter space."""
        if velocity is not None:
            return 0.5 * np.dot(x, velocity)
        return .5 * x.dot(self.v * x)

This is backwards. You multiply by the scale to get a random sample and divide by the variance to get the pdf of a Gaussian.

Is this a bug? Or there is some intentional implicit transformations here? All the quad potential classes are the same way.


#2

Using the Euclidean-Gaussian kinetic energy, you can set the inverse Euclidean metric to the target covariances (estimated during tuning, e.g. tune=1000). The kinetic energy is computed as 0.5 * p' * M^-1 * p + log|M| + const, if you only take the diagonal of M^-1 that is v above. The momentum is given as Normal(p | 0, M), which you scale with the inverse of the potential vector as above. You can find more information in Betancourt’s paper (chapter 4.2).


#3

Ok, it looks like it is an intentional transformation then. Maybe the variable names should be changed to clarify scale_posterior vs scale_energy.

So, it looks like QuadPotentialDiag() should be called with the (guess-)estimated variances of the posterior. There is also some inconsistency when setting this up via the step classes. Since scaling in HamiltonianMC() seems to refer to the variances whereas scaling in Metropolis() seems to refer to the standard deviation. This has a lot of potential to mess people up.


#4

Yes, that is a good point. The scaling in HMC is actually much more flexible - it could be (the diagonal of) a covariance matrix or a precision matrix. We should have a detail docs to explain the differences.