Multivariate Laplace Prior

I have an inference problem where I am using pm.Laplace priors to introduce regularization into my model. However, I would like this regularization to be pooled across multiple parameters via a multivariate Laplace distribution. Is this something that is available in PyMC, and if not, is it something I can program in myself?

The bottom of the Wikipedia article gives you everything you need to write a generative graph. Once you have a generative graph, you can use pm.CustomDist to implement the distribution easily; in the background, PyMC will automatically derive a logp for the distribution:

import pymc as pm
import pytensor.tensor as pt

def multivariate_laplace(mu, chol, size=None):
    # Zero-mean MvNormal draw scaled by the square root of an Exponential
    # mixing variable, then shifted by mu
    Y = pm.MvNormal.dist(mu=0, chol=chol, size=size)
    broadcast_size = pt.broadcast_shape(mu, Y)
    W = pm.Exponential.dist(lam=1, size=broadcast_size)

    return pt.sqrt(W) * Y + mu

You can use it as follows:

# I am going to use a single known covariance to check we can recover it,
# in reality you would use a normal random variable here
chol_rv, *_ = pm.LKJCholeskyCov.dist(n=3, eta=1, sd_dist=pm.Gamma.dist(2, 1))
chol_draw = pt.as_tensor_variable(pm.draw(chol_rv))

# Also fix a mu for testing, so variance from draws of mu doesn't get into beta
mu_rv = pm.Normal.dist(mu=[1, 10, 100], sigma=10)
mu_draw = pt.as_tensor_variable(pm.draw(mu_rv))

with pm.Model() as m:
    # Sample from the data generating process
    beta = pm.CustomDist('beta', mu_draw, chol_draw, dist=multivariate_laplace)
    idata = pm.sample_prior_predictive(samples=10_000)
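
A quick recovery check along these lines (just a sketch; it compares the empirical mean and covariance of the prior draws against the fixed mu_draw and chol_draw) could be:

import numpy as np

# Stack the chain/draw dimensions into a single (n_samples, 3) array
beta_samples = idata.prior["beta"].values.reshape(-1, 3)

mu_true = mu_draw.eval()
Sigma_true = chol_draw.eval() @ chol_draw.eval().T

print("sample mean:", beta_samples.mean(axis=0))
print("true mu:    ", mu_true)
print("sample cov:\n", np.cov(beta_samples, rowvar=False))
print("true Sigma:\n", Sigma_true)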

Note that I made a change from the Wikipedia article: I don’t multiply the mean by W. I found that this generation process worked for mu = 0 (I was able to recover the desired covariance matrix from samples), but not for mu ≠ 0. I checked the source of Wikipedia’s algorithm (Kotz, Kozubowski, and Podgórski, 2001), but the wiki faithfully reproduces their algorithm. Just shifting the samples, though, allowed me to recover both the mean and the covariance. Here are the samples I get:

I will note that the algorithm proposed on the wiki comes from the chapter on Asymmetric Laplace Distributions, so that might be why there’s a difference?

You’ll need to provide a custom logp; PyMC won’t be able to derive that (the product implies a convolution of two latent RVs).

Ah shoot, I forgot to check that. I was really getting into playing with the generating function.

Well, the wiki has the logp function too, so it just means you pass a logp function (written in PyTensor) to CustomDist instead of the generative graph (dist).

You can define both, so it’s usable for both logp evaluation and forward sampling. We could add this to pymc-experimental, along the same lines as Implement Skellam distribution by wd60622 · Pull Request #260 · pymc-devs/pymc-experimental · GitHub
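
In CustomDist terms that would look something like this (just a sketch, reusing mu_draw, chol_draw, and multivariate_laplace from the snippets above; multivariate_laplace_logp is a stand-in for the density you would still have to write):

def multivariate_laplace_logp(value, mu, chol):
    # Placeholder: the log of the density from the wiki, written in pytensor
    ...

with pm.Model() as m:
    beta = pm.CustomDist(
        "beta",
        mu_draw,
        chol_draw,
        dist=multivariate_laplace,       # forward / prior predictive sampling
        logp=multivariate_laplace_logp,  # used when evaluating the model logp
    )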

Thanks for the input! Then, for the logp, should I just implement the log of the likelihood function listed on the wiki? If so, how do I implement the Bessel function? Is there an option for that in PyTensor?

Looks like we have Bessel functions of the first kind (pt.j0 and pt.j1), which you could use to build the second kind (check the article on Bessel functions).

Some of the math heavy hitters might have better insight into whether this might be more complex (har har) than it seems – @maresb @aseyboldt ?

Looks like there’s also pt.iv (the modified Bessel function of the first kind of order v)
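
For what it’s worth, a rough, untested sketch of the logp could build K_v from pt.iv using the reflection formula K_v(x) = (π/2) * (I_{-v}(x) - I_v(x)) / sin(vπ), which only holds for non-integer v (true here, since v = (2 - k)/2 and k = 3), and plug it into the symmetric density from the wiki shifted by mu:

import numpy as np
import pytensor.tensor as pt
from pytensor.tensor.slinalg import solve_triangular

def kv(v, x):
    # Modified Bessel function of the second kind built from the first kind;
    # only valid for non-integer v and not numerically careful for large x
    return (np.pi / 2) * (pt.iv(-v, x) - pt.iv(v, x)) / pt.sin(v * np.pi)

def multivariate_laplace_logp(value, mu, chol):
    # Symmetric multivariate Laplace log-density (wiki parameterization),
    # shifted by mu; assumes value is a single k-vector with no batch dims
    k = value.shape[-1]
    v = (2 - k) / 2

    # q = (x - mu)' Sigma^{-1} (x - mu), computed via the Cholesky factor
    z = solve_triangular(chol, value - mu, lower=True)
    q = pt.sum(z**2)

    # log|Sigma| = 2 * sum(log(diag(chol)))
    logdet = 2 * pt.sum(pt.log(pt.diagonal(chol)))

    return (
        np.log(2.0)
        - (k / 2) * np.log(2 * np.pi)
        - 0.5 * logdet
        + (v / 2) * (pt.log(q) - np.log(2.0))
        + pt.log(kv(v, pt.sqrt(2 * q)))
    )

With something like that in hand, it would slot into the logp argument of the CustomDist above.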

Documentation PRs welcome!