Truncated/Bounded Multivariate Normal?


#1

I ran into a question on StackOverflow where someone wanted to impose a strong multivariate Gaussian prior on the coefficients in a binomial regression model. I coded up the following PyMC3 model as a suggested solution for them:

import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
import matplotlib.pyplot as plt
import arviz as az

# Data
X = np.array([-0.86, -0.30, -0.05, 0.73])
N = np.array([5, 5, 5, 5])
Y = np.array([0, 1, 3, 5])

# augment X for simpler regression expression
X_aug = tt.stack([np.ones_like(X), X]).T

# Prior params
mu = np.array([0, 10])
sd = np.array([22, 102])
Rho = np.array([[1, 0.5],[0.5, 1]])
Sigma = np.outer(sd, sd) * Rho

with pm.Model() as binomial_regression:
    # regression coefficients (strong prior)
    beta = pm.MvNormal('beta', mu=mu, cov=Sigma, shape=2)

    # death probability
    theta_i = pm.Deterministic('theta_i', pm.invlogit(X_aug.dot(beta)))

    # outcomes
    y_i = pm.Binomial('y_i', n=N, p=theta_i, observed=Y)

    trace = pm.sample(10000, tune=100000, target_accept=0.8, random_seed=2018)

However, without a high number of tuning steps there were many divergences, and even then I still got one. I don’t find the location of the divergence to be particularly insightful:

[plot showing the location of the divergence]

I do, however, find the posterior insightful about the inappropriateness of the prior: the slope coefficient should really never be negative.

So, I thought a better solution that would still allow for the modeler’s strong prior would be to use a truncated Gaussian that restricts the second coefficient to be nonnegative. However, I’m not sure how to do this, hence the question:

How does one make a truncated/bounded multivariate normal in PyMC3?

Bonus Points: If anyone has thoughts about the divergence issue, those are also welcome. I’m still pretty new to HMC/NUTS, so my experience with reparameterizing models is rather weak.
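For intuition about what such a truncated prior looks like (not part of the original question), here is a minimal numpy sketch using rejection sampling: draw from the unconstrained MvNormal and keep only draws whose slope coefficient is nonnegative. The parameters mirror the prior in the model above.

```python
import numpy as np

# Rejection-sampling sketch of a truncated bivariate normal: draw from the
# unconstrained MvNormal and keep only draws with a nonnegative slope.
# The prior parameters below mirror those in the model above.
mu = np.array([0.0, 10.0])
sd = np.array([22.0, 102.0])
Rho = np.array([[1.0, 0.5], [0.5, 1.0]])
Sigma = np.outer(sd, sd) * Rho

rng = np.random.default_rng(2018)
draws = rng.multivariate_normal(mu, Sigma, size=10_000)
kept = draws[draws[:, 1] >= 0]  # truncation: slope coefficient >= 0
```

This is only a prior-predictive illustration; inside a model the same effect is typically achieved via the log-probability, as the answers below discuss.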


#2

Maybe the easiest is to put a potential in the model, something like pm.Potential("bound", tt.switch(beta[0] > 0, 0, -np.inf)), which makes the sampler reject any beta[0] smaller than 0.
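As a plain-numpy sketch of what this potential contributes (np.where standing in for tt.switch): it adds 0 to the joint log-probability where the constraint holds and -inf where it is violated, so any offending proposal is always rejected.

```python
import numpy as np

def bound_potential(beta):
    # numpy analogue of tt.switch(beta[0] > 0, 0, -np.inf): contributes 0 to
    # the joint log-probability where the constraint holds and -inf where it
    # is violated, so offending proposals are always rejected
    return np.where(beta[0] > 0, 0.0, -np.inf)
```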


#3

Out of curiosity, how would I specify multiple bounds? Let’s say I have a bivariate distribution, and I want both beta[0] and beta[1] to be non-negative.
The following (admittedly naive) approach does not work:

pm.Potential("bound", tt.switch(beta[0] > 0 and beta[1] > 0, 0, -np.inf))

Thanks a lot in advance!


#4

How about: pm.Potential("bound", tt.switch((beta[0] > 0)*(beta[1] > 0), 0, -np.inf))?
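For anyone wondering why the multiplication trick is needed: Python's `and` tries to reduce each operand to a single truth value, which fails for symbolic Theano tensors much as it does for numpy arrays, whereas multiplying the boolean conditions acts as an elementwise logical AND. A small numpy illustration:

```python
import numpy as np

a = np.array([1.0, -1.0])
b = np.array([2.0, 3.0])

# Python's `and` tries to reduce each operand to a single truth value,
# which raises for numpy arrays; symbolic tensors hit the same problem.
try:
    (a > 0) and (b > 0)
except ValueError:
    pass  # "the truth value of an array ... is ambiguous"

# Multiplying the boolean conditions acts as an elementwise logical AND:
both = (a > 0) * (b > 0)
```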


#5

Thanks a lot. In principle, this works.

However, I noticed an undesired effect that introducing a potential has on the posterior distribution. Even if I specify a trivial potential

muBounded = pm.Potential("muBounded", tt.switch(1, 0, 0))

the sampled posterior distribution changes. Is this expected, and if so, why?

Here’s the model specification with potential:

with pm.Model() as update_model:
    # non-centered parametrization
    mean_error = pm.Normal('mean_error', 0, 1, shape=dimension)
    mu = pm.Deterministic('mu', mu_marginal_mean + tt.dot(mu_cholesky, mean_error))
    muBounded = pm.Potential("muBounded", tt.switch(1, 0, 0))

    L_packed_error = pm.Normal('L_packed_error', 0, 1, shape=packedMatrixDimension[dimension])
    L_packed = pm.Deterministic('L_packed', L_packed_marginal_mean + tt.dot(L_packed_cholesky, L_packed_error))

    obs = pm.MvNormal('obs', muBounded, chol=pm.expand_packed_triangular(dimension, L_packed), observed=observations)

This is the original model without potential:

with pm.Model() as update_model:
    # non-centered parametrization
    mean_error = pm.Normal('mean_error', 0, 1, shape=dimension)
    mu = pm.Deterministic('mu', mu_marginal_mean + tt.dot(mu_cholesky, mean_error))

    L_packed_error = pm.Normal('L_packed_error', 0, 1, shape=packedMatrixDimension[dimension])
    L_packed = pm.Deterministic('L_packed', L_packed_marginal_mean + tt.dot(L_packed_cholesky, L_packed_error))

    obs = pm.MvNormal('obs', mu, chol=pm.expand_packed_triangular(dimension, L_packed), observed=observations)

As you can see from the posterior distribution shown below (see second plot), the sampled distributions no longer center around the “true” values, which are represented by black vertical lines. Interestingly, this only affects the variable that is not constrained directly (L_packed, not mu, to which I apply the bound).

The full code of the MWE, as well as the corresponding plot of the model without potential is available in this thread: Updating multivariate priors

I’d be very curious to hear any opinions on this!


#6

Likely there are other things interacting with your parameters - did you try to evaluate the logp to see if they are the same?
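In PyMC3 that check would be something along the lines of model.logp(model.test_point) for both models. The underlying idea, sketched below with hypothetical numpy stand-ins for the two models' joint log-densities, is that a potential which is genuinely zero everywhere cannot change the logp, so if the posteriors differ, something else in the model must have changed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for the two models' joint log-densities: a potential
# that is genuinely zero everywhere leaves the logp unchanged at every point.
def logp_plain(mu):
    return -0.5 * np.sum(mu ** 2)

def logp_with_potential(mu):
    potential = 0.0  # what a truly trivial potential contributes
    return -0.5 * np.sum(mu ** 2) + potential

points = rng.normal(size=(100, 3))
agree = all(np.isclose(logp_plain(p), logp_with_potential(p)) for p in points)
```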


#7

Thanks for the hint!

It turns out I just made a basic mistake. If I introduce a trivial potential (note the second argument within switch(), which is now mu instead of 0!)

muBounded = pm.Potential("muBounded", tt.switch(1, mu, 0))

it all works as expected.
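A plain-numpy analogue (np.where standing in for tt.switch) of why the original version misbehaved: switch(cond, a, b) returns a when cond is true, so switch(1, 0, 0) evaluates to 0 and the observed MvNormal was centered at zero, while switch(1, mu, 0) passes mu through unchanged.

```python
import numpy as np

mu = np.array([1.2, -0.7])

# np.where standing in for tt.switch: switch(cond, a, b) returns a when cond
# is true, so switch(1, 0, 0) is just 0 -- the observed MvNormal was being
# centered at zero instead of at mu. switch(1, mu, 0) passes mu through.
buggy = np.where(1, 0, 0)
fixed = np.where(1, mu, 0)
```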