# Truncated/Bounded Multivariate Normal?

I ran into a question on StackOverflow where someone wanted to impose a strong multivariate Gaussian prior on the coefficients in a binomial regression model. I coded up the following PyMC3 model as a suggested solution for them:

```python
import numpy as np
import pymc3 as pm
import theano
import theano.tensor as tt
import matplotlib.pyplot as plt
import arviz as az

# Data
X = np.array([-0.86, -0.30, -0.05, 0.73])
N = np.array([5, 5, 5, 5])
Y = np.array([0, 1, 3, 5])

# augment X for simpler regression expression
X_aug = tt.stack([np.ones_like(X), X]).T

# Prior params
mu = np.array([0, 10])
sd = np.array([22, 102])
Rho = np.array([[1, 0.5],[0.5, 1]])
Sigma = np.outer(sd, sd) * Rho

with pm.Model() as binomial_regression:
    # regression coefficients (strong prior)
    beta = pm.MvNormal('beta', mu=mu, cov=Sigma, shape=2)

    # death probability
    theta_i = pm.Deterministic('theta_i', pm.invlogit(X_aug.dot(beta)))

    # outcomes
    y_i = pm.Binomial('y_i', n=N, p=theta_i, observed=Y)

    trace = pm.sample(10000, tune=100000, target_accept=0.8, random_seed=2018)
```

However, the sampler produced many divergences unless I used a high number of tuning steps, and even then I still got one. I don't find the location of the divergence to be particularly insightful:

I do, however, find the posterior to be insightful about the inappropriateness of the prior, namely that the slope coefficient should really never be negative:

So, I thought a better solution that would still allow for the modeler's strong prior would be to use a truncated Gaussian that restricts the second coefficient to be nonnegative. However, I'm not sure how to do this, hence the question:

How does one make a truncated/bounded multivariate normal in PyMC3?

Bonus Points: If anyone has thoughts about the divergence issue, those are also welcome. I'm still pretty new to HMC/NUTS, so my experience with reparameterizing models is rather weak.

Maybe the easiest is to put a potential in the model, something like `pm.Potential("bound", tt.switch(beta[0] > 0, 0, -np.inf))`, which makes the sampler reject any beta[0] smaller than 0.
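For intuition, the switch expression is an indicator function on the log scale: it adds 0 to the model's log-probability inside the allowed region and -inf outside it, so any proposal violating the constraint is rejected. A minimal sketch of that logic, with NumPy's `where` standing in for `tt.switch` (the function name `bound_logp` is just illustrative, not part of the PyMC3 API):

```python
import numpy as np

def bound_logp(beta0):
    # 0 where the constraint holds, -inf where it is violated,
    # mirroring tt.switch(beta[0] > 0, 0, -np.inf)
    return np.where(beta0 > 0, 0.0, -np.inf)

# proposals with beta[0] > 0 leave the model logp unchanged;
# proposals with beta[0] <= 0 get logp -inf and are rejected
print(bound_logp(np.array([1.5, -0.3, 0.0])))
```

Note that this truncates the prior without renormalizing it, which is fine for sampling since MCMC only needs the log-density up to a constant.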


Out of curiosity, how would I specify multiple bounds? Let's say I have a bivariate distribution, and I want both beta[0] and beta[1] to be non-negative.
The following (admittedly naive) approach does not work:

```python
pm.Potential("bound", tt.switch(beta[0] > 0 and beta[1] > 0, 0, -np.inf))
```

How about: `pm.Potential("bound", tt.switch((beta[0] > 0)*(beta[1] > 0), 0, -np.inf))`?


Thanks a lot. In principle, this works.

However, I noticed an undesired effect that the introduction of a potential has on the posterior distribution. Even if I specify a trivial potential

```python
muBounded = pm.Potential("muBounded", tt.switch(1, 0, 0))
```

the sampled posterior distribution changes. Is this expected, and if so, why?

Here's the model specification with potential:

```python
with pm.Model() as update_model:
    # non-centered parametrization
    mean_error = pm.Normal('mean_error', 0, 1, shape=dimension)
    mu = pm.Deterministic('mu', mu_marginal_mean + tt.dot(mu_cholesky, mean_error))
    muBounded = pm.Potential("muBounded", tt.switch(1, 0, 0))

    L_packed_error = pm.Normal('L_packed_error', 0, 1, shape=packedMatrixDimension[dimension])
    L_packed = pm.Deterministic('L_packed', L_packed_marginal_mean + tt.dot(L_packed_cholesky, L_packed_error))

    obs = pm.MvNormal('obs', muBounded, chol=pm.expand_packed_triangular(dimension, L_packed), observed=observations)
```

This is the original model without potential:

```python
with pm.Model() as update_model:
    # non-centered parametrization
    mean_error = pm.Normal('mean_error', 0, 1, shape=dimension)
    mu = pm.Deterministic('mu', mu_marginal_mean + tt.dot(mu_cholesky, mean_error))

    L_packed_error = pm.Normal('L_packed_error', 0, 1, shape=packedMatrixDimension[dimension])
    L_packed = pm.Deterministic('L_packed', L_packed_marginal_mean + tt.dot(L_packed_cholesky, L_packed_error))

    obs = pm.MvNormal('obs', mu, chol=pm.expand_packed_triangular(dimension, L_packed), observed=observations)
```

As you can see from the posterior distribution shown below (see second plot), the sampled distribution no longer centers around the "true" values, which are represented by black vertical lines. Interestingly, this only affects the variable that is not constrained directly (L_packed, not the mu to which I apply the bound).

The full code of the MWE, as well as the corresponding plot of the model without potential is available in this thread: Updating multivariate priors

I'd be very curious about any opinion on this!

Likely there are other things interacting with your parameters - did you try to evaluate the logp to see if they are the same?

Thanks for the hint!

It turns out I just made a basic mistake. If I introduce a trivial potential (note the second argument within switch(), which is now mu instead of 0!)

```python
muBounded = pm.Potential("muBounded", tt.switch(1, mu, 0))
```

it all works as expected.
