I am sampling multiple parameters associated with a 3D Gaussian:

```
with pm.Model() as model:
mu_1 = pm.Uniform('mu_1', lower=0,upper=1)
mu_2 = pm.Uniform('mu_2', lower=0,upper=1)
mu_3 = pm.Uniform('mu_3', lower=-1,upper=1)
std_1 = pm.Uniform('std_1',lower=0.01,upper=1.0)
std_2 = pm.Uniform('std_2',lower=0.01,upper=1.0)
std_3 = pm.Uniform('std_3',lower=0.01,upper=1.0)
rho = pm.Uniform('rho', lower=-1,upper=1)
##### rest of the model code afterwards...
```

My problem is that I want mu_2 <= mu_1 whereas the above code assumes a uniform prior over 0-1 for both independently (so you could end up with mu_2 > mu_1). mu_3 can be anything between -1 to 1. What I want is illustrated in the following figure: if you plot mu_2 vs. mu_1, the sampled points should always fall below the one-to-one diagonal line, i.e., uniformly in the lower triangular area.

Does PyMC have such a lower triangular Uniform prior thatâ€™s easy to use to enforce mu_2 <= mu_1?

For example, can I simply use mu_1 as the upper bound for mu_2 such that in any iteration, the sampler first picks mu_1, then chooses mu_2 to be lower or is that unallowed / ill-behaved? My guess is this would require sequential sampling whereas I want to use NUTS so I need something else that jointly draws (mu_1,mu_2) from a uniform lower triangular support.

```
mu_1 = pm.Uniform('mu_1', lower=0,upper=1)
mu_2 = pm.Uniform('mu_2', lower=0,upper=mu_1)
```

1 Like

Sampling from a triangle without a builtin sampler/distribution is a bit tricky. Your naive idea can actually be sampled with NUTS but you will probably get biased samples towards the (0,0) corner.

I also tried to just use the ordered transform, but it still doesnâ€™t look uniform.

I got interesting pattern by sampling from a uniform Dirichlet with 3 and ignoring one of the dimensions. This is not what you wanted, but itâ€™s a triangle

The best though was by sorting two unconstrained uniforms and calling that my variable.

1 Like

Oh nice on the Dirichlet! How/why does that work? Does it mean Iâ€™m sampling 1 extra unnecessary variable, and will that mess up my convergence / increase runtime?

Another idea: here is a formula that generates two random uniform numbers and transforms them to return a point uniformly within a triangle defined by its three vertices: python - Generate random locations within a triangular domain - Stack Overflow

In our case of a lower triangle, A = (0,0), B = (1,0), C = (1,1)

I donâ€™t think it actually works, but the nice looking result comes from the fact that the values canâ€™t sum to more than one.

Ah, okay in your case of sorting two uniforms, how do I actually access/use mu_1 and mu_2 inside the model? Would that work forme? For example I need to do:

```
with pm.Model() as m:
x_latent = pm.Uniform("x_latent", 0, 1, shape=(2,))
x = pm.Deterministic("x", x_latent.sort())
### set up and evaluate MvNormal PDF within the rest of my model
mu_vector = pt.stack([mu_1,mu_2,mu_3])
### now build the covariance matrix using the above stochastic RVs
cov_matrix = pt.stack([[std_1**2, 0.0, rho*std_1*std_3],
[0.0, std_2**2, 0.0],
[rho*std_1*std_3, 0.0, std_3**2]])
mvn = pm.MvNormal.dist(mu=mu_vector,cov=cov_matrix)
probs_params = pt.exp(pm.logp(mvn,points)) # I need to evaluate MvNormal PDF at list of points
```

I updated my message, the Dirichlet is not what you wanted. I wonder if it can be convert to what you want by some trivial transformation.

You can use the x just like you would use latent_x.

`mu_1 = x[0]`

if I understand you correctly

ah yes, that makes sense, let me try

If you have only 3 variables you can also define them separately and then mu is something like `pt.stack([pt.min(x0, x1), pt.max(x0, x1), x2])`

. Doesnâ€™t really matter how you combine the variables

that wonâ€™t lead to any weird convergence issues or problems with the gradient for NUTS?

Thatâ€™s a good question and honestly I donâ€™t know. I could imagine multi-modality arising easily.

The naive solution may also work just fine if the posterior is well informed.

Hmm yeah Iâ€™m specifically testing the impact of a uniform vs non-uniform prior over this triangular region in the limit of small sample sizes, so I need a uniform lower triangular sampling.

Why wouldnâ€™t your Dirichlet idea work? Because of the extra unnecessary variable?

Iâ€™m running your sorted approach now. I will also try this formula based on re-scaling two uniform randoms: python - Generate random locations within a triangular domain - Stack Overflow

One last dumb question: in your sorted approach, you have an upper triangular matrix whereas I need a lower one. Is there a simple way to change `x = pm.Deterministic("x", x_latent.sort())`

to achieve this? Currently I am using your `pt.stack([pt.min(x), pt.max(x), mu_3]) `

idea.

@ricardoV94 I think I may have got it based on python - Generate random locations within a triangular domain - Stack Overflow

This is just a simple algebraic transformation of two Uniform(0,1) randoms into a new point (x,y) uniformly within a triangle given its 3 vertices. For our lower triangle, the vertices are `tri_A = (0,0); tri_B = (1,0); tri_C = (1,1)`

.

```
r1 = pm.Uniform('r1', lower=0,upper=1)
r2 = pm.Uniform('r2', lower=0,upper=1)
s1 = pt.sqrt(r1)
mu_2 = tri_A[0] * (1.0 - s1) + tri_B[0] * (1.0 - r2) * s1 + tri_C[0] * r2 * s1
mu_1 = tri_A[1] * (1.0 - s1) + tri_B[1] * (1.0 - r2) * s1 + tri_C[1] * r2 * s1
```

Note the reversed order of assignment of mu_2 and mu_1 (this ensures mu_2 <= mu_1 always).

NUTS was auto-assigned and is currently running.

I guess in this case r1 and r2 are latent variables?

1 Like

Nevermind, the simple algebraic transformation version is taking forever to sample for some reasonâ€¦

Surprising this is so hard since itâ€™s such a simple thing to sample 2 variables below the 1:1 diagonal lineâ€¦

Edit: anyone know if I can add some kind of if/else statement I can add so that if mu_2 > mu_1, then the model just returns logp=-inf for the current parameter set? Of course this wouldnâ€™t work with NUTS but Slice sampler is fineâ€¦

You can add a Potential, but Iâ€™m sure that will yield the same bias when sampling from the prior as your naive solution.

This need not be a problem.Your posterior draws, which is what you care about may not be biased just because the prior ones are when using mcmc

Sampling constrained variables with mcmc without a custom gibbs sampler can be quite tricky. You are thinking about how easy forward/random draws are, but for mcmc sampling I wouldnâ€™t expect it to be easy.

Even forward draws can be tricky, such as when trying to draw uniformly from a disc: https://m.youtube.com/watch?v=4y_nmpv-9lI&t=276

Maybe this is helpful? Defining priors for dependent random variables in STAN - #2 by jsocolar - Modeling - The Stan Forums

The suggestion in model 1 is to model x2 as the fraction of x1. I think that doesnâ€™t cut it.

The suggestion is model 2 is to add a penalty term for lower values of x1, so that the joint probability is uniform, not the conditionals, which you could do with a Potential?

As mentioned this means the marginal prior probability wonâ€™t look uniform which I think makes sense.

The potential adjustment seems to do the trick!

```
import pymc as pm
with pm.Model() as m:
x1 = pm.Uniform("x1", 0, 1)
x2 = pm.Uniform("x2", 0, x1)
# Penalize low x1, so that joint of x1,x2 is uniform
pm.Potential("pot", pm.math.log(x1))
x = pm.Deterministic("x", pm.math.stack([x1, x2]))
idata = pm.sample()
pm.plots.plot_pair(idata, var_names=["x"], marginals=True);
```

1 Like

THANK YOU @ricardoV94 ! I will start a run with this now.

My actual model assumes a simple Poisson likelihood at the end â€“ do I need to do anything else to my model with-block for it to take into account the new Potential?

Also, can Potentials introduce convergence/divergence/slow-down issues with NUTS?

```
mu_1 = pm.Uniform('mu_1', lower=0,upper=1)
mu_2 = pm.Uniform('mu_2', lower=0,upper=mu_1)
# Ricardo's Potential idea to penalize small mu_2 so joint (mu_1,mu_2) is uniform
pm.Potential("pot", pm.math.log(mu_2))
mu_3 = pm.Uniform('mu_3', lower=-1,upper=1)
std_1 = pm.Uniform('std_1',lower=0.01,upper=1.0)
std_2 = pm.Uniform('std_2',lower=0.01,upper=1.0)
std_3 = pm.Uniform('std_3',lower=0.01,upper=1.0)
rho = pm.Uniform('rho', lower=-1,upper=1)
# use current sample of mu and std's to evaluate MvNormal at some pre-defined points
mu_vector = pt.stack([mu_1,mu_2,mu_3])
cov_matrix = pt.stack([[std_1**2, 0.0, rho*std_1*std_3],
[0.0, std_2**2, 0.0],
[rho*std_1*std_3, 0.0, std_3**2]])
mvn = pm.MvNormal.dist(mu=mu_vector,cov=cov_matrix)
probs_params = pt.exp(pm.logp(mvn,points))
### Other model code here to compute expected value for Poisson ###
# finally: define Poisson likelihood
z_obs = pm.Poisson('z_obs', mu=mu_expected, observed=data)
### does Potential "pot" have to be used anywhere else after being defined?
```

No, it should work fine and you donâ€™t need to use potential anywhere.

1 Like