Specifying a flat Dirichlet distribution

gillitf1 · September 4, 2024, 3:53pm

I would like to specify a K-dimensional flat Dirichlet distribution such that each dimension is effectively sampled uniformly, and such that each K-dimensional sample sums to 1.

Based on Wikipedia’s Dirichlet distribution article, I was expecting that if all K concentration parameters were set to 1, each dimension would be uniformly sampled. That is, if I drew many samples from pm.Dirichlet.dist(a=np.ones(K)) then the K marginal distributions would all be approximately uniform or flat. The following snippet attempts to test this.

import numpy as np
import matplotlib.pyplot as plt
import pymc as pm
import arviz as az

n_params = 4
samples = pm.draw(pm.Dirichlet.dist(a=np.ones(n_params)), draws=2048)
print(samples.shape)
print(np.all(np.isclose(1, samples.sum(axis=-1))))
fig, ax = plt.subplots(nrows=n_params, sharex=True)
for i_param, _ax in enumerate(ax):
    az.plot_dist(samples[...,i_param], rug=False, ax=_ax)

>> (2048, 4)
>> True

However, when I run the above snippet, the marginal distributions do not appear flat, but rather are skewed right. Below is the plot of the marginal distributions from the above snippet.

[Figure: density plots of the four marginal distributions from the snippet above, each skewed right]

The “Create an array of standard uniform distributions” thread poses essentially the same question (see item 2 from that thread below), and the proposed solution is similar to the above snippet. Likewise, that proposed solution yields the same results as I’ve observed above.

1. A square array X of independent random variables. Each element of the array follows the standard uniform distribution.

2. Same as X, except each column is constrained to sum to one. To sample from this distribution, I can use y = np.random.uniform(0, 1, size=(5, 5)); y /= np.sum(y, axis=0)

Am I misunderstanding how to specify a flat Dirichlet distribution? Is it possible to specify a Dirichlet distribution that is uniform over all points in its support? Perhaps I’m misunderstanding how the marginal distributions should look for a flat Dirichlet distribution?

Version info:

python                    3.12.2
pytensor                  2.25.2
pytensor-base             2.25.2
pymc                      5.16.2
pymc-base                 5.16.2
macOS 14.5
Apple M3 Max

gillitf1 · September 4, 2024, 4:14pm

Based on the following Stack Exchange thread, it seems that the marginal distributions of \text{Dirichlet}(\vec{\alpha}) are \text{Beta}(\alpha_i, \sum_{j \neq i} \alpha_j).

calculus - Marginal of Dirichlet distribution is Beta (integral) - Mathematics Stack Exchange

In that case, if \text{Beta}(1, 1) is a uniform distribution [wikipedia], then perhaps it’s not possible to have all of a Dirichlet’s marginal distributions be uniform distributions. That is, perhaps it’s possible to specify a Dirichlet distribution that is uniform over all points in its support, but not such that all its marginal distributions are uniform.
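A quick numerical check of that marginal result, as a sketch only: it uses NumPy's `Generator.dirichlet` rather than `pm.Dirichlet.dist` (an assumption made to keep the example dependency-free), and the sample size, seed, and grid are arbitrary choices. For K = 4 and all concentrations equal to 1, each marginal should follow \text{Beta}(1, 3), whose CDF is 1 - (1 - x)^3.

```python
import numpy as np

rng = np.random.default_rng(0)
K = 4
samples = rng.dirichlet(np.ones(K), size=50_000)  # shape (50000, K)

# Beta(1, K-1) has CDF F(x) = 1 - (1 - x)**(K - 1) and mean 1 / K.
grid = np.linspace(0.0, 1.0, 101)
beta_cdf = 1.0 - (1.0 - grid) ** (K - 1)

# Largest gap between each marginal's empirical CDF and the Beta(1, K-1) CDF.
max_gap = max(
    np.abs((samples[:, i, None] <= grid).mean(axis=0) - beta_cdf).max()
    for i in range(K)
)

# For comparison: the gap between the Beta(1, K-1) CDF and a uniform CDF.
uniform_gap = np.abs(beta_cdf - grid).max()

print("marginal means:", samples.mean(axis=0))   # each should be near 1 / K
print("max gap to Beta(1, K-1) CDF:", max_gap)   # small if the marginals match
print("gap between Beta(1, K-1) and uniform:", uniform_gap)
```

If the marginals really are \text{Beta}(1, K-1), the first gap should shrink toward zero as the number of draws grows, while the second stays large, which matches the right-skewed plots above.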

ricardoV94 · September 4, 2024, 5:23pm

Yes, I don’t think what you want is possible with the Beta distribution (and I don’t know of any other distribution that would achieve that)

colcarroll · September 4, 2024, 5:58pm

This is neat. Here’s a little notebook: Google Colab

Interesting that you can have uniform marginals in dimension 2, and in dimension 3 the 2-d marginals are uniform (i.e., p(x_1, x_2), p(x_1, x_3), and p(x_2, x_3) are all uniform in their support), though the 1-d marginals are Beta(1, 2).

In higher dimensions, I guess you still get uniform (d-1)-dimensional “marginals” with a Dirichlet(1, …, 1), but the 1- and 2-d marginals get less and less uniform as the hyper-triangle concentrates mass.
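These observations can be spot-checked without the notebook. The sketch below is not the notebook's code; it assumes NumPy's `Generator.dirichlet` as the sampler, and the thresholds (0.25 and 0.5) are arbitrary test regions chosen because their probabilities are easy to work out by hand.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 100_000

# Dimension 2: x_1 ~ Beta(1, 1), i.e. uniform on [0, 1],
# so P(x_1 < 0.25) should be about 0.25.
x2 = rng.dirichlet(np.ones(2), size=n)
frac_below_quarter = (x2[:, 0] < 0.25).mean()

# Dimension 3: (x_1, x_2) is uniform on the triangle
# x_1 >= 0, x_2 >= 0, x_1 + x_2 <= 1 (area 0.5).
# The square [0, 0.5]^2 lies inside that triangle and has area 0.25,
# so about half of the draws should land in it.
x3 = rng.dirichlet(np.ones(3), size=n)
frac_in_square = ((x3[:, 0] < 0.5) & (x3[:, 1] < 0.5)).mean()

# The 1-d marginal in dimension 3 is Beta(1, 2), with CDF 1 - (1 - x)**2,
# so P(x_1 < 0.25) should be about 1 - 0.75**2 = 0.4375 rather than 0.25.
frac_1d = (x3[:, 0] < 0.25).mean()

print(frac_below_quarter, frac_in_square, frac_1d)
```

The third number is the one that shows the non-uniformity: a uniform 1-d marginal would put only a quarter of the mass below 0.25.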

bob-carpenter · September 8, 2024, 6:48pm

That’s technically impossible because the entries in the simplex generated from a Dirichlet are intrinsically correlated due to the sum-to-one constraint. If one goes up, the sum of the others must go down.

If you try to generate N values independently from beta distributions, the probability that they form a simplex (i.e., sum to 1) is zero because of the lower intrinsic dimensionality of simplexes. Simplexes with N entries are of dimension N - 1 because \theta_N = 1 - \theta_1 - ... - \theta_{N-1}, so the last beta distribution would be forced to generate exactly that value, which has probability 0 as a point in a continuous distribution.

If you set \alpha = 1, then \textrm{Dirichlet}(\alpha) is uniform over simplexes. That doesn’t mean that the entries in the simplex are marginally independent.
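The dependence between entries is easy to see numerically. This is a minimal sketch, again assuming NumPy's `Generator.dirichlet` as the sampler with an arbitrary seed and sample size. For \textrm{Dirichlet}(1, ..., 1) with K entries, the standard Dirichlet covariance formula gives a pairwise correlation of -1 / (K - 1).

```python
import numpy as np

rng = np.random.default_rng(2)
K = 4
samples = rng.dirichlet(np.ones(K), size=100_000)

# Pairwise correlations between the K entries (columns are variables).
corr = np.corrcoef(samples, rowvar=False)      # shape (K, K)
off_diag = corr[~np.eye(K, dtype=bool)]        # the K * (K - 1) off-diagonal entries

# The last entry carries no extra information: it is 1 minus the others.
last_is_determined = np.allclose(
    samples[:, -1], 1.0 - samples[:, :-1].sum(axis=1)
)

print("mean off-diagonal correlation:", off_diag.mean())  # theory: -1 / (K - 1)
print("last entry determined by the rest:", last_is_determined)
```

Every off-diagonal correlation should come out negative, which is the "if one goes up, the others must go down" effect in numbers.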
