Create an array of standard uniform distributions

I want to create:

  1. A square array X of independent random variables, where each element follows the standard uniform distribution.
  2. The same as X, except that each column is constrained to sum to one. To sample from this distribution, I can use y = np.random.uniform(0, 1, size=(5, 5)); y /= np.sum(y, axis=0)
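For reference, here is a minimal NumPy sketch of sampling both targets at the 100 x 100 size. It only draws samples and does not define a PyMC3 model; the seed and the names `rng`, `x`, and `y` are my own choices for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n = 100

# (1) n x n array of independent standard uniform draws
x = rng.uniform(0.0, 1.0, size=(n, n))

# (2) the same draws, with each column rescaled to sum to one
y = x / x.sum(axis=0, keepdims=True)

print(x.shape, y.shape)
```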

I absolutely need (1); (2) is nice to have.

What is the correct way to do this for an array of 100 x 100 random variables?

I was thinking about the following approach, which seems horribly inefficient.


import theano.tensor as tt
import pymc3 as pm


# Shape of the array of random variables
var_shape = (4, 4)
with pm.Model() as opmodel:
    var_array = []
    for i in range(var_shape[0]):
        for j in range(var_shape[1]):
            var_now = pm.Uniform('a_%d_%d' % (i, j),
                                 lower=0, upper=1)
            var_array.append(var_now)

    # Theta is an array of independent
    # uniform random variables
    theta = tt.as_tensor_variable(var_array)
    theta = theta.reshape(var_shape)

    # Normalize each column of the array
    normalized_theta = theta / tt.sum(theta, axis=0)

    # Use theta as the prior distribution for NUTS ...

For (1), the PyMC3 equivalent of np.random.uniform(0, 1, size=(5, 5)) is a single vectorized variable with a shape argument, which replaces the nested loop:

pm.Uniform('x', 0, 1, shape=(5,5))

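For (2), just use a Dirichlet. Note that it normalizes along the last axis, so each row sums to one; transpose the result if you need the columns to sum to one: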

>>> import numpy as np
>>> import pymc3 as pm
>>> with pm.Model() as mod:
...   u = pm.Dirichlet('u', np.ones(5, dtype=np.float32), shape=(5,5))
...   tr = pm.sample_prior_predictive(8)
... 
>>> tr['u'][0,:,:]
array([[0.32075288, 0.21272646, 0.03227816, 0.11944185, 0.31480065],
       [0.16121582, 0.12052498, 0.04596736, 0.44466906, 0.22762277],
       [0.00860465, 0.15319304, 0.67191376, 0.11300245, 0.0532861 ],
       [0.02679962, 0.36995717, 0.12941121, 0.15723846, 0.31659354],
       [0.22661067, 0.09003119, 0.50470718, 0.14877758, 0.02987339]])
>>> tr['u'][0,:,:].sum(axis=1)
array([1., 1., 1., 1., 1.])
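The session above sums to one along axis=1, i.e. per row, while the question asks for columns. Here is a small NumPy-only sketch of the transpose fix; it uses NumPy's own Dirichlet sampler rather than PyMC3, and the seed and variable names are assumptions for illustration. One caveat: a flat Dirichlet is uniform on the simplex, which is not the same distribution as independent uniforms divided by their column sum, so check that it is the prior you actually want.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, only for reproducibility
n = 5

# Each draw from a flat Dirichlet is a length-n vector that sums to one,
# so stacking n draws gives an array whose rows sum to one.
rows = rng.dirichlet(np.ones(n), size=n)  # shape (n, n)

# Transpose so that each column sums to one instead.
cols = rows.T

print(rows.sum(axis=1))
print(cols.sum(axis=0))
```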