If you want to constrain the covariance matrix like this, you’re going to have to deal with the fact that not every resulting covariance matrix will be positive definite, so you will need to define your priors in a very specific way.
To illustrate, think about decomposing the covariance matrix as described in footnote 16 of the Stan manual - as \Sigma = \text{Diag}(\sigma) \times \Omega \times \text{Diag}(\sigma), where \sigma is a k \times 1 vector of standard deviations and \Omega is a k \times k matrix of correlation coefficients. In principle, the diagonal of \Omega should be all ones, but we can think about a degenerate family of \Omega where we instead choose the diagonal so that the diagonal of \Sigma is all ones.
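To make the decomposition concrete, here is a quick NumPy illustration (the numbers are made up for this example):

import numpy as np

sigma = np.array([1.5, 0.8])            # standard deviations
Omega = np.array([[1.0, 0.3],
                  [0.3, 1.0]])          # correlation matrix
Sigma = np.diag(sigma) @ Omega @ np.diag(sigma)
# diag(Sigma) is sigma ** 2; the off-diagonal entries are sigma_1 * sigma_2 * rho
print(Sigma)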
Take the case where k=2, and define \Omega = \begin{bmatrix} a & b \\ c & d \end{bmatrix} and \sigma = [\sigma_1, \sigma_2]^T. This gives:
\Sigma = \begin{bmatrix}\sigma^2_1 a & \sigma_1\sigma_2 b \\ \sigma_1 \sigma_2 c & \sigma^2_2 d \end{bmatrix}
Since we want to restrict the diagonal of \Sigma to 1, define a = \frac{1}{\sigma_1^2} and d = \frac{1}{\sigma_2^2}; then we can set b = c = 1.
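With these choices the covariance matrix becomes:

\Sigma = \begin{bmatrix} 1 & \sigma_1 \sigma_2 \\ \sigma_1 \sigma_2 & 1 \end{bmatrix}

so the off-diagonal entry \sigma_1 \sigma_2 plays the role of a correlation coefficient.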
In PyMC it might look like this:
import numpy as np
import pymc3 as pm
import theano.tensor as tt

with pm.Model() as m:
    # standard deviations; bounding them below 1 keeps sigma_1 * sigma_2 < 1
    sigma = pm.Uniform('sigma', lower=1e-6, upper=1 - 1e-6, shape=2)
    sigma_diag = tt.diag(sigma)
    # all-ones Omega (b = c = 1) with diagonal 1 / sigma**2, so diag(Sigma) = 1
    omega = tt.ones((2, 2))
    omega = tt.set_subtensor(omega[np.diag_indices(2)], sigma ** -2)
    cov = pm.Deterministic('cov', tt.dot(tt.dot(sigma_diag, omega), sigma_diag))
    mean = pm.Normal('mean', 0, 10, shape=2)
    # data: the (1000, 2) array of observations being modeled
    pm.MvNormal('y', mu=mean, cov=cov, shape=(1000, 2), observed=data)
As I mentioned, however, this family of covariance matrices is degenerate, and will not always be positive definite. To see this, we solve for eigenvalues \lambda using this characteristic polynomial:
\lambda^2 - (\sigma_1^2 a + \sigma^2_2 d) \lambda - (cb - ad)\sigma^2_1 \sigma^2_2 = 0
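Applying the quadratic formula, the eigenvalues are:

\lambda = \frac{(\sigma_1^2 a + \sigma_2^2 d) \pm \sqrt{(\sigma_1^2 a + \sigma_2^2 d)^2 + 4(cb - ad)\sigma_1^2 \sigma_2^2}}{2}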
The positive branch is always positive, but the negative branch is only positive when ad > cb: the square root term is smaller than \sigma_1^2 a + \sigma_2^2 d exactly when cb - ad < 0. We require both branches (that is, both eigenvalues) to be positive in order for this to be a valid covariance matrix. Plugging in the values for a, b, c, d found above, ad > cb becomes \frac{1}{\sigma_1^2 \sigma_2^2} > 1, or equivalently:
\sigma_1 \sigma_2 < 1
Obviously this is only for the k=2 case, but in general you are always going to have a system of k-1 inequalities that need to hold in order to get a valid covariance matrix. If all the values of \sigma are less than 1 you should be OK (at least it seems so from some quick numeric tests; I can't prove this). If any of them go above 1, you will have to check that the system of inequalities holds.
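The quick numeric test I have in mind looks something like this, assuming the construction generalizes the k=2 recipe above (unit diagonal, \sigma_i \sigma_j off the diagonal):

import numpy as np

rng = np.random.default_rng(0)
k, n_trials = 4, 10_000

n_pd = 0
for _ in range(n_trials):
    sigma = rng.uniform(0, 1, size=k)       # all sigmas below 1
    Sigma = np.outer(sigma, sigma)          # sigma_i * sigma_j off the diagonal
    np.fill_diagonal(Sigma, 1.0)            # unit diagonal
    n_pd += np.all(np.linalg.eigvalsh(Sigma) > 0)

print(f'fraction positive definite: {n_pd / n_trials:.3f}')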
So you could implement all this, and place priors on your \sigma variables to ensure they stay in a “good” region, but at some point you have to stop and ask yourself “why am I doing this? Is it really necessary?”
Edit:
I was thinking about this more, and you can interpret the off-diagonal elements of a restricted covariance matrix like the one you proposed as correlation coefficients. So I went through all of that just to show that correlations are between 0 and 1. Not exactly covering myself in glory here.
Here’s my shot at redemption:
import numpy as np
import pymc3 as pm
import theano.tensor as tt
from scipy import stats

k = 4

# simulate data from a "true" covariance matrix with unit diagonal
true_cov = np.random.uniform(0, 0.3, size=(k, k))
true_cov = true_cov @ true_cov.T
true_cov[np.diag_indices(k)] = 1.0
mu = np.ones(k)
data = stats.multivariate_normal(mean=mu, cov=true_cov).rvs(1000)

with pm.Model() as m:
    bounded_halfnorm = pm.Bound(pm.HalfNormal, lower=0, upper=1)
    # need k * (k - 1) / 2 correlation coefficients, one per lower-triangular entry
    rhos = bounded_halfnorm('rhos', sigma=0.1, shape=k * (k - 1) // 2)
    cov = tt.eye(k)
    cov = tt.set_subtensor(cov[np.tril_indices(k, -1)], rhos)
    # reversing the index tuple swaps rows and columns, so this fills the upper
    # triangle with the transposed values (np.triu_indices orders the entries
    # differently and would not make cov symmetric)
    cov = tt.set_subtensor(cov[np.tril_indices(k, -1)[::-1]], rhos)
    cov = pm.Deterministic('cov', cov)
    mean = pm.Normal('mean', np.zeros(k), np.ones(k), shape=k)
    y_hat = pm.MvNormal('y', mu=mean, cov=cov, shape=(1000, k), observed=data)
    prior = pm.sample_prior_predictive(var_names=['cov', 'y'])
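Nothing here forces the resulting matrix to be positive definite either, so it is worth checking how often draws from this prior are actually valid. A minimal stand-alone sketch (plain NumPy stand-in for the truncated HalfNormal prior, not part of the model above):

import numpy as np

k = 4
rng = np.random.default_rng(0)

n_pd, n_draws = 0, 10_000
for _ in range(n_draws):
    # |N(0, 0.1)| approximates Bound(HalfNormal, 0, 1) with sigma=0.1,
    # since the upper bound at 1 is essentially never active
    rhos = np.abs(rng.normal(0, 0.1, size=k * (k - 1) // 2))
    C = np.eye(k)
    C[np.tril_indices(k, -1)] = rhos
    C[np.tril_indices(k, -1)[::-1]] = rhos
    n_pd += np.all(np.linalg.eigvalsh(C) > 0)

print(f'fraction positive definite: {n_pd / n_draws:.3f}')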