Building a matrix using a distribution

Hello!
I have a model where I want to define a matrix and then take its eigenvalues.
In the model I have defined a uniform distribution c. I want to put this in a matrix along with some other values and then take the matrix eigenvalues as mu.
I have tried this, but NumPy doesn't seem to allow me to do it directly. Does anyone have any pointers?
Thanks in advance.

You can assign values to a tensor with pytensor.tensor.set_subtensor, and compute eigenvalues (and vectors) with pytensor.tensor.linalg.eig.

For example, the following code sets some values of a 2x2 matrix then computes the eigenvalues:

import pymc as pm
import pytensor.tensor as pt
import numpy as np

with pm.Model():
    a = pm.Uniform('a', lower=0, upper=1, size=(2,))
    b = pm.Normal('b', 0, 1)
    A = pt.zeros((2, 2))
    A = pt.set_subtensor(A[np.diag_indices(2)], a)
    A = pt.set_subtensor(A[1, 0], b)
    
    vals, vecs = pt.linalg.eig(A)  # eig returns eigenvalues first, then eigenvectors

pm.draw([A, vals])

Thank you very much! It works, though my code has to do this many times in a loop, changing the value of A[0,0] to -c and A[1,1] to c, and I realised it goes slower and slower the longer the loop runs. Is pt.set_subtensor slow compared to NumPy, and is there a way to improve it?

Can you post some code, please? I don't really understand what you mean.

Absolutely! So this first code works as I hoped it would.

import pymc as pm
import pytensor.tensor as pt
import numpy as np
import matplotlib.pyplot as plt
import arviz as az
measuredenergi=np.array([1.22417741,1.12121628,1.13530525,1.06916056,0.99403318,1.13906148
,0.9963757,1.03662053,0.96616282,0.97534615,0.92742552,0.89681969
,0.90942083,0.91659131,0.8283577,0.90655703,0.91469133,0.81667963
,0.84075108,0.73064756,0.78810323,0.75291322,0.76878005,0.69239133
,0.61189092,0.7441611,0.64737675,0.67928107,0.60894499,0.5628143
,0.64304116,0.63722566,0.65166106,0.68788886,0.61557561,0.56532491
,0.59084165,0.5206008,0.54341614,0.50496302,0.63723663,0.53319507
,0.52496636,0.57126947,0.5196219,0.48941256,0.54389447,0.576038
,0.49522662,0.54707526,0.459688,0.46440294,0.51040275,0.47803426
,0.4977562,0.52616195,0.56063771,0.58134653,0.56561023,0.54017933
,0.47294288,0.55847432,0.45539851,0.55623116,0.57084535,0.62619433
,0.59952174,0.63463671,0.66071148,0.66997596,0.58675376,0.657091
,0.72096573,0.74726991,0.67122209,0.71615178,0.70563827,0.7470409
,0.86356687,0.75571122,0.82181138,0.7863867,0.74713034,0.92471248
,0.90326655,0.88466562,0.89580531,0.94574585,0.93914835,0.94785712
,0.96314902,0.98087637,0.9635514,0.99606092,0.89498034,1.06921987
,1.04185664,1.03358329,1.16706843,1.1577956,1.09599788])


omega = np.linspace(-1,1,101)

with pm.Model() as model:
    g=pm.Uniform('g',lower=-0.8, upper=0.8)
    likelihood = pm.Normal("y", mu=np.sqrt(omega**2+g**2), observed=measuredenergi)
    trace = pm.sample(draws=4000, chains=4, cores=4)
az.plot_trace(trace)
plt.show()

Now what I would like to do is generalize mu=np.sqrt(omega**2+g**2) to the eigenvalues of a 3x3 matrix, but I'm starting with the 2x2 case, which should give the same values as the first code.
This is what I’ve written.


with pm.Model() as model:
    g=pm.Uniform('g',lower=-1, upper=1)
    A = pt.zeros((2, 2))
    A = pt.set_subtensor(A[1, 0], g)
    A = pt.set_subtensor(A[0, 1], g)
    valarray = np.array([0,0])

    for a in range(101):
        A = pt.set_subtensor(A[0, 0], -omega[a])
        A = pt.set_subtensor(A[1, 1], omega[a])
        values, vecs = pt.linalg.eig(A)#Calculates the eigenvalues and dumps the eigenvectors.
        values=pm.draw([values])
        values=values[0]
        if values[0]<0:
            values = np.flip(values)
        valarray = np.vstack([valarray,values]) # Stacks the eigenvalue pairs row by row: row a holds the two eigenvalues for omega[a].
    valarray=valarray[1:,:]
    valarray=valarray[:,0]
    print(valarray)
    likelihood = pm.Normal("y", mu=valarray, observed=measuredenergi)
    trace = pm.sample(draws=4000, chains=4, cores=4)
az.plot_trace(trace)
plt.show()

Now what I get out is a flat distribution, as if g has no effect on the answer. What I was talking about before is that it takes quite a bit more time, but as long as it works that's no problem.
Thanks again!

Sorry, my code wasn’t clear. You don’t actually want to use pm.draw – this was just to illustrate to you that the compute graph I made worked as expected. In a pymc model, you never want to evaluate any of the symbolic values at all. By using pm.draw in the middle there, you’re cutting off the flow of information from the logp to the random variables, so only a single eigenvalue is ever observed (the one associated with the value of g that happened to be drawn).
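
To make the distinction concrete, here is a minimal sketch (not part of the model above; the variable names are purely illustrative) showing that pm.draw hands back a plain NumPy value, while everything the sampler should reason about has to stay symbolic:

import pymc as pm
import pytensor.tensor as pt

with pm.Model():
    g = pm.Uniform("g", lower=-1, upper=1)
    symbolic = pt.sqrt(g**2 + 1.0)  # still a pytensor graph node that depends on g
    concrete = pm.draw(symbolic)    # a fixed NumPy value; nothing built from it
                                    # can be traced back to g anymore

print(type(symbolic))  # a pytensor TensorVariable
print(type(concrete))  # a plain NumPy array/scalar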

Most operations in pytensor, including linear algebra, are vectorized, so you don't need to loop here; you can just think in vectors. (That most likely also explains the slowdown you noticed: every pt.set_subtensor and pm.draw call inside a Python loop keeps growing and recompiling the symbolic graph, so each iteration gets more expensive.)

eigh_vec = pt.vectorize(pt.linalg.eigh, '(m,m)->(m),(m,m)')

with pm.Model():
    g = pm.Uniform('g',lower=-1, upper=1)
    n = omega.shape[0]
    A = pt.zeros((n, 2, 2))
    A = pt.set_subtensor(A[:, 0, 1], g)  # off-diagonal elements hold g
    A = pt.set_subtensor(A[:, 1, 0], g)
    A = pt.set_subtensor(A[:, 0, 0], -omega)
    A = pt.set_subtensor(A[:, 1, 1], omega)
    A_eigvals, A_eigvecs = eigh_vec(A)
    A_eigvals = pt.sort(A_eigvals, axis=-1)
    
    energy_hat = pm.Normal("energy_hat", mu=A_eigvals[:, -1], observed=measuredenergi)
    idata_nuts = pm.sample(init='jitter+adapt_diag_grad', chains=8)
    idata_smc = pm.sample_smc(cores=1, chains=4)
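
As a quick side check (plain NumPy, outside the model, with an arbitrary test value for g): the symmetric 2x2 matrix [[-omega, g], [g, omega]] has trace zero and determinant -(omega**2 + g**2), so its eigenvalues are +/- sqrt(omega**2 + g**2), and the top one reproduces the mu of the first model.

import numpy as np

g_test = 0.5                                        # arbitrary illustrative value
A_test = np.array([[[-w, g_test], [g_test, w]] for w in omega])
eigvals = np.linalg.eigvalsh(A_test)                # ascending along the last axis
np.testing.assert_allclose(eigvals[:, -1], np.sqrt(omega**2 + g_test**2))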

Be aware that gradients for the general (non-symmetric) eigenvalue problem (eig) are NOT implemented. Only gradients for eigh are implemented, but you should use eigh here anyway, because it seems like A is symmetric (as long as that stays true in the 3x3 case). eigh is also supposed to return the eigenvalues and vectors in ascending order, but it didn't seem to in my testing, which might be a bug – hence the explicit pt.sort above.

Unfortunately, we haven't vectorized eigh yet – vectorized linear algebra functions are quite new. So I had to vectorize it myself, and you can see how that looks; it's quite painless. The signature "(m,m)->(m),(m,m)" tells pytensor what the core dimensions are: the input will be an (m,m) matrix, and it will output a vector of length m and a matrix of shape (m,m). Once it's vectorized, you're allowed to add as many batch dimensions as you like to the left of the core dims. In our case, A is (101, 2, 2), so m=2 and the batch dim is the first one, of size 101. The eigenvalue computation is automatically broadcast across the batch dimension.
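
If it helps, here is a small, self-contained shape check of the vectorized eigh (the variable names are just for illustration): feed it a stack of symmetric matrices with shape (101, 2, 2) and you get back eigenvalues of shape (101, 2) and eigenvectors of shape (101, 2, 2).

import numpy as np
import pytensor.tensor as pt

eigh_vec = pt.vectorize(pt.linalg.eigh, '(m,m)->(m),(m,m)')

A_batch = pt.tensor3('A_batch')        # symbolic input of shape (batch, m, m)
w, v = eigh_vec(A_batch)

X = np.random.randn(101, 2, 2)
X = X + X.transpose(0, 2, 1)           # symmetrize each 2x2 matrix
print(w.eval({A_batch: X}).shape)      # (101, 2)    -- eigenvalues per matrix
print(v.eval({A_batch: X}).shape)      # (101, 2, 2) -- eigenvectors per matrix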

Anyway, what is a bit painful is sampling the model – the strong multimodality makes it a bit rough on NUTS. I had much better success with SMC. [Trace plots comparing idata_nuts and idata_smc were shown here.]

If in the general case you don’t have symmetric A matrices, you must use eig, not eigh.
:warning: eigh does NOT check that the input matrix is actually symmetric :warning:. It will just happily return incorrect eigenvalues if you give it a non-symmetric matrix.
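
A tiny demonstration of that warning (plain NumPy for brevity; pt.linalg.eigh calls the same routine under the hood):

import numpy as np

B = np.array([[0.0, 5.0],
              [1.0, 0.0]])       # not symmetric
print(np.linalg.eig(B)[0])       # correct eigenvalues: roughly +/-2.236, i.e. +/-sqrt(5)
print(np.linalg.eigh(B)[0])      # silently wrong: [-1., 1.], computed as if B were the
                                 # symmetric matrix [[0, 1], [1, 0]] (lower triangle only)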

The upside of SMC working well here is that eig doesn't have gradients implemented, so you'd have to fall back on SMC in that case anyway.


Yeah, I noticed the draw did that but didn't realize how I could go about not using it, haha, and this solved it. Now I understand how it works. Thank you so much, you are amazing! It is so cool that it is vectorized so that I don't need the loop. Huge thanks, king!
