Building a matrix using a distribution

Sorry, my code wasn’t clear. You don’t actually want to use pm.draw – that was just to illustrate that the compute graph I made worked as expected. Inside a pymc model you never want to evaluate any of the symbolic variables at all. By calling pm.draw in the middle there, you cut off the flow of information from the logp back to the random variables, so only a single eigenvalue is ever observed (the one associated with whatever value of g happened to be drawn).
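To make the distinction concrete, here’s a minimal sketch (the 2 * g transformation is just a stand-in for your eigenvalue computation):

import pymc as pm

with pm.Model():
    g = pm.Uniform("g", lower=-1, upper=1)

    # WRONG: pm.draw compiles and evaluates the graph right here, returning
    # a plain NumPy value. Everything built from g_fixed is a constant, so
    # the sampler can't propagate logp information back to g.
    g_fixed = pm.draw(g)

    # RIGHT: use g itself, so downstream quantities stay symbolic and the
    # dependency on g is visible to the sampler.
    mu_symbolic = 2 * g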

Most operations in pytensor, including linear algebra, are vectorized, so you don’t need a loop here – you can just think in vectors:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

eigh_vec = pt.vectorize(pt.linalg.eigh, '(m,m)->(m),(m,m)')

# omega and measuredenergi are the data arrays from your original question
with pm.Model():
    g = pm.Uniform('g', lower=-1, upper=1)
    n = omega.shape[0]

    # build a batch of n symmetric 2x2 matrices [[-omega, g], [g, omega]]
    A = pt.zeros((n, 2, 2))
    A = pt.set_subtensor(A[:, [0, 1], [1, 0]], g)   # g on the off-diagonals
    A = pt.set_subtensor(A[:, 0, 0], -omega)
    A = pt.set_subtensor(A[:, 1, 1], omega)

    A_eigvals, A_eigvecs = eigh_vec(A)
    A_eigvals = pt.sort(A_eigvals, axis=-1)         # guarantee ascending order

    energy_hat = pm.Normal("energy_hat", mu=A_eigvals[:, -1], observed=measuredenergi)
    idata_nuts = pm.sample(init='jitter+adapt_diag_grad', chains=8)
    idata_smc = pm.sample_smc(cores=1, chains=4)

Be aware that gradients for the general (non-symmetric) eigenvalue problem (eig) are NOT implemented – only gradients for eigh are. But you should use eigh here anyway, because it looks like A is symmetric (as long as that also holds in the 3x3 case). eigh is also supposed to return the eigenvalues and vectors in ascending order, but it didn’t seem to in my testing, which might be a bug – hence the pt.sort above.
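A quick way to check the ordering (and why the pt.sort is a cheap safeguard), using the eigh_vec defined above and hypothetical values:

import numpy as np
import pytensor.tensor as pt

# hypothetical single-matrix batch: eigenvalues of [[0, 2], [2, 0]] are -2 and 2
A_test = np.array([[[0.0, 2.0],
                    [2.0, 0.0]]])
vals, vecs = eigh_vec(pt.as_tensor(A_test))
print(vals.eval())                    # order not guaranteed in practice
print(pt.sort(vals, axis=-1).eval())  # always ascending: [[-2.  2.]]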

Unfortunately, we haven’t vectorized eigh yet – vectorized linear algebra functions are quite new – so I had to vectorize it myself, as you can see above. It’s quite painless. The signature "(m,m)->(m),(m,m)" tells pytensor what the core dimensions are: the input is an (m, m) matrix, and the outputs are a vector of length (m) and a matrix of shape (m, m). Once it’s vectorized, you’re allowed to add as many batch dimensions as you like to the left of the core dims. In our case, A is (100, 2, 2), so m=2 and the batch dim is the first one, of size 100. The eigenvalue computation is automatically broadcast across the batch dimension.
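To see the batching concretely, here’s a small sketch using the eigh_vec from above (the (100, 2, 2) shape matches our A):

import numpy as np

A_batched = np.zeros((100, 2, 2))   # batch dim 100, core dims (2, 2)
eigvals, eigvecs = eigh_vec(A_batched)
print(eigvals.eval().shape)         # (100, 2)     <- core signature (m)
print(eigvecs.eval().shape)         # (100, 2, 2)  <- core signature (m,m)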

Anyway, what is a bit painful is sampling the model. The eigenvalues depend on g only through g², so +g and -g fit the data equally well and the posterior is strongly bimodal, which is rough on NUTS. I had much better success with SMC. Here’s idata_nuts:

And here’s idata_smc:

If, in the general case, you don’t have symmetric A matrices, you must use eig, not eigh.
:warning: eigh does NOT check that the input matrix is actually symmetric :warning: It will just happily return incorrect eigenvalues if you give it a non-symmetric matrix.
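You can see this with plain NumPy, which pytensor’s eigh calls under the hood (the matrix values here are just an illustration):

import numpy as np

M = np.array([[1.0, 5.0],
              [2.0, 1.0]])        # not symmetric
print(np.linalg.eigvals(M))       # correct: 1 ± sqrt(10) ≈ [ 4.162 -2.162]
print(np.linalg.eigvalsh(M))      # only reads one triangle: [-1.  3.] – silently wrong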

The nice thing about SMC working well here is that it doesn’t need gradients at all. Since eig doesn’t have gradients implemented, gradient-based samplers like NUTS are off the table for non-symmetric matrices, so you’ll have to use SMC in that case.
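For example, here’s a hedged sketch of the non-symmetric variant (the asymmetric 2 * g coupling and the placeholder data are assumptions, just to make it self-contained):

import numpy as np
import pymc as pm
import pytensor.tensor as pt

# eig can be vectorized with the same signature as eigh
eig_vec = pt.vectorize(pt.linalg.eig, '(m,m)->(m),(m,m)')

omega = np.linspace(0.1, 1.0, 100)         # placeholder data
measuredenergi = np.sqrt(omega**2 + 0.25)  # placeholder observations

with pm.Model():
    g = pm.Uniform('g', lower=-1, upper=1)
    n = omega.shape[0]
    A = pt.zeros((n, 2, 2))
    A = pt.set_subtensor(A[:, 0, 1], g)      # upper off-diagonal
    A = pt.set_subtensor(A[:, 1, 0], 2 * g)  # lower off-diagonal: A is NOT symmetric
    A = pt.set_subtensor(A[:, 0, 0], -omega)
    A = pt.set_subtensor(A[:, 1, 1], omega)

    A_eigvals, _ = eig_vec(A)
    A_eigvals = pt.sort(pt.real(A_eigvals), axis=-1)  # eig may return complex values

    pm.Normal("energy_hat", mu=A_eigvals[:, -1], observed=measuredenergi)
    idata_smc = pm.sample_smc(cores=1, chains=4)      # gradient-free, so eig is fine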
