Help formulating a complex matrix multiplication in pymc

For a problem I’m working on I want to multiply complex matrices together so that I can then for the simplest case is slice out a specific element of this final complex matrix and do the square modulus of that element. However the catch is that PyMC (well specifically Pytensor) models don’t like complex numbers (for good reason) and so I need to circumvent the complex numbers entirely.

So in essence say I want U_f=U_1 U_2 U_3, where U are complex matrices. I know the expressions for U_i in advance and so I have A_i and B_i for U_i=A_i + j B_i and what I want out is A_f and B_f without actually creating any complex numbers.

My suspicion is that since U_f=(A_1+j B_1)(A_2+j B_2)(A_3+j B_3) that I can make use of Pascal’s triangle/binomial theorem kind of logic on just (A_1+B_1)(A_2+B_2)(A_3+B_3) since summing terms with odd numbers of B, e.g. A_1 A_2 B_3, will give B_f and the rest of the terms will sum to A_f, but this could still be very awkward to implement. Especially since I’m unsure of any readily available binomial theorem type of functions (most likely due to the general non commutativity of matrices).

Is there an easier way of going about this that jumps out at anyone? Thanks in advance.

CC @aseyboldt

I might be missing something, but couldn’t you implement it like this?

def complex_matmult(A, B):
    """"Given `A = A + i * Ai` and `B = B + i * Bi` compute the real and imaginary comonents of `AB`"""
    A, Ai = A
    B, Bi = B
    C = A @ B - Ai @ Bi
    Ci = A @ Bi + Ai @ B
    return C, Ci

U1 = np.random.randn(N, N)  # or some pytensor thing
U1i = np.random.randn(N, N)
U1 = (U1, U2)
U2 = (...., ...)
U3 = (..., ...)

U1U2 = complex_matmult(U1, U2)
Uf = complex_matmult(U1U2, U3)
Uf_real, Uf_imag = Uf
1 Like

That should probably work yeah, half of what I’m keeping in mind is generalising to an arbitrary number of U_i but I can just lump that into a for loop and work through like you’ve presented. Just me overcomplicating a simple problem lol :man_facepalming:

Many thanks!

I could do with an additional prompt on trying to get this code working properly. The solution for multiplying these matrices together works as intended, and my implementation looks like so:

def construct_BS_pymc_real(RV):
    #return pt.stack([pm.math.sqrt(RV), 1j*pm.math.sqrt(1-RV), 1j*pm.math.sqrt(1-RV), pm.math.sqrt(RV)]).reshape((2,2))
    return pt.stack([pm.math.sqrt(RV), 0, 0, pm.math.sqrt(RV)])
    
def construct_BS_pymc_imag(RV):
    #return pt.stack([pm.math.sqrt(RV), 1j*pm.math.sqrt(1-RV), 1j*pm.math.sqrt(1-RV), pm.math.sqrt(RV)]).reshape((2,2))
    return pt.stack([0, pm.math.sqrt(1-RV), pm.math.sqrt(1-RV), 0])
    
def construct_PS_pymc_real(RV):
    #return pt.stack([pm.math.exp(1j*RV/2),0,0,pm.math.exp(-1j*RV/2)]).reshape((2,2))
    return pm.math.sin(RV)
    
def construct_PS_pymc_imag(RV):
    #return pt.stack([pm.math.exp(1j*RV/2),0,0,pm.math.exp(-1j*RV/2)]).reshape((2,2))
    return pm.math.cos(RV)
    
def complex_matmult_pymc(A, B):
    """"Given `A = A + i * Ai` and `B = B + i * Bi` compute the real and imaginary comonents of `AB`"""
    A, Ai = A
    B, Bi = B
    C = pt.dot(A,B) - pt.dot(Ai, Bi)
    Ci = pt.dot(A, Bi) + pt.dot(Ai, B)
    return C, Ci
    
def circuit_list_to_matrix_pymc(feature):
    #Had to be quirky in construct_BS_pymc since set_subtensor doesn't like replacing 2D tensors or rows or even element by element (breaks down after one element) so had to return a flat tensor and reassign half of the array and then the other half.
    index=feature[2]-1
    if feature[0]=="BS":
        real=pt.eye(n=m,m=m)
        real=pt.set_subtensor(pt.flatten(real)[index:index+2],construct_BS_pymc_real(feature[1])[0:2])
        real=pt.set_subtensor(pt.flatten(real)[index+2:index+4],construct_BS_pymc_real(feature[1])[2:4])
        real=real.reshape((2,2))

        imag=pt.eye(n=m,m=m)
        imag=pt.set_subtensor(pt.flatten(imag)[0:2],construct_BS_pymc_imag(feature[1])[0:2])
        imag=pt.set_subtensor(pt.flatten(imag)[2:4],construct_BS_pymc_imag(feature[1])[2:4])
        imag=imag.reshape((2,2))
        #print(real.eval())
        #print(imag.eval())
        matrix=[(real,imag)]*N #To copy it N times to help matrix dotting across to get final U
        #print(matrix)
    if feature[0]=="PS":
        matrix=[]
        for _ in range(N):
            real=pt.eye(n=m,m=m)
            real=pt.set_subtensor(real[index,index],construct_PS_pymc_real(feature[1][0][_]))

            imag=pt.eye(n=m,m=m)
            imag=pt.set_subtensor(imag[index,index],construct_PS_pymc_imag(feature[1][0][_]))

            matrix.append((real,imag))
    #print(feature[0])
    #print(matrix)
    return matrix #return output matrix

with pm.Model():

    """
    Free parameters to infer
    """
    eta=pm.TruncatedNormal("eta",mu=0.5,sigma=0.05,lower=0.0,upper=1.0,initval=0.5)  #array of 
    theta=pm.Deterministic("theta",2*pt.arccos(pt.sqrt(eta)))
    #priors for conciseness
    #sd=a_dev
    a=pm.TruncatedNormal("a", mu=0, sigma=a_dev,lower=-np.pi,upper=np.pi,initval=0)  #array of priors for conciseness
    #sd=0.07
    b=pm.Normal("b", mu=0.7, sigma=0.7,initval=0.7) #array of priors for conciseness
       
    #Volt=pm.Normal("Volt",mu=V_2_dist,sigma=0.1)
    Volt=pm.Deterministic("Volt",pt.as_tensor(V))

    #below expression breaks down when there is just 1 a and b
    #phi=pm.Deterministic("phi",a[:,None]+b[:,None]*pm.math.sqr(Volt))
    """
    phi describes the different phase shifts for different experiments
    """
    phi=pm.Deterministic("phi",a+b*pm.math.sqr(Volt))

    circuit_list=[["BS",eta,1],["PS",phi,1]] #Need to reverse this order for it to be correct
        
    U_list = np.array([circuit_list_to_matrix_pymc(feature) for feature in circuit_list])
    #U_list is an array that I need to dot across but dot via complex_matmul function that I've defined
    #U=pt.nlinalg.matrix_dot(U_list) #Doesn't work raw since PS is a list of N matrices for the N experiments
        
        
    U=[] #To store final mode Unitaries: U=[(U1,U1i),(U2,U2i),...,(UN,UNi)]

    for i in range(N):
        rval = U_list[:,i][0]
        for a in U_list[:,i][1:]:
            rval=complex_matmult_pymc(rval,a)
            U.append(rval)
        
    """
    Indexing specific elements from each array
    """
    Utopreal=[elem[0][0][0] for elem in U] #top left element of each real matrix in U
    Utopimag=[elem[1][0][0] for elem in U] #top left element of each imag matrix in U
    Ubotreal=[elem[0][1][0] for elem in U] #bottom left element of each real matrix in 
    Ubotimag=[elem[1][1][0] for elem in U] #bottom left element of each imag matrix in U

    """
    Big slowdown when attempting to call sampling, text indicating initialisation doesn't even show up
    """
    #P=pm.math.stack([pt.nlinalg.norm(pm.math.stack([Utopreal,Utopimag],axis=-1),ord='fro',axis=-1)**2,pt.nlinalg.norm(pm.math.stack([Ubotreal,Ubotimag],axis=-1),ord='fro',axis=-1)**2])
    P=pm.math.stack([pm.math.sqr(Utopreal)+pm.math.sqr(Utopimag),pm.math.sqr(Ubotreal)+pm.math.sqr(Ubotimag)],axis=-1)
    #print(P.eval()) #Works as expected
        
    likelihood=pm.Multinomial("likelihood",n=C,p=P,shape=(N,m),observed=data)
        
    trace=pm.sample(draws=int(1e3), chains=4, cores=cpucount, return_inferencedata=True)

The problem is that when i run this code (full file also attached) it takes forever to initialise, the sampler on sampling being called which I’ve heard is because ‘lazy computation’ is done where it will only actually do computation when sampling is called. Now as to why it takes forever to initialise, I’m assuming it could be because of these repeated for loops and arrays that are making the graphs messy? I’ve drummed up some code to use pytensor.scan in case that would be more appropriate for this recursive complex matrix multiplication from the earlier post but I’m not sure on how to implement within this model block:

k = pt.iscalar("k")
A=pt.matrix("A")
Ai=pt.matrix("Ai")
B=pt.matrix("B")
Bi=pt.matrix("Bi")

def mat_compmult(A,Ai,B,Bi):
    #A, Ai = A
    #B, Bi = B
    C = pt.dot(A,B) - pt.dot(Ai, Bi)
    Ci = pt.dot(A, Bi) + pt.dot(Ai, B)
    return C,Ci

# Symbolic description of the result
result, updates = pytensor.scan(fn=mat_compmult,
                            outputs_info=[A,Ai],
                            non_sequences=[B,Bi],
                            n_steps=k)

final_result = result[-1]

power = pytensor.function(inputs=[A,Ai, B,Bi,k], outputs=final_result,
                      updates=updates)

print(power(np.ones((2,2)), np.ones((2,2)), np.ones((2,2)), np.ones((2,2)), 2))

Any help would be appreciated!

Clarification on context:
To clarify what is going on in the model block {eta,a,b} are parameters I want to infer. ‘phi’ comes out as an array since my experimental setup has me run N experiments so I run N different voltages (stored in the array ‘Volt’). So what I do is using circuit list and circuit_list_to_matrix_pymc I iterate through circuit_list to get U_list, and matrix dot across to get the final matrix for each of the N experiments. As per the earlier post these matrices are complex so at each step of the process the matrix is stored as a tuple of the real components matrix and the imaginary components and making use of the complex matmul for this dotting across to get the final result being the real and imaginary components of the final matrix for each of the N experiments. Finally for this specific case i can get the probabilities I need or the multinomial likelihood for a given experiment by slicing out the left column of the complex matrix and square modulus’ing elementwise, so for the N experiments and having only the real and imag components I opted to slice out the elements as individual arrays and then squaring and adding the appropriate matrices. My debugging has this check out as expected. This has been quite tricky to word, and so I can whip up some slides to illustrate what is going on more clearly.

Polished.py (14.1 KB)