Casting error in matrix inversion

I have an issue with the deterministic and sampling functions. The sampler consists of three priors (a0, a1, a2) that act as the mean function for a simple finite element solver. The displacement obtained from the FE solve (uFE) is then the input to the MvNormal likelihood. There are several matrix operations in the FE function (FEpyMCsoln), and I can't pass a0, a1, a2 directly from the prior sampling into the matrix inversion, which is likely the cause of this error. Help is deeply appreciated.

import numpy as np
import pymc as pm

# Define FEpyMCsoln function (the Element class is defined elsewhere)
def FEpyMCsoln(x_c, mean):
    global_stiffness_results = []

    for j in range(len(x_c)):
        element = Element(centre=x_c[j], node_ids=[j], diffuse=mean[j], max_ids=len(x_c), ident=[j+1])
        global_stiffness = element.giveGlobalStiffness()
        global_stiffness_results.append(global_stiffness)

    # Combine global stiffness matrices to get the final global stiffness matrix
    A = np.sum(global_stiffness_results, axis=0)  # sum over elements
    Ainv = np.linalg.inv(A)
    BC = np.ones(len(A) - 1)        # right-hand side after applying the boundary condition
    ufe = np.dot(Ainv[1:, 1:], BC)  # displacements at the free nodes
    return ufe

# Assuming x_c and u are defined elsewhere
# Define perturbation
perturb = 0.05 * np.max(u[:,0]) 

# Generate perturbed data in NumPy
y = u + np.random.normal(loc=0.0, scale=perturb, size=u.shape)

# x_c is already a flat numpy array of element centres
x_cent = x_c

# Create covariance matrix
covariance = np.eye(len(x_cent)) * perturb**2  # variance = perturb^2, matching the noise scale above

# Define the model
with pm.Model() as model:
    # Priors
    a0 = pm.Normal('a0', mu=1., sigma=0.5)
    a1 = pm.Normal('a1', mu=0.25, sigma=0.5)
    a2 = pm.Normal('a2', mu=0.05, sigma=0.5)

    mean = np.exp(a0 + a1 * np.sin(np.pi * x_cent) + a2 * np.sin(2 * np.pi * x_cent))
    
    # Define FEpyMCsoln as a deterministic pm
    uFE = pm.Deterministic('u', FEpyMCsoln(x_c, mean))

    # Likelihood
    likelihood = pm.MvNormal('y', mu=uFE, cov=covariance, observed=y) 

    # MCMC sampling
    trace = pm.sample(1000, tune=1000)

The error is as follows.

UFuncTypeError: Cannot cast ufunc 'inv' input from dtype('O') to dtype('float64') with casting rule 'same_kind'

Hi Alfarisy!

You can’t use numpy functions in PyMC models. In some cases it works, but it’s a happy accident. The computational backend for PyMC is pytensor – see here for some discussion on how it all works. You should replace everything in your model with, for example, pt.linalg.inv.
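For example, here is a minimal sketch of that kind of replacement, with placeholder coordinates standing in for your x_cent and a diagonal stand-in for your stiffness matrix:

import numpy as np
import pymc as pm
import pytensor.tensor as pt

x_cent = np.linspace(0.0, 1.0, 5)  # placeholder coordinates

with pm.Model():
    a0 = pm.Normal("a0", mu=1.0, sigma=0.5)
    a1 = pm.Normal("a1", mu=0.25, sigma=0.5)
    a2 = pm.Normal("a2", mu=0.05, sigma=0.5)

    # pt.exp / pt.sin build a symbolic graph over the random variables,
    # instead of asking numpy to evaluate them eagerly
    mean = pt.exp(a0 + a1 * pt.sin(np.pi * x_cent) + a2 * pt.sin(2 * np.pi * x_cent))

    # symbolic linear algebra replaces np.linalg.inv
    A = pt.diag(mean)  # stand-in for your assembled stiffness matrix
    Ainv = pt.linalg.inv(A)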

Specifically, you are getting an error because you are passing a symbolic tensor (a PyMC variable) to numpy, which is expecting an array of numbers to compute on.
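You can reproduce this in isolation: as soon as a symbolic scalar ends up inside a numpy array, the array gets dtype('O'), and np.linalg.inv refuses it.

import numpy as np
import pytensor.tensor as pt

a = pt.dscalar("a")                  # a symbolic scalar, like a prior draw
A = np.array([[a, 0.0], [0.0, a]])   # numpy stores the symbols as dtype('O')
np.linalg.inv(A)                     # raises the same UFuncTypeError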

You will encounter some other challenges porting this code over, because of the loop and the Element object. You will need to either refactor your code or wrap it in a pytensor Op, as demonstrated here.


Hi Jesse,

Thank you for the recommendation to switch to pytensor; I have changed the numpy functions accordingly. I also see that a numpy array can be used as input to PyTensor, based on the basics provided in the link here.

I just wanted to validate my implementation: when I try to do the inversion with PyTensor, it gives "IndexError: too many indices for array" when calling pt.nlinalg.tensorsolve. One input comes from pt.sum (A) and the other from np.ones (a numpy array). Advice is appreciated, thank you in advance.

import numpy as np
import pytensor.tensor as pt

# nx, area, Ey are constants defined elsewhere
class ElementMC:
    def __init__(self, centre, node_ids, diffuse, max_ids, ident):
        self.centre = centre
        self.glob_ids = node_ids
        self.g_id = ident
        self.diffusivity = diffuse
        self.max_id = max_ids

    def stiffness(self):
        if self.glob_ids == [self.max_id-1]:
            return np.array([[1/nx*area*Ey, -1/nx*area*Ey], [-1/nx*area*Ey, 2/nx*area*Ey]])
        else:
            return np.array([[1/nx*area*Ey, -1/nx*area*Ey], [-1/nx*area*Ey, 1/nx*area*Ey]])

    def giveProjection(self):
        n_dofs = self.max_id + 1  # total number of dofs
        matR = np.zeros((n_dofs, 2))
        matR[self.glob_ids, 0] = 1.
        matR[self.g_id, 1] = 1.
        
        return matR

    def giveLocalStiffness(self):
        return self.diffusivity * self.stiffness()

    def giveGlobalStiffness(self):
        R = self.giveProjection()
        loc_stiff = self.giveLocalStiffness()
        global_stiffness = pt.nlinalg.matrix_dot(pt.nlinalg.matrix_dot(R, loc_stiff), R.T)
        return global_stiffness

#Define BC
BC = np.ones((len(x_c)-1))

# Define FEpyMCsoln function
def FEpyMCsoln(theta, x_c, BC):
    a0, a1, a2 = theta
    mean = a0 + a1 * np.sin(np.pi * x_c) + a2 * np.sin(2 * np.pi * x_c)
    global_stiffness_results = []

    # Assemble per-element global stiffness contributions using ElementMC above
    for j in range(len(x_c)):
        element = ElementMC(centre=x_c[j], node_ids=[j], diffuse=mean[j], max_ids=len(x_c), ident=[j+1])
        global_stiffness = element.giveGlobalStiffness()
        global_stiffness_results.append(global_stiffness)

    # Combine global stiffness matrices to get the final global stiffness matrix
    A = pt.sum(global_stiffness_results)  # Sum along the appropriate axis
    ufe = pt.nlinalg.tensorsolve(A[1:, 1:], BC)
    return ufe

# define a pytensor Op for our likelihood function
class uFElikelihood(pt.Op):

    itypes = [pt.dvector]  # expects a vector of parameter values when called
    otypes = [pt.dvector]  # outputs a vector (the FE displacements, used as mu)

    def __init__(self, uFE, x, BC):

        # add inputs as class attributes
        self.mu = uFE
        self.x = x
        self.BC = BC

    def perform(self, node, inputs, outputs):
        # the method that is used when calling the Op
        (theta,) = inputs  # this will contain my variables

        # call the FE solver
        myu = self.mu(theta, self.x, self.BC)

        outputs[0][0] = np.array(myu)  # output the FE displacements

# Define perturbation
perturb = 0.05 * np.max(u[:,0]) 

# Generate perturbed data in NumPy
y = u + np.random.normal(loc=0.0, scale=perturb, size=u.shape)

# x_c is already a flat numpy array of element centres
x_cent = x_c

# Create covariance matrix
covariance = np.eye(len(x_cent)) * perturb**2  # variance = perturb^2, matching the noise scale above

# Create the Op wrapping the FE solver
myu = uFElikelihood(FEpyMCsoln, x_c, BC)

# Define the model
with pm.Model() as model:
    # Priors
    a0 = pm.Normal('a0', mu=0., sigma=1.)
    a1 = pm.Normal('a1', mu=0., sigma=1.)
    a2 = pm.Normal('a2', mu=0., sigma=1.)

    #sig = pm.HalfNormal('sig', sigma = 1.)

    theta = pt.as_tensor_variable([a0, a1, a2])

    # Likelihood
    likelihood = pm.Normal('y', mu=myu(theta), sigma=perturb, observed=y)

    # MCMC sampling
    trace = pm.sample(1000, tune=1000)

I imagine you probably want just pt.linalg.solve, not tensorsolve. What are the dimensions of A and BC?
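Assuming A assembles to a square matrix and BC is a vector, something like this sketch (with a placeholder matrix for checking) should work in place of tensorsolve:

import numpy as np
import pytensor
import pytensor.tensor as pt

n = 4                                  # hypothetical number of free dofs
A = pt.dmatrix("A")                    # assembled (n+1, n+1) stiffness matrix
BC = np.ones(n)                        # a numpy right-hand side is fine here

ufe = pt.linalg.solve(A[1:, 1:], BC)   # solve the reduced system directly
solve_fn = pytensor.function([A], ufe)

print(solve_fn(np.eye(n + 1) + 0.1))   # evaluate on a placeholder matrix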

If you are going to wrap everything into an Op, you don’t need to use pytensor functions. Basically you have two choices:

  1. Express everything using only pytensor functions. This means you don’t need a custom Op anywhere, and you will automatically get access to gradients.
  2. Wrap your numpy code in a pytensor Op. In this case you don't need to use pytensor functions, because everything in the perform method is just python code. You will not have gradients in this case, and will have to use a non-NUTS sampler (see the sketch after this list).
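For option 2, the sampling call would look something like this (a sketch with a stand-in likelihood, since your Op provides no gradient):

import numpy as np
import pymc as pm

with pm.Model():
    a0 = pm.Normal("a0", mu=0.0, sigma=1.0)
    # stand-in likelihood; in your model mu would come from the custom Op
    pm.Normal("y", mu=a0, sigma=1.0, observed=np.zeros(3))
    # no gradients available, so pick a gradient-free step method
    trace = pm.sample(500, tune=500, step=pm.Slice())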

In either case, what you don't want to do is mix pytensor and numpy. Pick one and implement it.

That said, unless you have extremely complex control flow, exotic linear algebra routines, or something else highly specialized, you are better off picking option 1. 99% of the time people think they're in case 2, but actually they're in case 1. It looks like this is probably the case for you. I'd try dumping this object-oriented approach in favor of something more functional, and see if you can achieve the same results by using only pytensor functions everywhere (change all np. to pt.). Basically, break out the class methods into regular functions. You're not doing anything object-oriented anyway: you just make an Element, call a single method, then throw it away. You can get rid of the loop inside FEpyMCsoln by writing functions for single inputs, then using pt.vectorize to broadcast your function across a batch of inputs (it works the same as np.vectorize).
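As a rough sketch of that refactor, assuming nx, area, and Ey are the constants from your code:

import pytensor.tensor as pt

nx, area, Ey = 10, 1.0, 1.0  # placeholders for your constants
k = area * Ey / nx

def element_stiffness(diffusivity):
    # 2x2 local stiffness of a single element, as a plain function
    base = pt.as_tensor([[k, -k], [-k, k]])
    return diffusivity * base

# broadcast the single-element function over a batch of diffusivities,
# just like np.vectorize: scalar in, (2, 2) matrix out
batch_stiffness = pt.vectorize(element_stiffness, signature="()->(m,m)")

diffusivities = pt.dvector("diffusivities")
stacked = batch_stiffness(diffusivities)  # (n_elements, 2, 2) local blocks, ready for assembly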