Handling Likelihood Computation with Matrix Priors

Hi,

I’m working with a hierarchical model where the likelihood involves matrix operations. Specifically, the likelihood follows a Bernoulli distribution, with its parameter determined by the product of two matrices:

  1. One matrix is derived from a normal prior.
  2. The other is based on a categorical prior.

To define these matrices, I use np.zeros and pm.Data, and since I need concrete values to fill them in, I call .eval() on the random variables. The Bernoulli parameter is simply the product of these matrices.

The issue: the parameter is always zero, regardless of the dataset, making the likelihood zero and producing NaN values when computing WAIC. Additionally, I can’t print the individual parameters to debug the issue.

My questions:

  1. When I check the model parameters, the matrices don’t seem to be included, so I suspect the likelihood isn’t computed because I’m multiplying objects that are not part of the model. How can I ensure these matrices are recognized as part of the model?
  2. Is there an alternative way to define these priors as matrices solely for the likelihood calculation?

Thanks for your help!

import arviz as az
import matplotlib.pyplot as plt
import numpy as np
import pytensor.tensor as pt
from scipy.special import expit as logistic
import pymc as pm
 
def custom_sigmoid(x):
    return pm.math.exp(x) / (1 + pm.math.exp(x))
 
def create_model(data_matrix, num_nodes, num_blocks, a_alpha, b_alpha, a_tau, b_tau, mu_zeta, sigma_zeta):
    data_matrix = np.arange(num_nodes * num_nodes).reshape(num_nodes, num_nodes) 
    data_matrix = (data_matrix + data_matrix.T) / 2  
    data_vector = data_matrix.reshape(-1)  
    reconstructed_matrix = data_vector.reshape(num_nodes, num_nodes)  

    with pm.Model() as SBMmodel:
    
        alpha = pm.Gamma("alpha", alpha=a_alpha, beta=b_alpha)       
        omega = pm.Dirichlet("omega", a=np.ones(num_blocks) * (alpha / num_blocks), shape=num_blocks)          
        E_row = pm.Categorical('E_vector', p=omega, shape=num_nodes)           
        tau2 = pm.InverseGamma("tau2", alpha=a_tau, beta=b_tau)
        zeta = pm.Normal("zeta", mu=mu_zeta, sigma=np.sqrt(sigma_zeta))
        
        Theta_kl = pm.Normal('Theta_klfull', mu=zeta, sigma=np.sqrt(tau2),
                                shape=(num_blocks * (num_blocks + 1)) // 2)
        theta_matrix_np = np.zeros((num_blocks, num_blocks))
        Theta_matrix = pm.Data("ctheta_matrix", theta_matrix_np)
        index = 0
        for i in range(num_blocks):
            for j in range(i, num_blocks):
                theta_val = Theta_kl[index].eval()
                theta_matrix_np[i,j] = theta_val
                theta_matrix_np[j,i] = theta_val
                index += 1
        
        E_matrix_np = np.zeros((num_blocks, num_nodes))
        E_row_value = E_row.eval()
        for node_index in range(num_nodes):
            block_index = E_row_value[node_index]
            E_matrix_np[block_index, node_index] = 1
        
        E_matrix = pm.Data("E_matrix", E_matrix_np)
       
        
        def calculate_lambda(E_matrix, Theta_matrix):
            full_matrix = pt.dot(pt.dot(E_matrix.T, Theta_matrix), E_matrix)
            return full_matrix

        Lambda = pm.Deterministic('Lambda', calculate_lambda(E_matrix, Theta_matrix))
        Lambda1 = pm.Deterministic("Lambda1", custom_sigmoid(Lambda))
      
        Likelihood = pm.Bernoulli("Y_obs", p=Lambda1, observed=reconstructed_matrix)

    return SBMmodel

Three quick things:

  • you never want to use NumPy functions inside a PyMC model
  • you never want to call .eval() on a random variable inside a PyMC model definition
  • you never want to use loops inside a PyMC model definition

You should use PyTensor functions (which mirror the NumPy API) instead of NumPy functions, and vectorize any loops. Doing both removes the need to call .eval(), since everything stays in the symbolic computation graph.
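For instance, the assign-inside-a-loop pattern maps onto pt.set_subtensor like this (a minimal sketch, with pt being pytensor.tensor as in your imports):

import pytensor.tensor as pt

x = pt.zeros((3, 3))                   # symbolic counterpart of np.zeros((3, 3))
x = pt.set_subtensor(x[0, 1], 2.0)     # symbolic counterpart of x[0, 1] = 2.0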

There are a couple of catches (e.g., expressions like theta_matrix_np[j,i] = theta_val need to be rewritten using pt.set_subtensor()^1), but otherwise you should be able to adapt your code relatively straightforwardly.

^1 You may not need this once you vectorize this loop.
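Putting the pieces together, here is a minimal, untested sketch of how your model could be vectorized (assuming PyMC >= 5 with PyTensor; it keeps your variable names, swaps your custom_sigmoid for the built-in pm.math.sigmoid, and indexes Theta directly by block assignment instead of building a one-hot E_matrix):

import numpy as np
import pymc as pm
import pytensor.tensor as pt

def create_model(data_matrix, num_nodes, num_blocks, a_alpha, b_alpha,
                 a_tau, b_tau, mu_zeta, sigma_zeta):
    with pm.Model() as SBMmodel:
        alpha = pm.Gamma("alpha", alpha=a_alpha, beta=b_alpha)
        omega = pm.Dirichlet("omega", a=np.ones(num_blocks) * (alpha / num_blocks))
        E_row = pm.Categorical("E_vector", p=omega, shape=num_nodes)
        tau2 = pm.InverseGamma("tau2", alpha=a_tau, beta=b_tau)
        zeta = pm.Normal("zeta", mu=mu_zeta, sigma=np.sqrt(sigma_zeta))

        # Flat vector of upper-triangular block parameters, as before
        Theta_kl = pm.Normal("Theta_klfull", mu=zeta, sigma=pt.sqrt(tau2),
                             shape=(num_blocks * (num_blocks + 1)) // 2)

        # Scatter the flat vector into a symmetric matrix in one shot;
        # fixed NumPy index arrays are fine because they are constants.
        iu = np.triu_indices(num_blocks)
        Theta_upper = pt.set_subtensor(pt.zeros((num_blocks, num_blocks))[iu],
                                       Theta_kl)
        Theta_matrix = (Theta_upper + Theta_upper.T
                        - Theta_upper * pt.eye(num_blocks))  # don't double the diagonal

        # No one-hot E_matrix needed: index Theta directly, so that
        # Lambda[i, j] = Theta_matrix[E_row[i], E_row[j]]
        Lambda = pm.Deterministic("Lambda", Theta_matrix[E_row][:, E_row])
        Lambda1 = pm.Deterministic("Lambda1", pm.math.sigmoid(Lambda))

        # data_matrix should be a binary (0/1) adjacency matrix
        pm.Bernoulli("Y_obs", p=Lambda1, observed=data_matrix)

    return SBMmodel

Because everything stays symbolic, the matrices are now part of the model graph, the Bernoulli parameter actually responds to the priors, and no .eval() calls are needed. One caveat: E_vector is discrete, so NUTS can't sample it; PyMC will assign it a compound step, and you may eventually want to marginalize the block assignments out.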