Hello all,
I’m fairly new to PyMC. The problem I have is the following: I am building a PyMC model that has a matrix of random variables (weights) mapping a vector of some input dimension, d, to the output (target) dimension, N.
In particular, I know that some of these weights must be zero a priori, and I know which ones. So my question is this: is there a “best” way to set these particular elements to 0 within the matrix of random variables? I will explain my concerns after the following toy model.
import numpy as np
import pymc as pm
import pytensor as pt
import arviz as az
# toy example
d = 5
N = 5
N_data = 100
zero_indices = (np.array([0,0,3]), np.array([1, 2, 4]))
x = np.random.randn(N_data, d)
weights = np.random.randn(N, d)
weights[zero_indices] = 0
y_gt = (weights @ x.T).T # shape (N_data, N)
print(weights.round(1))
>>> out: [[ 0.4  0.   0.   0.1  1.1]
 [-2.  -0.7 -0.8 -0.3  2.6]
 [-0.2 -0.5 -1.   0.9  0.9]
 [-0.6 -0.1  0.4  0.9  0. ]
 [ 0.8  1.5  0.2 -1.1 -0.8]]
with pm.Model() as model:
    fit_weights = pm.Normal('fit_weights', mu=0, sigma=1, shape=(N, d))
    # set weights at indices to zero here
    # fit_weights = pt.tensor.subtensor.set_subtensor(fit_weights[zero_indices], 0)
    out = pm.Normal('y_obs', mu=pm.math.dot(fit_weights, x.T).T, sigma=0.1,
                    observed=y_gt,
                    shape=(N_data, N))
    trace = pm.sample()
So my main concern about just using pt.set_subtensor is that the random variables replaced by 0 are still instantiated, which will lead to inefficiencies when sampling: they are still sampled at every iteration despite having no involvement in the likelihood. Is there some way to keep this convenient matrix declaration, or do I have to do something like create a pt.zeros matrix and then set each free element of the weights individually as a pm.Normal (which seems far less convenient)? I’ve sketched what I mean below.
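For concreteness, here is an untested sketch of the alternative I have in mind (the names free_mask, n_free, and model_alt are just mine for illustration, and this reuses x, y_gt, and zero_indices from the toy code above): sample only the entries allowed to be non-zero as a flat vector, then scatter them into a constant zero matrix, so the fixed entries never become random variables.

# untested sketch: only the free entries are random variables
free_mask = np.ones((N, d), dtype=bool)
free_mask[zero_indices] = False   # mark the a-priori-zero entries
n_free = int(free_mask.sum())     # number of weights actually sampled

with pm.Model() as model_alt:
    # flat vector containing only the non-zero weights
    free_weights = pm.Normal('free_weights', mu=0, sigma=1, shape=n_free)
    # scatter the free weights into a constant zero matrix
    W = pt.tensor.set_subtensor(pt.tensor.zeros((N, d))[free_mask.nonzero()],
                                free_weights)
    W = pm.Deterministic('fit_weights', W)
    pm.Normal('y_obs', mu=pm.math.dot(W, x.T).T, sigma=0.1,
              observed=y_gt, shape=(N_data, N))

This avoids the loop over individual pm.Normal elements, but it loses the simple (N, d)-shaped declaration, which is why I’m asking whether there is a better way.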
Cheers,
Nolan