Hello all,
I’m fairly new to PyMC. The problem I have is the following: I am building a PyMC model that has a matrix of random variables (weights) mapping a vector of some input dimension, d, to the output (target) dimension, N.
In particular, I know that some of these weights must be zero a priori, and I know which ones. So my question is this: is there a “best” way to set these particular elements to 0 within the matrix of random variables? I will explain my concerns after the following toy model.
import numpy as np
import pymc as pm
import pytensor as pt
import arviz as az
# toy example
d = 5
N = 5
N_data = 100
zero_indices = (np.array([0,0,3]), np.array([1, 2, 4]))
x = np.random.randn(N_data, d)
weights = np.random.randn(N, d)
weights[zero_indices] = 0
y_gt = (weights @ x.T).T # shape (N_data, N)
print(weights.round(1))
>>> out: [[ 0.4  0.   0.   0.1  1.1]
 [-2.  -0.7 -0.8 -0.3  2.6]
 [-0.2 -0.5 -1.   0.9  0.9]
 [-0.6 -0.1  0.4  0.9  0. ]
 [ 0.8  1.5  0.2 -1.1 -0.8]]
with pm.Model() as model:
    fit_weights = pm.Normal('fit_weights', mu=0, sigma=1, shape=(N, d))
    # set weights at indices to zero here
    # fit_weights = pt.tensor.subtensor.set_subtensor(fit_weights[zero_indices], 0)
    out = pm.Normal('y_obs', mu=pm.math.dot(fit_weights, x.T).T, sigma=0.1,
                    observed=y_gt,
                    shape=(N_data, N))
    trace = pm.sample()
So my main concern about just using pt.set_subtensor is that the random variables replaced by 0 are still instantiated, which will lead to inefficiencies when sampling: they are still sampled at every iteration despite having no involvement in the likelihood. Is there some way to keep this convenient matrix declaration, or do I have to do something like create a pt.zeros matrix and then set each free element of the weights individually as a pm.Normal (which seems far less convenient)? I’ve sketched what I mean below.
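For concreteness, here is an untested sketch of the alternative I have in mind (the names free_mask, n_free, and model_alt are just mine for illustration, and this reuses x, y_gt, and zero_indices from the toy code above): sample only the entries allowed to be non-zero as a flat vector, then scatter them into a constant zero matrix, so the fixed entries never become random variables.

# untested sketch: only the free entries are random variables
free_mask = np.ones((N, d), dtype=bool)
free_mask[zero_indices] = False   # mark the a-priori-zero entries
n_free = int(free_mask.sum())     # number of weights actually sampled

with pm.Model() as model_alt:
    # flat vector containing only the non-zero weights
    free_weights = pm.Normal('free_weights', mu=0, sigma=1, shape=n_free)
    # scatter the free weights into a constant zero matrix
    W = pt.tensor.set_subtensor(pt.tensor.zeros((N, d))[free_mask.nonzero()],
                                free_weights)
    W = pm.Deterministic('fit_weights', W)
    pm.Normal('y_obs', mu=pm.math.dot(W, x.T).T, sigma=0.1,
              observed=y_gt, shape=(N_data, N))

This avoids the loop over individual pm.Normal elements, but it loses the simple (N, d)-shaped declaration, which is why I’m asking whether there is a better way.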
Cheers,
Nolan