Build model with varying combinations of RVs

I am attempting to build a model from data with a large number of RVs (beta). However, not every variable is needed in the linear combination for every observation. Rather than setting the unused variables to 0 for a particular observation, I'd rather leave them out of the model construction entirely; my intuition is that this could help with both code complexity and computational efficiency (though I could be wrong). Here is roughly what I'm attempting, but I'm having trouble setting the right types in the array. I'd love some guidance on 1) whether this is possible and 2) how I might go about successfully setting it up.

with pm.Model(coords=coords) as model:
    beta = pm.Normal("beta", mu=0, sigma=1, dims='idx')

mu_list = []

for _id, group in id_groups:
    # some computation here off the dataset yields idx and sign...
    # sign is set to 1 or -1 depending on the group
    # Sum contributions of the 10 elements for this ID
    mu_id = (beta[idx] * sign).sum()
    mu_list.append(mu_id)

# Combine all contributions for each ID; np.array() can't hold
# symbolic PyMC/PyTensor tensors, so stack them symbolically instead
mu_combined = pm.math.stack(mu_list)

with model:
    sigma = pm.HalfNormal("sigma", sigma=1)
    value_obs = pm.Normal("value_obs", mu=mu_combined, sigma=sigma, observed=grouped_means)

At first blush this looks like a case of premature optimization, because BLAS routines are some of the most heavily optimized, highest-performance code out there. To beat a dense X @ beta, you really need to be in a special situation. Do you have a working model that does it the "wasteful" way and has unacceptably slow performance that you can share?
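To make the comparison concrete, here is a minimal NumPy sketch (sizes and the per-ID idx/sign arrays are made up for illustration) showing that the per-ID loop and the "wasteful" dense design matrix compute the same mu vector:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 IDs, 20 betas, 10 signed terms per ID
n_ids, n_beta, k = 4, 20, 10
beta = rng.normal(size=n_beta)

# Stand-ins for the per-group idx and sign arrays from the loop above
idx = rng.integers(0, n_beta, size=(n_ids, k))
sign = rng.choice([-1.0, 1.0], size=(n_ids, k))

# Loop version: sum the 10 signed contributions for each ID
mu_loop = np.array([(beta[idx[i]] * sign[i]).sum() for i in range(n_ids)])

# "Wasteful" version: dense design matrix with zeros for unused betas
X = np.zeros((n_ids, n_beta))
for i in range(n_ids):
    np.add.at(X[i], idx[i], sign[i])  # accumulates repeated indices correctly
mu_dense = X @ beta

assert np.allclose(mu_loop, mu_dense)
```

Once X is built this way, the whole model reduces to a single mu = pm.math.dot(X, beta), which the sampler can typically handle far more efficiently than a Python-level loop over groups.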

If the feature matrix is very sparse, you could try going that route as an alternative. I think you need something like 5% or fewer non-zero entries before it’s “worth it”.
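For example, with 10 active terms out of 500 betas per row the matrix is 2% dense, below that rough 5% threshold. A sketch of the construction using scipy.sparse (sizes again hypothetical; in a PyMC model you would hand the result to PyTensor's sparse ops rather than SciPy):

```python
import numpy as np
from scipy import sparse

rng = np.random.default_rng(1)
n_ids, n_beta, k = 1000, 500, 10  # 10 of 500 entries per row: 2% dense
beta = rng.normal(size=n_beta)

idx = rng.integers(0, n_beta, size=(n_ids, k))
sign = rng.choice([-1.0, 1.0], size=(n_ids, k))

# COO-style triplets: (value, (row, col)); duplicates are summed on conversion
rows = np.repeat(np.arange(n_ids), k)
X_sparse = sparse.csr_matrix(
    (sign.ravel(), (rows, idx.ravel())), shape=(n_ids, n_beta)
)

mu_sparse = X_sparse @ beta
mu_dense = X_sparse.toarray() @ beta
assert np.allclose(mu_sparse, mu_dense)
```

Whether the sparse product actually beats the dense BLAS call at these sizes is something to benchmark on your data, which is why having the working dense version first matters.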