Multivariate Linear Regression with correlated input variables

Hello,
I am trying to fit a multivariate linear regression (Y = alpha + beta1 * X1 + beta2 * X2) in which my two input variables (X1, X2) are correlated.
I have uncertainties for all the variables (Y, X1, X2), and I am using the following code to estimate the intercept and the slopes:

import pymc3 as pm

with pm.Model() as model:
    # Priors for unknown model parameters
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta1 = pm.Normal("beta1", mu=0, sigma=10)
    beta2 = pm.Normal("beta2", mu=0, sigma=10)

    # Prior for the covariance of X1 and X2
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0,
        sd_dist=pm.Exponential.dist(1.0), compute_corr=True
    )

    μ = pm.Normal("μ", 0., 10., shape=2, testval=test_values)  # test_values defined elsewhere

    obs = pm.MvNormal("obs", μ, chol=chol, observed=observed_df)

    # Likelihood
    mu = alpha + beta1 * obs[:, 0] + beta2 * obs[:, 1]
    Y_obs = pm.StudentT("Y_obs", mu=mu, sigma=y_error,
                        observed=y, nu=5)

y and y_error are the dependent variable ‘Y’ and its uncertainty, respectively.
observed_df is a DataFrame with X1 and X2 as columns.

How can I incorporate the uncertainties on X1 and X2 in my model?


Hi, a couple of previous threads that might help:

Hi,
Thanks for the links. However, neither thread talks about correlated input variables. That’s the main part I would like to tackle.

How can one incorporate the uncertainties with this MvNormal construct?

obs = pm.MvNormal("obs", μ, chol=chol, observed=observed_df)

The observed argument here only accepts the X1 and X2 observed data and not their uncertainties.
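
For what it's worth, the errors-in-variables examples I have seen turn this construct around: the MvNormal becomes a latent prior on the unknown true X values (no observed=), and the measured X1 and X2 enter through Normal likelihoods centred on those latents with the known uncertainties. A rough, untested sketch, reusing the variable names from my model above:

import pymc3 as pm

with pm.Model() as model:
    # Priors for unknown model parameters
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta1 = pm.Normal("beta1", mu=0, sigma=10)
    beta2 = pm.Normal("beta2", mu=0, sigma=10)

    # Prior for the covariance of the true X1 and X2
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0,
        sd_dist=pm.Exponential.dist(1.0), compute_corr=True
    )
    μ = pm.Normal("μ", 0., 10., shape=2, testval=test_values)

    # Latent true X values: a free MvNormal (no observed=), so the
    # correlation structure acts as a prior rather than a likelihood
    X_true = pm.MvNormal("X_true", mu=μ, chol=chol,
                         shape=(len(observed_df), 2))

    # Measurement model: the data scatter around the latent true
    # values with the known per-point uncertainties
    pm.Normal("X1_obs", mu=X_true[:, 0], sigma=x1_error,
              observed=observed_df["X1"].values)
    pm.Normal("X2_obs", mu=X_true[:, 1], sigma=x2_error,
              observed=observed_df["X2"].values)

    # Likelihood, built on the latent true values
    mu = alpha + beta1 * X_true[:, 0] + beta2 * X_true[:, 1]
    Y_obs = pm.StudentT("Y_obs", mu=mu, sigma=y_error,
                        observed=y, nu=5)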

Here is another idea. Is this a possible solution? The changes w.r.t. my original block of code are in the section immediately above the likelihood.

import pymc3 as pm
import theano.tensor as tt

with pm.Model() as model:
    # Priors for unknown model parameters
    alpha = pm.Normal("alpha", mu=0, sigma=10)
    beta1 = pm.Normal("beta1", mu=0, sigma=10)
    beta2 = pm.Normal("beta2", mu=0, sigma=10)

    # Prior for the covariance of X1 and X2
    chol, corr, stds = pm.LKJCholeskyCov(
        "chol", n=2, eta=2.0,
        sd_dist=pm.Exponential.dist(1.0), compute_corr=True
    )

    μ = pm.Normal("μ", 0., 10., shape=2, testval=test_values)  # test_values defined elsewhere

    obs = pm.MvNormal("obs", μ, chol=chol, observed=observed_df)

    # Update: split the observed MvNormal into its X1 and X2 columns
    x1_split, x2_split = tt.split(obs, [1, 1], n_splits=2, axis=1)
    x1_split = tt.squeeze(x1_split)
    x2_split = tt.squeeze(x2_split)

    # Latent X values centred on the data, with the known uncertainties
    X1 = pm.Normal("X1", mu=x1_split, sigma=x1_error, shape=len(observed_df))
    X2 = pm.Normal("X2", mu=x2_split, sigma=x2_error, shape=len(observed_df))

    # Likelihood
    mu = alpha + beta1 * X1 + beta2 * X2
    Y_obs = pm.StudentT("Y_obs", mu=mu, sigma=y_error,
                        observed=y, nu=5)

Aha, I see. Then more questions, sorry:

  • Do you want to estimate the covariance of X1 and X2?
  • Do you think the errors in X1 and X2 might also covary, and you want to estimate that too?

Cheers, Jon

Thanks for the follow-up. I just wanted to estimate the covariance of X1 and X2, but I would be interested in knowing how to do the latter as well.
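
My naive guess for the covarying errors would be to replace the two independent Normal measurement terms in the errors-in-variables sketch above with a single MvNormal whose noise gets its own LKJ prior, something like this (untested):

    # Second LKJ prior, for the correlation of the measurement errors
    chol_err, corr_err, stds_err = pm.LKJCholeskyCov(
        "chol_err", n=2, eta=2.0,
        sd_dist=pm.Exponential.dist(1.0), compute_corr=True
    )

    # Observed (X1, X2) pairs scatter around the latent true values
    # with correlated noise
    X_obs = pm.MvNormal("X_obs", mu=X_true, chol=chol_err,
                        observed=observed_df[["X1", "X2"]].values)

Note this estimates the error scales instead of plugging in the known x1_error/x2_error; keeping the known scales while estimating only the correlation would need a hand-built covariance matrix.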

Hi again,
Does anyone have any ideas on how to do this?

I think you'll want your two betas to be a single MvNormal instead of independent Normals; that way you can put a prior on the correlation between the two.
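
Something like this, sketched from memory and untested (the HalfNormal scale and eta=2.0 are placeholders, and y/y_error/observed_df are the arrays from your model):

import numpy as np
import pymc3 as pm

x1 = observed_df["X1"].values
x2 = observed_df["X2"].values

with pm.Model() as model:
    alpha = pm.Normal("alpha", mu=0, sigma=10)

    # LKJ prior on the correlation between the two slopes
    chol_beta, corr_beta, stds_beta = pm.LKJCholeskyCov(
        "chol_beta", n=2, eta=2.0,
        sd_dist=pm.HalfNormal.dist(10.0), compute_corr=True
    )
    beta = pm.MvNormal("beta", mu=np.zeros(2), chol=chol_beta, shape=2)

    # Same likelihood as before, with the correlated slopes
    mu = alpha + beta[0] * x1 + beta[1] * x2
    Y_obs = pm.StudentT("Y_obs", mu=mu, sigma=y_error, nu=5, observed=y)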