Is this the correct way to do multivariate regression without using GLM?

mldl920 · July 17, 2020, 7:54pm

Hi how’s it going?

I wanted to do multivariate regression manually instead of using GLM, and I just want to make sure this is the correct implementation in the most basic sense, disregarding the choice of priors and numbers.

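import pymc3 as pm

# x is a DataFrame of predictors, y holds the observed outcome
with pm.Model() as m_5_1:
    # Priors on the intercept, the two slopes, and the noise scale
    a = pm.Normal("a", 10, 5)
    bA = pm.Normal("bA", 10, 5)
    bB = pm.Normal("bB", 10, 5)
    sigma = pm.Uniform("sigma", 0, 4)

    # Linear predictor; both slope terms go inside the Deterministic
    mu = pm.Deterministic("mu", a + bA * x['var1'] + bB * x['var2'])

    result = pm.Normal(
        "result", mu=mu, sigma=sigma, observed=y.values
    )
    trace = pm.sample()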

And then for predictions on new data I’m doing:

import numpy as np
import pandas as pd
import xarray as xr

rng = np.random.default_rng()

newdata = pd.read_csv('newdata.csv')
number_of_rows_in_newdata = newdata.shape[0]

new_data_0 = xr.DataArray(
    newdata['var1'],
    dims=["pred_id"]
)

new_data_1 = xr.DataArray(
    newdata['var2'],
    dims=["pred_id"]
)

pred_mean = (
    trace["a"][:number_of_rows_in_newdata] +
    trace["bA"][:number_of_rows_in_newdata] * new_data_0 +
    trace["bB"][:number_of_rows_in_newdata] * new_data_1
)

predictions = xr.apply_ufunc(
    lambda mu, sd: rng.normal(mu, sd),
    pred_mean,
    trace["sigma"][:number_of_rows_in_newdata],
)

Is there anything I’m doing wrong here, or is this the correct implementation of multivariate regression and subsequent out-of-sample predictions?

Thanks!

ckrapu · July 20, 2020, 4:32am

This is a correct implementation of multiple regression, which is different from multivariate regression. Multivariate regression typically refers to a multivariate outcome rather than multiple predictors. Here, you have a scalar outcome. Otherwise, everything looks fine.
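
One note on the out-of-sample step: slicing the trace to the number of new rows gives a single predictive draw per row. If a full predictive distribution per row is wanted, one option is to broadcast every posterior draw against every new row. Here is a minimal NumPy-only sketch of that idea. The arrays `a`, `bA`, `bB`, and `sigma` are hypothetical stand-ins for the 1-D posterior draws (e.g. `trace["a"]`), `var1` and `var2` are made-up new predictor values, and `predictive_draws` is a helper name I made up for illustration, not a PyMC function.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-ins for posterior draws (e.g. trace["a"], trace["bA"], ...)
n_draws = 2000
a = rng.normal(1.0, 0.1, size=n_draws)
bA = rng.normal(2.0, 0.1, size=n_draws)
bB = rng.normal(-0.5, 0.1, size=n_draws)
sigma = np.abs(rng.normal(0.3, 0.02, size=n_draws))

# Hypothetical new predictor values, one entry per new row
var1 = np.array([0.0, 1.0, 2.0])
var2 = np.array([1.0, 1.0, 0.0])


def predictive_draws(a, bA, bB, sigma, var1, var2, rng):
    """Return posterior predictive draws with shape (n_draws, n_rows)."""
    # [:, None] reshapes (n_draws,) to (n_draws, 1) so that each draw
    # broadcasts against the (n_rows,) predictor arrays.
    mu = a[:, None] + bA[:, None] * var1 + bB[:, None] * var2
    # Add observation noise using the sigma from the same posterior draw.
    return rng.normal(mu, sigma[:, None])


preds = predictive_draws(a, bA, bB, sigma, var1, var2, rng)

# Per-row summaries over the draws axis
pred_mean = preds.mean(axis=0)
pred_interval = np.percentile(preds, [2.5, 97.5], axis=0)
print(preds.shape)
```

Each column of `preds` then holds the predictive draws for one new row, so means and intervals come from reducing over axis 0.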

mldl920 · July 20, 2020, 11:21pm

Cool, thanks so much for the feedback!
