Prediction by Bayesian linear regression

Hello all,

I am quite new to Python and Bayesian linear regression. I managed to write code for a Bayesian linear regression model (although I don't know whether it is any good) that predicts from a SINGLE "x". However, my data consist of 3 independent variables and 1 dependent variable, and I would like to predict the dependent variable from all 3 inputs, not just the 1 input used in the following code.

Could you please help me with how to change the code?

I also could not understand how the model trains itself, because in the model context (pymc3) there is no sign of X_train or X_test being used! That is confusing to me.

Thank you all.

import numpy as np
import pandas as pd
import scipy
import theano
import pymc3 as pm


dataset = pd.read_csv('PV-PCM.csv')


from sklearn.model_selection import train_test_split

X=dataset.iloc[:,:-1].values
y=dataset.iloc[:,3].values

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size = 0.2, random_state=42)


with pm.Model() as linear_model:
    # Intercept
    intercept = pm.Normal('Intercept', mu = 0, sd = 10)
    
    # Slope 
    slope = pm.Normal('slope', mu = 0, sd = 10)
    
    # Standard deviation
    sigma = pm.HalfNormal('sigma', sd = 10)
    
    # Estimate of mean
    mean = intercept + slope * dataset.iloc[:,0].values
    
    # Observed values
    Y_obs = pm.Normal('Y_obs', mu = mean, sd = sigma, observed = y)
    
    # Sampler
    step = pm.NUTS()
    
    # Posterior distribution
    linear_trace = pm.sample(1000, step)
    
    pm.traceplot(linear_trace, figsize = (12, 12))
    
    pm.plot_posterior(linear_trace, figsize = (12, 10))
    
    bayes_prediction = linear_trace['Intercept'] + linear_trace['slope'] * "Any number for ONE IV"

@OriolAbril :blush: Thanks.


To modify your code to include additional predictors, you basically just have to add additional parameters/coefficients/slopes:

    # Slopes
    slope1 = pm.Normal('slope1', mu = 0, sd = 10)
    slope2 = pm.Normal('slope2', mu = 0, sd = 10)
    slope3 = pm.Normal('slope3', mu = 0, sd = 10)

Then use them:

    # Estimate of mean
    mean = (intercept + 
           (slope1 * dataset['x1']) +
           (slope2 * dataset['x2']) +
           (slope3 * dataset['x3']) )

There are more compact ways of specifying this, but it’s hard to go wrong with verbose and explicit.
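For reference, one of the more compact forms is a vector-valued slope combined with a dot product. This is only a sketch and assumes, as in the original code, that the three predictors are the first three columns of dataset:

X = dataset.iloc[:, :3].values

with pm.Model() as linear_model_vec:
    intercept = pm.Normal('Intercept', mu=0, sd=10)
    # One vector of three slopes instead of three scalar variables
    slopes = pm.Normal('slopes', mu=0, sd=10, shape=3)
    sigma = pm.HalfNormal('sigma', sd=10)

    # The dot product gives the same mean as the three explicit terms above
    mean = intercept + pm.math.dot(X, slopes)

    Y_obs = pm.Normal('Y_obs', mu=mean, sd=sigma, observed=y)
    linear_trace_vec = pm.sample()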


As for prediction, your code currently trains on all the data; it is not using the train/test split you constructed at the beginning. To use the split, you would fit on the training data instead of the full dataset:

    # Estimate of mean
    mean = intercept + slope * X_train[:, 0]

and

Y_obs = pm.Normal('Y_obs', mu = mean, sd = sigma, observed = y_train)

Hi, welcome!

The predictors are used in mean = intercept + slope * dataset.iloc[:,0].values; here you are using the first column of dataset as the predictor, computing intercept + slope * x_0. You can define this however you want. For a regression with a single dependent variable and multiple predictors, you can use, for example,

slope = pm.Normal('slope', mu = 0, sd = 10, shape=3)
mean = intercept + slope[0] * dataset.iloc[:,0].values + slope[1] * dataset.iloc[:,1].values + slope[2] * dataset.iloc[:,2].values

in order to use 3 predictors, each with its own slope. You could also set different priors for the slopes…

I don’t know much about the context here, but depending on what you eventually plan to work on and which extensions the model might need, I’d recommend taking a look at the pymc3.glm module and/or at bambi.
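For instance, with the formula interface in pymc3.glm the whole model can be written in a couple of lines. This is just a sketch: the column names y, x1, x2 and x3 are placeholders for whatever your CSV actually contains.

with pm.Model() as glm_model:
    # Builds an intercept, one slope per predictor and a noise term
    # from a patsy-style formula (column names are placeholders)
    pm.GLM.from_formula('y ~ x1 + x2 + x3', dataset)
    glm_trace = pm.sample()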

The predictors are used, as I commented above, in defining mean. The observed values (y in this case) are passed to PyMC3 with the observed keyword; this is done in Y_obs, which defines the normal likelihood of the model.

Again, I don’t know what you plan to use the model for, but in general there is no need to split the data into train and test subsets for fitting. You can evaluate the goodness of fit by other means, and overfitting is generally not an issue thanks to the probabilistic nature of the modeling plus the effect of the priors.

To give a concrete example, you can estimate leave-one-out cross-validation with az.loo, as done in this notebook
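As a rough sketch (assuming the linear_model and linear_trace objects defined earlier in the thread), that would look something like:

import arviz as az

# Convert the PyMC3 trace to InferenceData, then estimate
# leave-one-out cross-validation from the pointwise log-likelihood
with linear_model:
    idata = az.from_pymc3(trace=linear_trace)

print(az.loo(idata))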


Thank you @cluhmann @OriolAbril for your answers. I really appreciate it.

Now, I want to predict the DV with the following code but I get an error:

bayes_prediction = linear_trace['Intercept'] + linear_trace['slope'] * [16,31,35]

[16,31,35] are three example values for the three IVs, one to be multiplied by each of slope[0], slope[1] and slope[2].
I don’t know how to combine the three slopes to get a prediction. Ultimately I would like to plug in my X_test values for my 3 IVs.

I appreciate it if you could help me with this as well.

Thank you.

I’m not sure I understand what this is supposed to mean.

Here is a model:

with pm.Model() as linear_model:
    # Intercept
    intercept = pm.Normal('Intercept', mu = 0, sd = 10)
    
    # Slope 
    slope1 = pm.Normal('slope1', mu = 0, sd = 10)
    slope2 = pm.Normal('slope2', mu = 0, sd = 10)
    slope3 = pm.Normal('slope3', mu = 0, sd = 10)
    
    # Standard deviation
    sigma = pm.HalfNormal('sigma', sd = 10)
    
    # Estimate of mean
    mean = (intercept + 
            slope1 * X_train[:,0] +
            slope2 * X_train[:,1] +
            slope3 * X_train[:,2])
    
    # Observed values
    Y_obs = pm.Normal('Y_obs',
                      mu = mean,
                      sd = sigma,
                      observed = y_train)

    # Posterior distribution
    linear_trace = pm.sample()

Now you can use X_test to make predictions about y_test. How you do this will depend strongly on what exactly you are looking for. Presumably, you are interested in the predictions associated with the individual observations in the test set.

# which observation from the test
# set do we want to inspect?
obs_index = 0

ppc = (linear_trace['Intercept'] + 
      (linear_trace['slope1'] * X_test[obs_index,0]) + 
      (linear_trace['slope2'] * X_test[obs_index,1]) + 
      (linear_trace['slope3'] * X_test[obs_index,2]) )

The variable ppc will contain a full posterior distribution over the regression mean for the relevant observation (i.e., the observation whose true value is found in y_test[obs_index]). What you do from there depends on your needs.
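If you also want predictive draws of y itself, including the observation noise, one option (a sketch building on the ppc array above) is to add Normal noise using the posterior samples of sigma:

# Add observation noise to turn draws of the mean into draws of y
# (uses the posterior samples of sigma stored in linear_trace)
y_pred = np.random.normal(loc=ppc, scale=linear_trace['sigma'])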


Thank you so much.