I have built a Bayesian linear regression model using PyMC3. On evaluation, I found that the MAPE (mean absolute percentage error) on the training set is higher than on the test set. What could be the reason for this, given that the same variable means are used for the test set?
This is hard to answer without more information. Have you tried other error metrics, like RMSE, that aren’t scaled as a percentage? Have you tried randomly selecting a new training/test split? If n is small, you might have just gotten lucky and happened to draw a test set that the linear model predicts well. That isn’t impossible in theory, just increasingly unlikely as n grows. If n is large, and RMSE still shows lower error on the test set than on the training set even after a new random train/test split, it might just be model misspecification.
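To make the comparison concrete, here is a minimal sketch of checking both metrics on a single random split. It uses a plain `LinearRegression` as a stand-in for the posterior-mean predictions of your PyMC3 model, and synthetic `X`, `y` as placeholders for your data — both are assumptions, not your actual setup:

```python
# Sketch: compare MAPE and RMSE on train vs. test for one random split.
# LinearRegression stands in for the PyMC3 posterior-mean predictor;
# X, y below are synthetic placeholders for the real data.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + 1.0 + rng.normal(scale=0.1, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=42)
model = LinearRegression().fit(X_tr, y_tr)

def mape(y_true, y_pred):
    # Mean absolute percentage error; undefined if y_true contains zeros.
    return np.mean(np.abs((y_true - y_pred) / y_true))

def rmse(y_true, y_pred):
    return np.sqrt(np.mean((y_true - y_pred) ** 2))

for name, Xs, ys in [("train", X_tr, y_tr), ("test", X_te, y_te)]:
    pred = model.predict(Xs)
    print(f"{name}: MAPE={mape(ys, pred):.4f}  RMSE={rmse(ys, pred):.4f}")
```

Note that MAPE divides by the true values, so it can look very different from RMSE when some targets are close to zero — which is one reason to check both.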
Thanks. I checked MAE_train and MAE_test, which are 0.0523275 and 0.0571307 respectively, and RMSE_train and RMSE_test, which are 0.066204 and 0.0608991 respectively. The test set is very small, so I should do cross-validation and check the results.
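With a small test set, K-fold cross-validation gives a per-fold error distribution instead of a single noisy number. A minimal sketch, again using `LinearRegression` on synthetic data as a stand-in for the PyMC3 model:

```python
# Sketch: 5-fold cross-validated MAE, so one small test set does not
# drive the train-vs-test comparison. LinearRegression stands in for
# the PyMC3 model; X, y are synthetic placeholders.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import KFold

rng = np.random.default_rng(1)
X = rng.normal(size=(100, 3))
y = X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=100)

maes = []
for train_idx, test_idx in KFold(n_splits=5, shuffle=True, random_state=0).split(X):
    model = LinearRegression().fit(X[train_idx], y[train_idx])
    pred = model.predict(X[test_idx])
    maes.append(np.mean(np.abs(y[test_idx] - pred)))

print("per-fold MAE:", np.round(maes, 4))
print("mean:", np.mean(maes), "sd:", np.std(maes))
```

If the spread across folds is comparable to the 0.0048 train/test MAE gap you reported, that gap is likely just sampling noise.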
Hmm. I would try another random train/test split and rerun the model.
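The suggestion above can be automated by looping over several split seeds and seeing whether test error beating train error persists. A hedged sketch, once more with a plain `LinearRegression` on synthetic data rather than the actual PyMC3 model:

```python
# Sketch: repeat the train/test split over several seeds to check whether
# "test error below train error" is stable or an artifact of one split.
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
X = rng.normal(size=(120, 2))
y = X @ np.array([2.0, -1.0]) + rng.normal(scale=0.2, size=120)

for seed in range(5):
    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=seed)
    m = LinearRegression().fit(X_tr, y_tr)
    mae_tr = np.mean(np.abs(y_tr - m.predict(X_tr)))
    mae_te = np.mean(np.abs(y_te - m.predict(X_te)))
    print(f"seed {seed}: MAE train={mae_tr:.4f}  test={mae_te:.4f}")
```

If test MAE beats train MAE for most seeds, the pattern is systematic; if it flips sign from seed to seed, it was just the luck of one split.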