This is hard to answer without more information. Have you tried other error metrics, such as RMSE, that aren’t scaled as a percentage? Have you tried randomly selecting a new training/test split? If n is small, you might have just got lucky and happened to draw a test set that the linear model predicts well. That isn’t impossible in theory, just increasingly unlikely as n gets larger. If n is large and RMSE also shows better results on both training and test, including on a new random training/test split, it might just be model misspecification.
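
One way to try both suggestions (RMSE instead of a percentage metric, and fresh random splits) is sketched below. Treat it as an illustration under assumptions, not a recipe for your data: the data is synthetic, the model is a one-predictor least-squares line, and the names `fit_line`, `rmse`, and `repeated_split_rmse` are made up for this example rather than taken from any library.

```python
import math
import random
import statistics


def fit_line(xs, ys):
    """Ordinary least squares for y = intercept + slope * x (one predictor)."""
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rmse(xs, ys, intercept, slope):
    """Root mean squared error, in the units of y rather than a percentage."""
    squared = [(y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))


def repeated_split_rmse(xs, ys, n_splits=200, test_fraction=0.25, seed=0):
    """Return (train_rmse, test_rmse) for each of n_splits random splits."""
    rng = random.Random(seed)
    indices = list(range(len(xs)))
    n_test = max(1, round(test_fraction * len(xs)))
    results = []
    for _ in range(n_splits):
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        x_train = [xs[i] for i in train_idx]
        y_train = [ys[i] for i in train_idx]
        x_test = [xs[i] for i in test_idx]
        y_test = [ys[i] for i in test_idx]
        # Fit on the training part only, then score both parts.
        intercept, slope = fit_line(x_train, y_train)
        results.append((rmse(x_train, y_train, intercept, slope),
                        rmse(x_test, y_test, intercept, slope)))
    return results


# Synthetic data (an assumption for illustration): small n, linear signal plus noise.
data_rng = random.Random(42)
xs = [data_rng.uniform(0, 10) for _ in range(40)]
ys = [2.0 + 3.0 * x + data_rng.gauss(0, 2.0) for x in xs]

results = repeated_split_rmse(xs, ys)
train_rmses = [train for train, _ in results]
test_rmses = [test for _, test in results]

print("mean train RMSE:", statistics.mean(train_rmses))
print("mean test RMSE: ", statistics.mean(test_rmses))
print("test RMSE range:", min(test_rmses), "to", max(test_rmses))
```

If the result from your original split sits far out in the range of test RMSEs you get this way, a lucky split is a plausible explanation. If it shows up on most splits and n is large, luck becomes a much weaker explanation.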