Your input matrix has a really large condition number, which means that many of the columns are likely redundant:
np.linalg.cond(train_norm_retained) # ==> 7.768470541187648e+17
You can try some dimension reduction trick first on train_norm_retained (for example, a PCA or SVD).