Rookie Question On Combining PCA and Bayesian Inference

I believe the PCA model I’ve been running uses only one component.

scale_function = lambda x: (x - x.mean()) / x.std()
pca = PCA(n_components=1).fit(data.apply(scale_function))

Does that mean I have one factor, or, because I have three variables, would it be three factors?
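One way to check this is to look at the shapes scikit-learn reports after the fit. The sketch below is only illustrative: it assumes synthetic data (three made-up correlated series named `a`, `b`, `c`) in place of the real `data`. With `n_components=1` the fit should yield a single component (one factor series), and the three input variables show up as three loadings on that component.

```python
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

# Synthetic stand-in for `data`: three correlated series (illustrative only)
rng = np.random.default_rng(0)
common = rng.normal(size=200)
data = pd.DataFrame({
    "a": common + 0.3 * rng.normal(size=200),
    "b": common + 0.3 * rng.normal(size=200),
    "c": common + 0.3 * rng.normal(size=200),
})

scale_function = lambda x: (x - x.mean()) / x.std()
scaled = data.apply(scale_function)
pca = PCA(n_components=1).fit(scaled)

loadings = pca.components_                  # one row per component, one column per variable
factor = pca.transform(scaled)              # one column per component
explained = pca.explained_variance_ratio_   # share of total variance per component

print(loadings.shape)   # (1, 3)
print(factor.shape)     # (200, 1)
print(explained)
```

If `explained` is well below 1 on the real data, a single component may be discarding a lot of information, and raising `n_components` would be worth trying.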

Laplace sounds interesting. I don’t know much about it, but when I looked at the distributions of my economic series, they have quite high peaks and slim bodies, so it might be a good match.
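A rough way to test the "high peak, slim body" impression is to compare how well a normal and a Laplace distribution fit the same series. The sketch below is an assumption-laden illustration, not a definitive test: the `series` is simulated, the helper names are made up for this example, and it uses the standard maximum-likelihood estimates (mean and standard deviation for the normal; median and mean absolute deviation from the median for the Laplace).

```python
import numpy as np

def normal_loglik(x):
    # Normal MLE: sample mean and (ddof=0) standard deviation
    mu, sigma = x.mean(), x.std()
    return np.sum(-0.5 * np.log(2 * np.pi * sigma**2) - (x - mu) ** 2 / (2 * sigma**2))

def laplace_loglik(x):
    # Laplace MLE: location is the median, scale is the mean absolute deviation from it
    loc = np.median(x)
    scale = np.mean(np.abs(x - loc))
    return np.sum(-np.log(2 * scale) - np.abs(x - loc) / scale)

def excess_kurtosis(x):
    # About 0 for a normal sample, about 3 for a Laplace sample
    z = (x - x.mean()) / x.std()
    return np.mean(z**4) - 3.0

# Simulated stand-in for one economic series (illustrative only)
rng = np.random.default_rng(1)
series = rng.laplace(loc=0.0, scale=1.0, size=5000)

print("excess kurtosis:", excess_kurtosis(series))
print("normal log-likelihood: ", normal_loglik(series))
print("laplace log-likelihood:", laplace_loglik(series))
```

A higher Laplace log-likelihood together with clearly positive excess kurtosis would support the Laplace as the better match for that series.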

Let me clarify my weighting scheme. The idea was to start with a total of 100 economic series and narrow that down to 20. To each of those 20 I would apply a weight that depends on its explanatory power, but I wouldn’t keep the weights fixed over time. They would adjust month after month according to each series’ explanatory power. If, for example, consumption’s explanatory power on GDP was growing stronger compared to the prior month, its weight would be adjusted upward to reflect that. I hope that is clear.
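The scheme described above could be sketched roughly as follows. This is one possible reading, not a settled design: it assumes "explanatory power" means the R-squared of a one-variable regression on GDP over a trailing window, and the function name `rolling_r2_weights`, the 36-month window, and the simulated series are all hypothetical choices for illustration.

```python
import numpy as np
import pandas as pd

def rolling_r2_weights(indicators, target, window=36):
    # For a one-regressor OLS with an intercept, R^2 equals the squared correlation,
    # so a rolling correlation gives a rolling R^2 without refitting each month.
    r2 = indicators.apply(lambda col: col.rolling(window).corr(target) ** 2)
    # Normalise across indicators so each month's weights sum to 1
    return r2.div(r2.sum(axis=1), axis=0)

# Simulated monthly data (illustrative only)
rng = np.random.default_rng(0)
dates = pd.date_range("2000-01-01", periods=120, freq="MS")
indicators = pd.DataFrame(rng.normal(size=(120, 3)),
                          index=dates,
                          columns=["consumption", "investment", "exports"])
gdp = 0.8 * indicators["consumption"] + 0.2 * pd.Series(rng.normal(size=120), index=dates)

weights = rolling_r2_weights(indicators, gdp, window=36)
print(weights.dropna().tail())
```

The first `window - 1` rows are NaN because there is not yet enough history. Each row uses only data up to that month, so applying a month's weights to the following month (for example via `weights.shift(1)`) avoids look-ahead.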

My evaluation metric for selecting the economic indicators was an OLS regression of GDP on each indicator, one at a time. I looked at R-squared as well as mean squared error, heteroscedasticity and the p-value.

Here is the regression function I built to run regressions more quickly. Perhaps I’ve made some errors without realizing it.

import statsmodels.api as sm
from statsmodels.stats.diagnostic import het_breuschpagan

def regression(x_var, y_var):
    # combine_data_dropna is a helper defined elsewhere (not shown here)
    df = combine_data_dropna([x_var, y_var])

    # Full-sample fit: column 0 is x_var, column 1 is y_var
    X = sm.add_constant(df.iloc[:,0])
    Y = df.iloc[:,1]
    res = sm.OLS(Y,X).fit()

    # Chronological 80/20 split for an out-of-sample check
    split_index = round(len(df)*0.80)
    split_date = df.index[split_index]
    df_train = df[df.index <= split_date].copy()
    df_test = df.loc[df.index > split_date].copy()
    X_train = sm.add_constant(df_train.iloc[:,0].values)
    y_train = df_train.iloc[:,1].values
    # has_constant='add' forces the intercept column even if the test values happen to be constant
    X_test = sm.add_constant(df_test.iloc[:,0].values, has_constant='add')
    y_test = df_test.iloc[:,1].values

    # Fit on the training window only, then predict both windows
    ols_results = sm.OLS(y_train,X_train).fit()
    y_pred_train = ols_results.predict(X_train)
    y_pred_test = ols_results.predict(X_test)
    
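    # het_breuschpagan returns (LM statistic, LM p-value, F statistic, F p-value)
    h_test = het_breuschpagan(res.resid, res.model.exog)
    print(res.summary())
    # mse_resid is the residual mean squared error; mse_model is the explained mean square
    print('In-Sample Residual MSE: ' + str(round(res.mse_resid, 3)))
    print('Out-of-Sample MSE: ' + str(round(((y_test - y_pred_test)**2).mean(), 3)))
    print('Breusch-Pagan LM Statistic: ' + str(round(h_test[0], 3)))
    print('Breusch-Pagan P-Value: ' + str(round(h_test[1], 3)))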

I appreciate the help. As you can tell, I am out of my depth.