I am implementing (what I think is) a straightforward logistic regression, trying to predict a binary group label from a series of predictors. The predictors are principal component scores, retaining the PCs that explain 90% of the variance of a set of image pixels (256 * 256 images of faces) from 1,333 faces, so the final design matrix is 1333 * 148.
The PCs are all centred and scaled (though this is maybe not necessary), but the sampler runs very slowly and does not converge, ending with several thousand divergences, which seem to kick in around halfway through sampling. I can't get it to converge even with very constrained priors on the coefficients, which is about the extent of my knowledge of fixing divergences.
Minimal reproducible example below:
# Just to show the PC and scaling procedure with sklearn #
# scaler = Pipeline([('scaler', StandardScaler()), ('PCA', PCA(n_components=.90)), ('rescaler', StandardScaler())])
# Scaling applied here to large pixel array #
# preds = scaler.fit_transform(pixel_arr)
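For context, here is that preprocessing step run end-to-end on synthetic data (the array shape here is made up purely for illustration; my real input is the 1333-face pixel array):

```python
import numpy as np
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA

# Synthetic stand-in for the real pixel array (rows = faces, cols = pixels)
rng = np.random.default_rng(0)
pixel_arr = rng.normal(size=(200, 50))

# Scale pixels, keep PCs explaining 90% of the variance, re-scale the PCs
scaler = Pipeline([('scaler', StandardScaler()),
                   ('PCA', PCA(n_components=.90)),
                   ('rescaler', StandardScaler())])
preds = scaler.fit_transform(pixel_arr)

# After the final StandardScaler, each PC has mean ~0 and sd ~1
print(preds.shape[0])                      # one row per face
print(np.allclose(preds.mean(axis=0), 0))
print(np.allclose(preds.std(axis=0), 1))
```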
import pandas as pd
import pymc3 as pm
# Read data
preds = pd.read_csv('https://osf.io/s53vm/download').set_index('Unnamed: 0')
label_data = pd.read_csv('https://osf.io/pe2n7/download').set_index('Unnamed: 0')
labels = label_data.squeeze()
# Set up model
with pm.Model() as classifier1:
    # Set data (for holdout classification later)
    Xdata = pm.Data('Xdata', preds)
    ydata = pm.Data('ydata', labels)

    # Priors
    β = pm.Normal('coefs', 0, 1, shape=Xdata.dshape)
    β0 = pm.Normal('intercept', 0, 1)

    # Linear combination
    p = pm.math.invlogit(β0 + pm.math.dot(Xdata, β))
    proba = pm.Deterministic('proba', p)

    # Likelihood
    _ = pm.Bernoulli('likelihood', p=p, observed=ydata)

    # Inference button
    idata = pm.sample(1000, cores=8, return_inferencedata=True)
I am sure this is relatively simple but I can’t seem to stop making the sampler angry! Any help is greatly appreciated. Thank you!