I have the following logistic regression model, in which the coefficients of all predictors have a N(0, 1) prior:
with pm.Model(coords={"predictors": X_binary.columns.values}) as model_1:
    X = pm.MutableData('X', X_train)
    y = pm.MutableData('y', y_train)
    constant = pm.Normal('constant', mu=-0.5, sigma=0.1)
    beta = pm.Normal('beta', mu=0, sigma=1, dims="predictors")
    score = pm.Deterministic('score', X @ beta)
Now I would like to give the coefficients of a few specific predictors a N(1, 0.5) prior instead. I tried the following:
list_of_bad = ['remove-update', 'invoke-ssidexfil', '-payload']
list_of_normal = [pred for pred in X_binary.columns.values
                  if pred not in list_of_bad]
with pm.Model(coords={"normal_pred": list_of_normal, "bad_pred": list_of_bad}) as model_2:
    X = pm.MutableData('X', X_train)
    y = pm.MutableData('y', y_train)
    constant = pm.Normal('constant', mu=-0.5, sigma=0.1)
    beta_normal = pm.Normal('beta_normal', mu=0, sigma=1, dims='normal_pred')
    beta_bad = pm.Normal('beta_bad', mu=1, sigma=0.5, dims='bad_pred')
    beta = pm.math.concatenate([beta_normal, beta_bad])
    score = pm.Deterministic('score', X @ beta)
But that doesn't seem to work: the results I get look wrong. Could it be that concatenating puts the coefficients in a different order (all "normal" ones first, then the "bad" ones), so that they no longer line up with the column order in X? I'm not sure.
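To make my suspicion concrete, here is a minimal NumPy-only sketch of the ordering issue. Everything in it is made up for illustration: the four column names, the tiny X matrix, and the coefficient values are hypothetical stand-ins, not my real data. It compares the score computed with coefficients in column order against the score computed with the concatenated (normal first, then bad) order, and then tries two possible workarounds I can think of.

```python
import numpy as np

# Hypothetical stand-in for X_binary.columns.values (not the real columns)
columns = np.array(["a", "remove-update", "b", "-payload"])
list_of_bad = ["remove-update", "-payload"]

# Tiny hand-written binary design matrix, one column per name above
X = np.array([
    [1.0, 1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0, 1.0],
    [1.0, 0.0, 1.0, 0.0],
])

# Made-up coefficients, one per column, in the original column order
beta_in_column_order = np.array([0.1, 1.2, -0.3, 0.9])

# Boolean mask marking which columns are "bad"
is_bad = np.isin(columns, list_of_bad)

# Mimics concatenate([beta_normal, beta_bad]): normal first, then bad
beta_concat = np.concatenate(
    [beta_in_column_order[~is_bad], beta_in_column_order[is_bad]]
)

score_reference = X @ beta_in_column_order
score_misaligned = X @ beta_concat  # coefficients meet the wrong columns

# Possible workaround 1: reorder the columns of X to match the concatenation
new_order = np.concatenate([np.flatnonzero(~is_bad), np.flatnonzero(is_bad)])
score_reordered = X[:, new_order] @ beta_concat

# Possible workaround 2: keep one coefficient vector in column order and
# build per-column prior parameters instead of two separate vectors
prior_mu = np.where(is_bad, 1.0, 0.0)
prior_sigma = np.where(is_bad, 0.5, 1.0)

print(np.allclose(score_misaligned, score_reference))
print(np.allclose(score_reordered, score_reference))
print(prior_mu, prior_sigma)
```

If this reasoning is right, I assume I could either reorder the columns of X_train in the same way before passing it to the model, or keep a single beta with dims="predictors" and pass arrays like prior_mu and prior_sigma as its mu and sigma. I have not verified either option on my actual model.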
Thanks in advance!