My greetings to the community,
I’m developing a softmax regression model using PyMC v5.9.0.
My training data are as follows:
xObservedScaled: 124 observations x 102 features
Nclasses: 3
Of the 102 features, 2 (age and bmi) were included so that they could be controlled for.
The model I used was the following:
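import pymc as pm
import pytensor.tensor as pt

with pm.Model() as model:
    # Priors: one intercept per class, one coefficient per (feature, class) pair
    alpha = pm.Normal('alpha', mu=0, sigma=1, shape=Nclasses)
    beta = pm.Normal('beta', mu=0, sigma=0.5, shape=(Nfeatures, Nclasses))
    # Mutable container so the predictors can be swapped out later
    X = pm.MutableData("X", xObservedScaled)
    # Linear predictor and class probabilities
    mu = alpha + pm.math.dot(X, beta)
    theta = pm.Deterministic('theta', pt.special.softmax(mu, axis=1))
    yhat = pm.Categorical('yhat', p=theta, observed=yObserved)
    idata = pm.sample(2000)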
Based on the documentation, to make predictions on new data I should use pm.set_data(), as in the following code:
with model:
    pm.set_data({"X": xNewScaled})
    predictions = pm.sample_posterior_predictive(idata, model=model, predictions=True, var_names=['theta'])
However, for the above code to work, xNewScaled must have the same number of features (102) as xObservedScaled. In my case, I want to make predictions without the controlled variables (age and bmi). How can I achieve this?
Thank you in advance for any help.
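To make the shape constraint concrete, here is a minimal sketch of one possible workaround, not a definitive answer. It pads a reduced matrix back to 102 columns by filling the control columns with a constant. The helper name `pad_controls`, the column positions of age and bmi (0 and 1), and the random example data are all hypothetical. The sketch also assumes the features were standardized, so that a fill value of 0 corresponds to the training mean.

```python
import numpy as np

def pad_controls(x_reduced, control_idx, n_features, fill=0.0):
    """Return an array with n_features columns: the columns listed in
    control_idx are set to `fill`, the rest are taken from x_reduced in order."""
    x_reduced = np.asarray(x_reduced, dtype=float)
    control = set(control_idx)
    if x_reduced.shape[1] + len(control) != n_features:
        raise ValueError("column counts do not add up to n_features")
    keep_idx = [j for j in range(n_features) if j not in control]
    x_full = np.full((x_reduced.shape[0], n_features), fill, dtype=float)
    x_full[:, keep_idx] = x_reduced
    return x_full

# Hypothetical example: 5 new rows with the 100 non-control features only;
# age and bmi are ASSUMED to be columns 0 and 1 of the training matrix.
rng = np.random.default_rng(0)
x_new_reduced = rng.normal(size=(5, 100))
x_new_scaled = pad_controls(x_new_reduced, control_idx=[0, 1], n_features=102)
print(x_new_scaled.shape)  # (5, 102)
```

The padded array could then be passed to pm.set_data({"X": x_new_scaled}). With the control columns at 0, the age and bmi coefficients contribute nothing to mu, so under the standardization assumption the predictions describe a subject at the training-mean age and bmi. Whether that is the intended meaning of "without the controlled variables" depends on the analysis.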