When you have a simple linear regression, you can look at the parameter values and make statements about the effect of X on y; specifically, “a unit change in X causes a \beta unit change in y”. This is all tied to the partial derivative \frac{\partial y}{\partial X}. Since you have a model of the form y = X\beta, this derivative is just \beta.
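As a quick sanity check with made-up numbers (a slope of 2.5, no intercept, single feature), the unit-change interpretation falls right out:

```python
beta = 2.5  # made-up slope


def predict(x):
    # the linear model y = x * beta
    return x * beta


# A unit change in x changes the prediction by exactly beta,
# no matter where you start:
print(predict(4.0) - predict(3.0))   # 2.5
print(predict(10.0) - predict(9.0))  # 2.5
```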
In logistic regression (which I guess you’re doing?), you have to deal with the non-linear effect of the link function. The effect of X on y, \frac{\partial y}{\partial X}, under the model y = \text{Logit}^{-1}(X\beta) will not be just \beta. The inverse logit function is \text{Logit}^{-1}(x) = \frac{1}{1 + \exp(-x)}. From the presence of the exponential you can guess that the effect of a unit shift in X is now going to depend on the level of X. The “traditional” way to talk about impacts here is to talk about log-odds – see here for an explanation. The short version is that the logit function (not the inverse!) is equal to the log odds, so if you take the logit of both sides of the model, e.g. \text{Logit}(y) = \text{Logit}(\text{Logit}^{-1}(X\beta)) \implies \text{Logit}(y) = X\beta \implies \ln \left ( \frac{y}{1 - y} \right ) = X \beta. So a unit change in X can be interpreted linearly with respect to the log-odds of survival. In your example you could say that, holding all else equal, being assigned a lifeboat increases the log-odds of survival by 0.85, or multiplies the odds of survival by \exp(0.85) \approx 2.3. So a passenger assigned a lifeboat had about a 130% better shot at making it than one without.
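To make those numbers concrete, here’s a small check (using the 0.85 lifeboat coefficient from above) that the odds ratio is \exp(0.85) \approx 2.3, and that the same unit shift moves the *probability* by different amounts depending on where you start:

```python
import numpy as np


def inv_logit(x):
    # Logit^{-1}(x) = 1 / (1 + exp(-x))
    return 1.0 / (1.0 + np.exp(-x))


beta = 0.85  # the lifeboat coefficient from above

# On the log-odds scale the effect is constant: always +0.85.
# On the odds scale it is a constant multiplier:
odds_ratio = np.exp(beta)  # ~2.34, i.e. ~130% higher odds
print(f"odds ratio: {odds_ratio:.2f}")

# On the probability scale the effect depends on the baseline log-odds:
for base in [-2.0, 0.0, 2.0]:
    delta_p = inv_logit(base + beta) - inv_logit(base)
    print(f"baseline logit {base:+.1f}: unit shift changes p by {delta_p:+.3f}")
```

Note that the probability-scale change is largest near p = 0.5 and shrinks toward the tails, which is exactly why there is no single “\beta-sized” effect to report.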
You will be forgiven for thinking this all sucks. For one thing, I discarded all the posterior uncertainty. For another, you have to work out how to reason about these partial derivatives every time you work with a different model. Instead, you can run simulations using pm.set_data. To illustrate, I’ll make a similar logistic regression model for Titanic survivors:
import pymc as pm
from sklearn import datasets
import numpy as np
import matplotlib.pyplot as plt
import arviz as az
data = datasets.fetch_openml(name='titanic', version=1)
X = data['data']
y = data['target']
# Names look like "Braund, Mr. Owen Harris" -- pull out surname and honorific
X['last_name'] = X.name.str.split(',').str[0].str.strip()
X['first_name'] = X.name.str.split(',').str[1:].str.join(',').str.strip()
X['honorific'] = X.first_name.str.split('.').str[0].str.strip()
normal_people = ['Miss', 'Mr', 'Mrs', 'Mme', 'Ms']
X['has_title'] = ~X.honorific.isin(normal_people)
X['first_class'] = X.pclass == 1
X['second_class'] = X.pclass == 2
X['female'] = X['sex'] == 'female'
# A fare of 0 gives log(0) = -inf; replace those with 0
X['log_fare'] = X.fare.apply(np.log).replace(-np.inf, np.nan).fillna(0)
# 'boat' is NaN for passengers not assigned a lifeboat
X['no_boat'] = X.boat.isna()
features = ['first_class', 'second_class', 'has_title', 'female', 'log_fare', 'no_boat']
coords = {'feature': features,
          'obs_idx': X.index.values}

with pm.Model(coords=coords) as model:
    X_pt = pm.Data('X', X[features].astype(float), dims=['obs_idx', 'feature'])
    y_pt = pm.Data('y', y.astype(int), dims=['obs_idx'])

    alpha = pm.Normal('alpha', 0, 1)
    beta = pm.Normal('beta', 0, 3, dims=['feature'])

    logit_p = alpha + X_pt @ beta
    y_hat = pm.Bernoulli('y_hat', logit_p=logit_p, observed=y_pt, dims=['obs_idx'])

    idata = pm.sample()
    idata = pm.sample_posterior_predictive(idata, extend_inferencedata=True)
My estimates are somewhat different from yours, which I guess reflects that I don’t know what your variables mean. But that’s not really the point of the exercise, so it doesn’t matter.
We can see what predictions our model would make if everyone had no_boat = 1.0, or everyone had no_boat = 0.0. Comparing these will give insight into the effect of the no_boat variable on the model. To do this, use pm.set_data together with pm.sample_posterior_predictive:
with model:
    pm.set_data({'X': X[features].assign(no_boat=1.0)})
    idata_no_boat = pm.sample_posterior_predictive(idata, predictions=True, extend_inferencedata=False)

    pm.set_data({'X': X[features].assign(no_boat=0.0)})
    idata_boat = pm.sample_posterior_predictive(idata, predictions=True, extend_inferencedata=False)
fig, ax = plt.subplots(1, 3, figsize=(14, 4))
az.plot_posterior(idata_no_boat.predictions, combine_dims={'obs_idx'}, ax=ax[0])
az.plot_posterior(idata_boat.predictions, combine_dims={'obs_idx'}, ax=ax[1])
az.plot_posterior(idata.posterior_predictive, combine_dims={'obs_idx'}, ax=ax[2])
ax[0].set_title('Nobody gets a boat')
ax[1].set_title('Everyone gets a boat')
ax[2].set_title('Observed')
plt.show()
It would be more direct to contrast the two outputs by plotting the difference idata_boat - idata_no_boat:
az.plot_posterior(idata_boat.predictions - idata_no_boat.predictions, combine_dims={'obs_idx'})
So across the whole dataset, holding other factors equal, having a boat raises the chance of survival by about 88 percentage points.
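Under the hood, that difference plot is just summarizing draw-by-draw differences of simulated outcomes. A toy sketch with made-up posterior predictive draws (not the model above) shows the idea:

```python
import numpy as np

rng = np.random.default_rng(0)
n_draws, n_obs = 1000, 500

# Made-up survival probabilities under the two counterfactual scenarios
p_boat, p_no_boat = 0.9, 0.1
sim_boat = rng.binomial(1, p_boat, size=(n_draws, n_obs))
sim_no_boat = rng.binomial(1, p_no_boat, size=(n_draws, n_obs))

# Averaging the draw-by-draw differences recovers the effect (~0.8 here),
# and the spread of per-draw means carries the posterior uncertainty
effect = (sim_boat - sim_no_boat).mean()
print(f"average effect of a boat: {effect:.2f}")
```

This is the appeal of the simulation approach: the same subtract-and-summarize recipe works for any model, with no link-function algebra required.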