Explainability of PyMC predictions: which features and which direction

I’ve developed a PyMC model that gives fairly accurate probability predictions from pm.sample_posterior_predictive(idata). However, I need a hint on explainability: how can I give the business actionable suggestions based on the features that go into the model?

To better communicate the question, I ran the same process on the Titanic survivor data, and az.plot_trace() shows a few important features whose posteriors are centered to the left and right of 0. I conclude that these two features (shown here: gender_classo and boato) are important to the predictions. How, and with what function, can I give the business actual number ranges, e.g. “when boato is x, that passenger is x% more likely to survive, all other features being equal”? I’ve tried finding the mean and std of these plots, and the results seem plausible — am I on the right track? Do I need the log of the mean?
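In case it helps to make that attempt concrete, here is a minimal sketch of pulling those summaries with ArviZ, assuming idata is the InferenceData returned by pm.sample and that the coefficients are stored under the feature names shown in the trace plot:

```python
import arviz as az

# Posterior mean, sd, and HDI for the two coefficients of interest
# (names are placeholders taken from the trace plot in the question).
az.summary(idata, var_names=["gender_classo", "boato"], kind="stats")
```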

It sounds like you want to generate posterior predictive samples that are conditional on new, hand-selected values of your predictors (e.g., boato). The easiest route is probably to wrap your data in a pm.Data container, then pm.observe() the new values (e.g., the particular values of boato you want to present), and then generate posterior predictive samples, which are now conditional on the newly supplied data. Here is a notebook that might help.
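A minimal sketch of that workflow, using a toy logistic regression rather than the original model; this version swaps the hand-selected boato values in with pm.set_data (rather than pm.observe) before calling pm.sample_posterior_predictive, and all names and data here are placeholders:

```python
import numpy as np
import pymc as pm

# Toy stand-in data; replace with the real training data.
rng = np.random.default_rng(0)
boato_train = rng.normal(size=200)
survived_train = rng.binomial(1, 1 / (1 + np.exp(-boato_train)))

with pm.Model() as model:
    # Wrap the predictor in a pm.Data container so it can be swapped later.
    boato = pm.Data("boato", boato_train)
    intercept = pm.Normal("intercept", 0, 2)
    beta_boato = pm.Normal("beta_boato", 0, 2)
    p = pm.Deterministic("p", pm.math.sigmoid(intercept + beta_boato * boato))
    # Let the likelihood's shape follow the data container, so predictions
    # can use a different number of rows than the training data.
    pm.Bernoulli("survived", p=p, observed=survived_train, shape=boato.shape)
    idata = pm.sample()

# Condition the predictions on hand-selected boato values.
with model:
    pm.set_data({"boato": np.array([-1.0, 0.0, 1.0, 2.0])})
    ppc = pm.sample_posterior_predictive(idata, var_names=["p"])

# Posterior predictive survival probability at each chosen boato value,
# averaged over chains and draws.
print(ppc.posterior_predictive["p"].mean(dim=("chain", "draw")).values)
```

Comparing the mean of p across the chosen boato values gives exactly the kind of “at boato = x, survival probability is y%, all else equal” statement the question asks for.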


You might also find arviz.plot_posterior useful: it has a ref_val argument and produces a plot closer to what you want for this task, whereas plot_trace is geared more toward convergence diagnostics.
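For example, assuming the coefficient is stored in idata under the name boato:

```python
import arviz as az

# Posterior of the boato coefficient with a reference line at 0;
# the plot annotates how much posterior mass lies on either side of it.
az.plot_posterior(idata, var_names=["boato"], ref_val=0)
```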