Priors of Great Potential - How You Can Add Fairness Constraints to Models Using Priors by Vincent D. Warmerdam & Matthijs Brouns

I’ll try to respond to some of the many topics here.

I am not a stats person but I had always pictured estimation as always trying to capture reality as best as possible.

Models, by definition, are different from reality. I can come up with many time-series models that don’t reflect reality at all. You can approximate the seasonal effect of ice-cream sales with a Taylor series, even though a Taylor series has nothing to do with how ice cream is actually sold. A model doesn’t have to reflect reality in order to be useful.
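To make the ice-cream point concrete, here is a minimal sketch. The "true" seasonal curve, the numbers in it, and all function names are made up for illustration; nothing here comes from real sales data. It shows a truncated Taylor polynomial tracking a seasonal pattern closely even though the polynomial knows nothing about ice cream.

```python
import math

def seasonal_sales(month):
    """Hypothetical 'true' seasonal effect: peak in month 6, trough in winter."""
    return 100 + 40 * math.cos(2 * math.pi * (month - 6) / 12)

def taylor_cos(x, terms=6):
    """Truncated Taylor series of cos(x) around 0, using `terms` terms."""
    return sum((-1) ** k * x ** (2 * k) / math.factorial(2 * k) for k in range(terms))

def taylor_sales(month, terms=6):
    """Polynomial stand-in for the seasonal effect (an assumption, not the talk's model)."""
    x = 2 * math.pi * (month - 6) / 12
    return 100 + 40 * taylor_cos(x, terms)

# Largest gap between the "true" curve and the polynomial over one year.
max_error = max(abs(seasonal_sales(m) - taylor_sales(m)) for m in range(1, 13))
print(f"max absolute error over 12 months: {max_error:.4f}")
```

The polynomial is useful as a predictor inside this range, but it would be a poor description of why sales go up in summer, which is the distinction I’m after.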

but I think this use of priors is only for when we have an algorithm that is making decisions for users and not so much for informing users to make decisions?

It depends on who the user is. If the user is an analyst who wants to learn from the model’s coefficients, that is a different situation from one where the user is on the receiving end of the model’s predictions. In the talk we’re indeed more concerned with the latter.

I can imagine circumstances where constraining for fairness in estimation could hide it in the data?

Could you give an example? Before you can apply these debiasing tricks you need to have data on sensitive attributes. That suggests that the act of finding constraints is also the act of making bias less hidden. Also: I’m actually more concerned with bias hidden in the model. If the bias remains in the data, that’s to some extent “fine” as long as we can guarantee it’s not in the model.
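As a rough illustration of why you need the sensitive attribute before you can say anything about bias in the model, here is a small sketch that compares positive-prediction rates across groups. The predictions, the group labels, and the helper names are all hypothetical, and this is just one possible check (a demographic-parity style gap), not the method from the talk.

```python
def positive_rate(predictions, groups, group):
    """Share of positive predictions among rows belonging to `group`."""
    selected = [p for p, g in zip(predictions, groups) if g == group]
    return sum(selected) / len(selected)

def demographic_parity_gap(predictions, groups):
    """Difference between the highest and lowest positive rate across groups."""
    rates = {g: positive_rate(predictions, groups, g) for g in set(groups)}
    return max(rates.values()) - min(rates.values()), rates

# Made-up model outputs (1 = positive decision) and a made-up sensitive attribute.
predictions = [1, 1, 1, 0, 1, 0, 0, 0]
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]

gap, rates = demographic_parity_gap(predictions, groups)
print(rates, gap)
```

Without the `groups` column there is nothing to compute here, which is why collecting the sensitive attribute tends to make the bias more visible, not less.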

Would it be possible to get your model to include an interesting predictor, like investment in different fairness initiatives, highlight, rank etc. the areas of most unfairness and then give us advice on the best bang-for-buck interventions against all those, for example? Or is there another good use case/mode I’m ignoring?

What you’re suggesting here sounds like a proper comparative study. I understand that projects like fairlearn intend to investigate this too, but most of the studies that I hear about seem to be academic. I’m personally more interested in use-cases adopted by industry.

As far as mitigation techniques go, Matthijs and I have both spoken about this, if you’re interested in more background material. There’s a pretty wide array of techniques we’ve open sourced in scikit-lego that work differently from what we propose here.
