Background Problem formulation Fairwashing Experiments Conclusion & Perspectives
Fairwashing in Machine Learning
The risk of rationalization in black-box explanation

Ulrich Aïvodji, Hiromi Arai, Olivier Fortineau, Sébastien Gambs, Satoshi Hara, Alain Tapp
UQAM
Motivations
ML models are becoming ubiquitous
High-stakes decision-making systems: medical diagnosis, criminal justice, finance
Demand for the design of an ethically-aligned AI
Europe: GDPR right to an explanation
Montreal: Montréal Declaration for a Responsible Development of AI
Interpretability by design: data → decision tree
Black-box explanation, a.k.a. post-hoc explanation: DNN → decision tree
This work: we show that a dishonest ML model producer can perform fairwashing,
giving the false perception that an ML model complies with a given ethical requirement
Case study: fairness as the ethical requirement to “fairwash”
Motivations
Objective
Raise awareness of fairwashing in machine learning: the risk that an unfair ML model can be explained in such a way that its underlying decisions seem fairer than they actually are
Motivations
How?
Show that one can systematically find a fair interpretable model that rationalizes the decisions of an unfair black-box model.
1 Background
2 Problem formulation
3 Fairwashing
4 Experiments
5 Conclusion & Perspectives
Metrics
Fairness: demographic parity
|P(y = 1|s = 1)− P(y = 1|s = 0)|.
Fidelity
fidelity(c) = (1/|X|) Σ_{x∈X} 1[c(x) = b(x)].
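As a concrete illustration, both metrics can be computed in a few lines. This is a minimal sketch assuming binary NumPy arrays; the helper names `demographic_parity_gap` and `fidelity` are ours, not from the paper.

```python
import numpy as np

def demographic_parity_gap(y_pred, s):
    """Absolute gap |P(y=1 | s=1) - P(y=1 | s=0)| (hypothetical helper)."""
    return abs(y_pred[s == 1].mean() - y_pred[s == 0].mean())

def fidelity(c_pred, b_pred):
    """Fraction of instances where the explainer c agrees with the black box b."""
    return np.mean(c_pred == b_pred)

# Toy example: 4 instances, sensitive attribute s.
y_pred = np.array([1, 0, 1, 1])
s      = np.array([1, 1, 0, 0])
b_pred = np.array([1, 0, 0, 1])
print(demographic_parity_gap(y_pred, s))  # |0.5 - 1.0| = 0.5
print(fidelity(y_pred, b_pred))           # 3 of 4 agree = 0.75
```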
Rule list
A rule list d = (d_p, δ_p, q_0, K) of length K ≥ 0 is a (K + 1)-tuple consisting of K distinct association rules r_k = p_k → q_k, where p_k ∈ d_p is the antecedent of the association rule and q_k ∈ δ_p its corresponding consequent, followed by a default prediction q_0.
Example of rule list for salary prediction
IF occupation:white-collar THEN income:≥ 50k
ELSE IF occupation:professional THEN income:≥ 50k
ELSE IF education:bachelors THEN income:≥ 50k
ELSE income:< 50k
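The rule list above reads as an ordered if/elif cascade: rules are tried in order, and the first matching antecedent fires. A minimal sketch of rule-list prediction, with attribute names mirroring the slide's example (the `predict_income` helper is hypothetical):

```python
def predict_income(instance):
    # Ordered association rules (antecedent, consequent); order matters.
    rules = [
        (lambda x: x["occupation"] == "white-collar", ">=50k"),
        (lambda x: x["occupation"] == "professional", ">=50k"),
        (lambda x: x["education"] == "bachelors",     ">=50k"),
    ]
    for antecedent, consequent in rules:
        if antecedent(instance):
            return consequent
    return "<50k"  # default prediction q0

print(predict_income({"occupation": "clerical", "education": "bachelors"}))  # >=50k
print(predict_income({"occupation": "clerical", "education": "hs"}))         # <50k
```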
Learning optimal rule lists
CORELS (Angelino et al., 2017)
Input: n categorical attributes + binary labels
Output: optimal rule list
Supervised learning algorithm
Represents the search space as an n-level trie
Objective function: R(d, x, y) = misc(d, x, y) + λK
Selects the rule list that minimizes R(d, x, y)
Uses an efficient branch-and-bound algorithm to prune the trie
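The regularized objective can be illustrated directly: misclassification error plus a per-rule penalty λK. This sketch assumes 0/1 predictions and labels; `corels_objective` is an illustrative name, and CORELS's trie search itself is not reproduced here.

```python
def corels_objective(predictions, labels, num_rules, lam=0.005):
    """R(d, x, y) = misc(d, x, y) + lambda * K (illustrative helper)."""
    misc = sum(p != y for p, y in zip(predictions, labels)) / len(labels)
    return misc + lam * num_rules

# A 3-rule list that misclassifies 1 of 10 instances:
preds  = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1]
labels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 0]
print(corels_objective(preds, labels, num_rules=3, lam=0.005))  # ~0.115
```

Larger λ trades accuracy for shorter, more interpretable rule lists.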
Enumerating rule lists
Model Enumeration (Satoshi Hara & Masakazu Ishihata, 2018)
Enumerate rule lists in increasing order of the objective function by successively computing the optimal rule list using CORELS, then constructing sub-problems that exclude the solutions already obtained.
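Schematically, the enumeration reduces to solve, exclude, repeat. The sketch below replaces the branch-and-bound solver with a brute-force minimum over a small candidate set, so it only illustrates the ordering idea; `enumerate_rule_lists` and the candidate scores are invented for illustration.

```python
def enumerate_rule_lists(candidates, objective, k):
    """Return the k best candidates in increasing order of the objective."""
    remaining = set(candidates)
    out = []
    for _ in range(min(k, len(remaining))):
        best = min(remaining, key=objective)  # "solve" the current sub-problem
        out.append(best)
        remaining.remove(best)                # exclude the solution found
    return out

scores = {"d1": 0.12, "d2": 0.08, "d3": 0.15}
print(enumerate_rule_lists(scores, scores.get, k=2))  # ['d2', 'd1']
```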
Model rationalization
Given a black-box model b, a set of instances X, and a sensitive attribute s, find a global interpretable model c_g = f(b, X) derived from b and X, using some process f(·, ·), such that ε(c_g, X, s) > ε(b, X, s), for some fairness metric ε(·, ·, ·).
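A toy check of the rationalization condition, using demographic parity as ε. Note the sketch expresses ε as an unfairness gap, so the surrogate should score *lower* than the black box to look fairer; the `dp_gap` helper and the arrays are ours.

```python
import numpy as np

def dp_gap(pred, s):
    """Demographic-parity gap |P(y=1 | s=1) - P(y=1 | s=0)| (lower = fairer)."""
    return abs(pred[s == 1].mean() - pred[s == 0].mean())

s      = np.array([1, 1, 1, 0, 0, 0])
b_pred = np.array([1, 1, 1, 0, 0, 1])  # black box: gap |1 - 1/3| = 2/3
c_pred = np.array([1, 0, 1, 0, 1, 1])  # surrogate: gap |2/3 - 2/3| = 0
print(dp_gap(b_pred, s), dp_gap(c_pred, s))  # the surrogate looks fairer
```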
Outcome rationalization
Given a black-box model b, an instance x, its neighborhood V(x), and a sensitive attribute s, find a local interpretable model c_l = f(b, x) derived from b and V(x), using some process f(·, ·), such that ε(c_l, V(x), s) > ε(b, V(x), s), for some fairness metric ε(·, ·, ·).
Better call LaundryML
Explores the search space of rule lists with a modified version of CORELS
New objective function: obj(d, x, y) = (1 − β) misc(d, x, y) + β unfairness(d, x, y) + λK
Enumerates rule lists
Selects fair rule lists that have high fidelity
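The modified objective and the final selection step can be sketched as follows; the candidate tuples and the 0.10 unfairness budget are invented for illustration, not values from the paper.

```python
def laundryml_objective(misc, unfairness, num_rules, beta=0.5, lam=0.005):
    """obj(d, x, y) = (1 - beta) * misc + beta * unfairness + lambda * K."""
    return (1 - beta) * misc + beta * unfairness + lam * num_rules

# Hypothetical enumerated rule lists with their measured fidelity/unfairness:
candidates = [
    {"name": "d1", "fidelity": 0.91, "unfairness": 0.06},
    {"name": "d2", "fidelity": 0.95, "unfairness": 0.15},
    {"name": "d3", "fidelity": 0.88, "unfairness": 0.04},
]
# Keep rule lists below an unfairness budget, then pick the highest fidelity.
fair = [c for c in candidates if c["unfairness"] <= 0.10]
best = max(fair, key=lambda c: c["fidelity"])
print(best["name"])  # d1
```

Larger β pushes the search toward fairer-looking rule lists at the cost of fidelity to the black box.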
LaundryML
Setup
Data & black-box models
Data: Adult Income (resp. ProPublica Recidivism)
Sensitive attribute: gender (resp. race)
Black-box models: random forests
Unfairness of the black-box models: 0.13 (resp. 0.17)
Search space: 28! (resp. 27!)
50 models enumerated per experiment
Evaluation metrics
Unfairness
Fidelity
Feature importance via FairML
Model rationalization – Unfairness and Fidelity
Figure: Model rationalization for Adult Income and ProPublica Recidivism. [Scatter plots of fidelity (0.6–0.9) against unfairness (0.0–0.2) for the enumerated rule lists, one panel per dataset and λ ∈ {0.005, 0.01}, colored by β ∈ {0, 0.1, 0.2, 0.5, 0.7, 0.9}.]
Best rationalization models
Adult Income: fidelity = 0.908, unfairness = 0.058.
ProPublica Recidivism: fidelity = 0.748, unfairness = 0.080.
Model rationalization – Unfairness and Fidelity tradeoffs
Figure: Fidelity/fairness tradeoffs on Adult Income.
Model rationalization – Unfairness and Fidelity tradeoffs
Figure: Fidelity/fairness tradeoffs on ProPublica Recidivism.
Model rationalization – Feature importance
Figure: Feature importance, black-box vs. best rationalization model, on Adult Income.
Model rationalization – Feature importance
Figure: Feature importance, black-box vs. best rationalization model, on ProPublica Recidivism.
Outcome rationalization
Figure: Outcome rationalization. Adult Income (left), ProPublica Recidivism (right). [Cumulative plots of the proportion of users against unfairness, for the black box and β ∈ {0.1, 0.3, 0.5, 0.7, 0.9}.]
Generalization to other fairness metrics (1/3)
Figure: Model rationalization. Adult Income, Random forest, Overall Accuracy Equality. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Generalization to other fairness metrics (2/3)
Figure: Model rationalization. Adult Income, Random forest, Conditional Procedure Accuracy. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Generalization to other fairness metrics (3/3)
Figure: Model rationalization. Adult Income, Random forest, Demographic parity. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Generalization to other black-box models (1/3)
Figure: Model rationalization. Adult Income, SVM, Demographic parity. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Generalization to other black-box models (2/3)
Figure: Model rationalization. Adult Income, XGBOOST, Demographic parity. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Generalization to other black-box models (3/3)
Figure: Model rationalization. Adult Income, MLP, Demographic parity. [Cumulative plots of the proportion of models against unfairness (left) and fidelity (right), for β values from 0.0 to 0.9.]
Conclusion
LaundryML: black-box explanations can be used to rationalize unfair decisions of a black-box model
Can we trust black-box explanations?
Perspectives
Detecting fairwashing
Study the root cause: robustness of explanations
Learn more
Our work: Fairwashing: the risk of rationalization. ICML 2019.
Another approach: Pretending Fair Decisions via Stealthily Biased Sampling. arXiv:1901.08291, 2019.
Blog post on post-rationalization: Interpretability and Post-Rationalization.
Thank you!