Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments...

29
1/28 Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization in black-box explanation Ulrich A¨ ıvodji, Hiromi Arai, Olivier Fortineau, ebastien Gambs, Satoshi Hara, Alain Tapp UQ ` AM

Transcript of Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments...

Page 1: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

1/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Fairwashing in Machine LearningThe risk of rationalization in black-box explanation

Ulrich Aıvodji, Hiromi Arai, Olivier Fortineau,Sebastien Gambs, Satoshi Hara, Alain Tapp

UQAM

Page 2: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

2/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Motivations

ML models are becoming ubiquitous

High stakes decision-making systems: medical diagnosis,criminal justice, financeDemand for the design of an ethically-aligned AI

Europe: GDRP Right to an explanationMontreal: Declaration de Montreal pour un developpementresponsable de l’IA

Interpretability by designData → decision tree

Black-box explanation a.k.a. post-hoc explanationDNN → decision tree

This work: We show that a dishonest ML models’ producercan perform fairwashing

Given the false perception that a ML model complies with agiven ethical requirement

Case study: fairness as the ethical requirement to “fairwash”

Page 3: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

3/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Motivations

Objective

Raise awareness of fairwashing in machine learning: the risk thatan unfair ML model can be explained in such a way that theunderlying decisions seem fairer than they actually were

Page 4: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

3/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Motivations

How?

Show that one can systematically found a fair interpretable modelto rationalize decisions of an unfair black-box model.

Page 5: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

4/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

1 Background

2 Problem formulation

3 Fairwashing

4 Experiments

5 Conclusion & Perspectives

Page 6: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

5/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Metrics

Fairness: demographic parity

|P(y = 1|s = 1)− P(y = 1|s = 0)|.

Fidelity

fidelity(c) =1

|X |∑x∈X

I(c(x) = b(x)).

Page 7: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

6/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Rule list

A rule list d = (dp, δp, q0,K ) of length K ≥ 0 is a (K + 1)−tupleconsisting of K distinct association rules rk = pk → qk , wherepk ∈ dp is the antecedent of the association rule and qk ∈ δp itscorresponding consequent, followed by a default prediction q0.

Example of rule list for salary prediction

IF occupation:white-collar THEN income:≥ 50k

ELSE IF occupation:professional THEN income:≥ 50k

ELSE IF education:bachelors THEN income:≥ 50k

ELSE income:< 50k

Page 8: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

7/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Learning optimal rule lists

CORELS (Angelino et al., 2017)

Input: n categorical attribute + binary labels

Output: optimal rule list

Supervised learning algorithm

Represent the search space as a n-level trie

Objective function: R(d , x , y) = misc(d , x , y) + λK

Select the rule list that minimize R(d , x , y)

Use an efficient branch-and-bound algorithm to prune the trie

Page 9: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

8/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Enumerating rule lists

Model Enumeration (Satoshi Hara & Masakazu Ishihata, 2018)

Enumerate rule lists in a descending order of the objective functionby calculating successively the optimal rule list using CORELS,and then constructing sub-problems excluding the solutionobtained.

Page 10: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

9/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization

Given a black-box model b, a set of instances X , and a sensitiveattribute s, find a global interpretable model cg = f (b,X ) derivedfrom b and X , using some process f (·, ·), such thatε(cg ,X , s) > ε(b,X , s), for some fairness metric ε(·, ·, ·).

Page 11: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

10/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Outcome rationalization

Given a black-box model b, an instance x , its neighborhood V(x),and a sensitive attribute s, find a local interpretable modelcl = f (b, x) derived from b and V(x), using some process f (·, ·),such that ε(cl ,V(x), s) > ε(b,V(x), s), for some fairness metricε(·, ·, ·).

Page 12: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

11/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Better call LaundryML

Explores the search space of rule lists with a modified versionof CORELS

New objective function:obj(d , x , y) = (1− β)misc(d , x , y) +βunfairness(d , x , y) +λK

Enumerate rule lists

Select fair rule lists that have higher fidelity

Page 13: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

12/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

LaundryML

Page 14: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

13/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Setup

Data & black-box models

Data: Adult Income (resp. ProPublica Recidivism)

Sensitive attribute: gender (resp. race)

Black-box models: random forests

Unfairness of the black-box models: 0.13 (resp. 0.17)

Search space: 28! (resp. 27!)

50 models enumerated per experiment

Evaluation metrics

Unfairness

Fidelity

Feature importance via FairMl

Page 15: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

14/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization – Unfairness and Fidelity

●●●●●●

●●● ●●● ●●●●●●●●●●●●●●● ●●●●●●●●●●●●●●●●●●●●●●

●●●●●●● ●●● ●

●●●●●●●●●●●●●●●●●●

●●●●●●●●●●●●●●●●●●●●●●

●●●●

●●●●

●●●●●

●●●●

●●● ●● ●●● ●●●●●●●●● ●●●●● ●●●●●

●●●

● ●●●●

●●●

●●●●●●●●●●●

●●●●●●●

●●●●●● ●● ●●●

●●

●●●●●●●

●●

●●●●●

Adult Income

λ=0.005

Adult Income

λ=0.01

ProPublica Recidivism

λ=0.005

ProPublica Recidivism

λ=0.01

0.0 0.1 0.2 0.0 0.1 0.2 0.0 0.1 0.2 0.0 0.1 0.2

0.6

0.7

0.8

0.9

Unfairness

Fid

elity

β

●●● 0

0.1

0.2

0.5

0.7

0.9

Figure: Model rationalization for Adult Income and ProPublicaRecidivism.

Best rationalization models

Adult Income: fidelity = 0.908, unfairness =0.058.

ProPublica Recidivism: fidelity = 0.748, unfairness=0.080.

Page 16: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

15/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization – Unfairness and Fidelity tradeoffs

Figure: Fidelity/fairness tradeoffs on Adult Income.

Page 17: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

16/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization – Unfairness and Fidelity tradeoffs

Figure: Fidelity/fairness tradeoffs on ProPublica Recidivism.

Page 18: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

17/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization – Feature importance

Figure: Feature importance Black-box vs Best rationalization model onAdult Income

Page 19: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

18/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Model rationalization – Feature importance

Figure: Feature importance Black-box vs Best rationalization model onProPublica Recidivism

Page 20: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

19/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Outcome rationalization

0.00

0.25

0.50

0.75

1.00

0.0 0.1 0.2 0.3

Unfairness

Pro

port

ion

of u

sers

black−box

β=0.1

β=0.3

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.05 0.06 0.07 0.08 0.09

Unfairness

Pro

port

ion

of u

sers

black−box

β=0.1

β=0.3

β=0.5

β=0.7

β=0.9

Figure: Outcome rationalization. Adult Income (left), ProPublicaRecidivism (right).

Page 21: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

20/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other fairness metrics (1/3)

0.00

0.25

0.50

0.75

1.00

0.05 0.10 0.15

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.9

Figure: Model rationalization. Adult Income, Random forest, OverallAccuracy Equality.

Page 22: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

21/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other fairness metrics (2/3)

0.00

0.25

0.50

0.75

1.00

0.0 0.1 0.2 0.3 0.4

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75 1.00

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

Figure: Model rationalization. Adult Income, Random forest, ConditionalProcedure Accuracy.

Page 23: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

22/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other fairness metrics (3/3)

0.00

0.25

0.50

0.75

1.00

0.00 0.05 0.10 0.15

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

Figure: Model rationalization. Adult Income, Random forest,Demographic parity.

Page 24: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

23/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other black-box models (1/3)

0.00

0.25

0.50

0.75

1.00

0.00 0.05 0.10

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.0 0.2 0.4 0.6 0.8

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

Figure: Model rationalization. Adult Income, SVM, Demographic parity.

Page 25: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

24/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other black-box models (2/3)

0.00

0.25

0.50

0.75

1.00

0.00 0.05 0.10 0.15

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

Figure: Model rationalization. Adult Income, XGBOOST, Demographicparity.

Page 26: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

25/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Generalization to other black-box models (3/3)

0.00

0.25

0.50

0.75

1.00

0.00 0.05 0.10 0.15 0.20

Unfairness

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

0.00

0.25

0.50

0.75

1.00

0.00 0.25 0.50 0.75

Fidelity

Pro

port

ion

of m

odel

s

β=0.0

β=0.1

β=0.2

β=0.5

β=0.7

β=0.9

Figure: Model rationalization. Adult Income, MLP, Demographic parity.

Page 27: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

26/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Conclusion

LaundryMl: black-box explanations can be used to rationalizeunfair decisions of a black-box model

Can we trust black-box explanations?

Page 28: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

27/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Perspectives

Detecting fairwashing

Study the root cause: robustness of explanations

Learn more

Our work: Fairwashing: the risk of rationalization. ICML’19

Another approach: Pretending Fair Decisions via StealthilyBiased Sampling. arXiv:1901.08291, 2019

Blog post on post rationalization: Interpretability andPost-Rationalization

Page 29: Fairwashing in Machine Learning€¦ · Background Problem formulation Fairwashing Experiments Conclusion & Perspectives Fairwashing in Machine Learning The risk of rationalization

28/28

Background Problem formulation Fairwashing Experiments Conclusion & Perspectives

Thank you!