1

Contributions of CRIL to the ACI Daddi project

Presented by:

S. Benferhat

2

Hybridization, evaluation, imprecision and early intrusion detection

[Figure: system overview. Labels: current connection; System 1: learning and inference under incomplete information; decision tree; Bayesian networks (TAN, etc.); System 2 (expert knowledge); handling of true/false positives; handling of true/false negatives; Category(connection); evaluation.]

3

Part 1: Hybridization in early intrusion detection

Salem BENFERHAT and Karim TABIA

4

Starting point: results of the decision tree (AD) and NaiveBayes (NB)

          AD        NaiveBayes
PCC       92.06%    91.47%
Normal    99.50%    97.68%
DoS       97.24%    96.65%
R2L        0.52%     8.66%
U2R       13.60%    11.84%
Probe     77.92%    88.33%

(PCC: percentage of correct classification)

5

Conclusions from the AD and NB results on KDD'99

Neither technique is the best in all four attack categories.

Both techniques are weak at detecting rare attacks in general, and R2L and U2R attacks in particular.

Alternative: exploit their complementarity.

6

Intrusion detection problems on KDD'99

Problems related to the KDD'99 dataset

Problems related to the algorithms used

7

Research question

How can the complementarity of machine learning/classification techniques be exploited to improve the detection rate, and in particular the detection of R2L and U2R attacks?

8

A simple meta-classifier

[Figure: connection c is fed to NaiveBayes and to AD; their outputs NB(c) and AD(c) are fed to the meta-classifier NB-AD, which returns NB-AD(c), the category of the connection.]

NB(c) is the prediction returned by NaiveBayes for connection c. Likewise for AD and NB-AD.

9

A simple meta-classifier

. No problem if NB(c) and AD(c) predict the same class.

. If NB(c) and AD(c) predict two different classes:

- Use the PCC (with respect to each class): NB-AD(c) = ArgMax(pcc(AD, c), pcc(NB, c))

Or:

- Average the probability distributions associated with the classes: NB-AD(c) = ArgMax_{k=0..39} { 1/2 [prob(AD, c, k) + prob(NB, c, k)] }

Problem: little improvement!
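The averaging rule above can be sketched in a few lines of Python. This is only an illustration: the function name `combine_by_average` and the three-class toy distributions are assumptions (the slides use 40 classes, k = 0..39), and ties are broken in favor of the lowest class index, which the slides do not specify.

```python
def combine_by_average(prob_ad, prob_nb):
    """Return the class index k maximizing 1/2 * (prob_ad[k] + prob_nb[k])."""
    if len(prob_ad) != len(prob_nb):
        raise ValueError("both distributions must cover the same classes")
    averaged = [(p + q) / 2 for p, q in zip(prob_ad, prob_nb)]
    # max() returns the first maximal index, so ties go to the lowest k
    return max(range(len(averaged)), key=averaged.__getitem__)


# Hypothetical distributions over three classes for one connection c
prob_ad = [0.6, 0.3, 0.1]  # AD prefers class 0
prob_nb = [0.2, 0.7, 0.1]  # NaiveBayes prefers class 1
print(combine_by_average(prob_ad, prob_nb))
```

Here the two classifiers disagree, and the averaged distribution decides between them.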

10

Principle of a hybrid approach

Most classification errors are false negatives.

Handle the predictions of NaiveBayes and AD differently in order to treat:
- true/false negatives
- true/false positives

11

Principle of a hybrid approach

Handling of true/false positives (the class predicted by both classifiers is not the normal class)

Handling of true/false negatives (at least one of the classifiers predicted the normal class):

- confirm the predictions of the normal class, or

- use expert knowledge to classify the attack

12

Handling of true/false positives

If (AD(c) ≠ Normal) and (NaiveBayes(c) ≠ Normal) then
    If AD(c) = R2L or NaiveBayes(c) = R2L then
        Meta-NB-AD(c) := R2L;
    Else
        If AD(c) = U2R or NaiveBayes(c) = U2R then
            Meta-NB-AD(c) := U2R;
        Else
            Meta-NB-AD(c) := ArgMax_{k=0..39} { 1/2 [prob(AD, c, k) + prob(NB, c, k)] };
        End if;
    End if;
End if;
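A minimal Python sketch of this rule follows. It assumes, as stated on the slide describing the hybrid principle, that the positive case means neither classifier predicted Normal. The function name, the use of category-level probability dictionaries (instead of the 40 classes k = 0..39) and the `None` return value for the negative case are illustrative assumptions.

```python
def meta_nb_ad_positive(ad_pred, nb_pred, prob_ad, prob_nb):
    """Combine two non-Normal predictions, favoring the rare R2L and U2R categories.

    Returns None when at least one classifier predicted Normal, since that
    case belongs to the true/false negative procedure.
    """
    if ad_pred == "Normal" or nb_pred == "Normal":
        return None
    # R2L is tested before U2R, as in the rule
    for rare in ("R2L", "U2R"):
        if rare in (ad_pred, nb_pred):
            return rare
    # Otherwise fall back on the averaged probability distributions
    return max(prob_ad, key=lambda k: (prob_ad[k] + prob_nb.get(k, 0.0)) / 2)


# Hypothetical category-level distributions for one connection
p_ad = {"DoS": 0.6, "Probe": 0.4}
p_nb = {"DoS": 0.3, "Probe": 0.7}
print(meta_nb_ad_positive("DoS", "Probe", p_ad, p_nb))
print(meta_nb_ad_positive("DoS", "R2L", p_ad, p_nb))
```

The priority given to R2L and U2R reflects the observation that these rare attacks are the ones both classifiers miss most often.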

13

Principles of the handling of true/false negatives

Confirming or correcting predictions of the normal class requires:

• the behavioral approach, to distinguish true negatives from false negatives

• a mechanism based on "expert knowledge" that identifies the attack category of connections recognized as false negatives

14

General scheme for the handling of true/false negatives

[Figure: flowchart. Labels: NaiveBayes(c) = Normal; AD(c) = Normal; procedure for distinguishing true and false negatives (behavioral approach, expert knowledge); true/false negative?; true negative? (yes/no); Category(c) := Normal; false alarm? (yes/no); Category(c) := Normal; identification of the attack category of the false negative.]

15

Distinguishing true/false negatives, step 1: behavioral approach

Model the normal connections found in the training data.

Define a similarity measure to assess how normal a connection is (similarity to the model of normal connections).

Check whether the connections recognized as abnormal are in fact false alarms.

16

Modeling normal connections

• Numeric attributes are modeled by two quantities: the mean and the standard deviation.

• Boolean attributes are modeled by the respective frequencies of the values 0 and 1.

• Symbolic attributes are modeled by the frequency of each value.
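The three modeling rules above can be sketched as follows. The function name `build_normal_model`, the dictionary layout and the toy connections are assumptions made for illustration; the use of the population standard deviation (`pstdev`) is also an assumption, since the slides do not say which estimator was used.

```python
from collections import Counter
from statistics import mean, pstdev


def build_normal_model(connections, numeric, discrete):
    """Summarize normal connections: (mean, std) per numeric attribute,
    value frequencies per Boolean or symbolic attribute."""
    model = {}
    for name in numeric:
        values = [c[name] for c in connections]
        model[name] = ("numeric", mean(values), pstdev(values))
    for name in discrete:
        counts = Counter(c[name] for c in connections)
        total = sum(counts.values())
        model[name] = ("discrete", {v: n / total for v, n in counts.items()})
    return model


# Three hypothetical normal connections (not taken from KDD'99)
normal = [
    {"duration": 0, "logged_in": 1, "service": "http"},
    {"duration": 2, "logged_in": 1, "service": "smtp"},
    {"duration": 4, "logged_in": 0, "service": "http"},
]
model = build_normal_model(normal, ["duration"], ["logged_in", "service"])
print(model["duration"])
print(model["logged_in"])
```

Boolean attributes are treated as symbolic attributes with two values, so a single frequency table covers both cases.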

17

Normality vector

Numeric attributes Mean Standard deviation

duration 216.66 1359.22

src_bytes 1157.056 34226.301

dst_bytes 3384.668 37578.391

wrong_fragment 0 0

urgent 0 0.01

hot 0.045 0.858

num_failed_logins 0 0.021

num_compromised 0.029 4.047

num_root 0.056 4.53

num_file_creations 0.005 0.203

num_shells 0 0.021

num_access_files 0.005 0.081

num_outbound_cmds 0 0

count 8.163 17.712

18

srv_count 10.936 21.804

serror_rate 0.002 0.028

srv_serror_rate 0.002 0.027

rerror_rate 0.056 0.229

srv_rerror_rate 0.056 0.228

same_srv_rate 0.986 0.092

diff_srv_rate 0.018 0.117

srv_diff_host_rate 0.134 0.278

dst_host_count 148.514 103.396

dst_host_srv_count 202.066 86.913

dst_host_same_srv_rate 0.845 0.305

dst_host_diff_srv_rate 0.056 0.18

dst_host_same_src_port_rate 0.134 0.281

dst_host_srv_diff_host_rate 0.024 0.05

dst_host_serror_rate 0.002 0.029

dst_host_srv_serror_rate 0.001 0.016

dst_host_rerror_rate 0.058 0.225

dst_host_srv_rerror_rate 0.056 0.219

19

Boolean attributes 0 1

land 100.00% 0.00%

logged_in 28.10% 71.90%

root_shell 99.98% 0.02%

su_attempted 99.99% 0.01%

is_host_login 100.00% 0.00%

is_guest_login 99.62% 0.38%

20

Distance between a connection and the model of normal connections

• If ai is continuous: the further ai deviates from the mean, the less normal the connection is (with respect to attribute i).

• If ai is discrete or symbolic: the less frequent the value of ai, the less normal the connection is (with respect to attribute i). In particular, any new value gives the maximal distance (1).

21

Distance between a connection and the model of normal connections

Decide whether the connection c, represented by the attribute vector a, is normal or abnormal:

If Dist(c, reference) < α then
    c is normal;
Else
    c is abnormal;
End if

With: Dist(c, reference) = g(Dist(ai, âi)), i = 0..40

. g = weighted mean (the option used in the experiments)

. g = max: a connection is abnormal as soon as a new value appears (detection of new attacks)
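The decision rule can be sketched as below. The per-attribute distances are assumptions: the slides only state that the distance grows with the deviation from the mean (numeric case) and with the rarity of the value (discrete case), with 1 for unseen values. Here the numeric distance is the deviation scaled by three standard deviations and capped at 1, and the discrete distance is 1 minus the frequency. The `service` frequencies are hypothetical; the `duration` and `logged_in` figures come from the normality vector.

```python
def attribute_distance(value, stats):
    """Distance in [0, 1] between one attribute value and its normal model."""
    if stats[0] == "numeric":
        _, mu, sigma = stats
        if sigma == 0:
            return 0.0 if value == mu else 1.0
        # Assumption: scale by 3 standard deviations and cap at 1
        return min(1.0, abs(value - mu) / (3 * sigma))
    # Discrete or symbolic: rare values are far, unseen values get distance 1
    return 1.0 - stats[1].get(value, 0.0)


def dist(connection, model, g="mean", weights=None):
    """Aggregate per-attribute distances with g = weighted mean or g = max."""
    d = {name: attribute_distance(connection[name], s) for name, s in model.items()}
    if g == "max":
        return max(d.values())
    w = weights or {name: 1.0 for name in d}
    return sum(w[n] * d[n] for n in d) / sum(w[n] for n in d)


def is_normal(connection, model, alpha, **options):
    return dist(connection, model, **options) < alpha


model = {
    "duration": ("numeric", 216.66, 1359.22),
    "logged_in": ("discrete", {0: 0.281, 1: 0.719}),
    "service": ("discrete", {"http": 0.6, "smtp": 0.4}),  # hypothetical
}
typical = {"duration": 216.66, "logged_in": 1, "service": "http"}
unseen = {"duration": 216.66, "logged_in": 1, "service": "new_service"}
print(dist(typical, model), dist(unseen, model, g="max"))
```

With g = max a single unseen value is enough to reach the maximal distance, which is what makes that variant suited to detecting new attacks.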

22



[Figure: flowchart for the handling of true/false negatives, repeated from slide 14.]



23

Handling of true/false negatives, step 2: introducing expert knowledge

Is the connection declared abnormal by the behavioral approach an attack (in particular R2L or U2R)?

In KDD'99, certain content-related attributes are the most informative about these two types of attack.

24

                     Normal              R2L                 U2R
                     Mean     Std dev    Mean     Std dev    Mean     Std dev
hot                  0.045    0.858      7.39     11.947     1.404    1.537
num_failed_logins    0        0.021      0.05     0.255      0.019    0.139
num_compromised      0.029    4.047      0.068    1.392      1.212    1.673
num_root             0.056    4.53       0.099    2.04       0.788    2.304
num_file_creations   0.005    0.203      0.031    0.652      0.788    1.21
num_shells           0        0.021      0.004    0.073      0.135    0.444
num_access_files     0.005    0.081      0.009    0.103      0.019    0.139
num_outbound_cmds    0        0          0        0          0        0

                     Normal              R2L                 U2R
                     0        1          0        1          0        1
logged_in            28.10%   71.90%     7.73%    92.27%     11.54%   88.46%
root_shell           99.98%   0.02%      99.47%   0.53%      50.00%   50.00%
su_attempted         99.99%   0.01%      100.00%  0.00%      100.00%  0.00%
is_host_login        100.00%  0.00%      100.00%  0.00%      100.00%  0.00%
is_guest_login       99.62%   0.38%      72.11%   27.89%     100.00%  0.00%

25

Compared with normal connections, R2L connections are characterized in particular by high means of num_failed_logins, hot, num_compromised, num_root and num_file_creations, and

by a large proportion of connections with the Boolean attributes logged_in and is_guest_login set to 1.

U2R connections are distinguished by high means of num_failed_logins, num_compromised, num_root, num_file_creations, hot, num_shells and num_access_files,

and by a value of 1 for the attributes logged_in and root_shell.

26

Rule for confirming the normal class

If Dist(c, reference) > threshold then
    If (num_failed_logins = 0) and (num_compromised = 0) and (num_root = 0) and (num_file_creations = 0) and (num_shells = 0) and (num_access_files = 0) and (num_outbound_cmds = 0) and (logged_in = 0) and (root_shell = 0) and (hot = 0) and (su_attempted = 0) and (is_host_login = 0) and (is_guest_login = 0) then
        c is a true negative;
    Else
        c is a false negative;
    End if;
End if;
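As a sketch, the confirmation rule amounts to checking that every content attribute is zero once the distance exceeds the threshold. The function name and the `None` return value (for connections whose distance does not exceed the threshold, a case the rule leaves open) are assumptions for illustration.

```python
CONTENT_ATTRIBUTES = (
    "num_failed_logins", "num_compromised", "num_root", "num_file_creations",
    "num_shells", "num_access_files", "num_outbound_cmds", "logged_in",
    "root_shell", "hot", "su_attempted", "is_host_login", "is_guest_login",
)


def confirm_normal(connection, distance, threshold):
    """Return 'true negative' or 'false negative' for a connection whose
    distance to the normal model exceeds the threshold, None otherwise."""
    if distance <= threshold:
        return None  # the rule only applies above the threshold
    if all(connection[name] == 0 for name in CONTENT_ATTRIBUTES):
        return "true negative"
    return "false negative"


quiet = {name: 0 for name in CONTENT_ATTRIBUTES}
suspicious = dict(quiet, logged_in=1, root_shell=1)
print(confirm_normal(quiet, distance=0.8, threshold=0.5))
print(confirm_normal(suspicious, distance=0.8, threshold=0.5))
```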

27



[Figure: flowchart for the handling of true/false negatives, repeated from slide 14.]



28

Handling of true/false negatives

Distinguishing DoS/Probe attacks on one side from R2L/U2R attacks on the other

If ((count > 100) or (duration <= 1)) then
    Category(a) ∊ {DoS, Probe};
Else
    Category(a) ∊ {R2L, U2R};
End if;

29


Distinguishing DoS attacks from Probe attacks

If ((count > 100) and (srv_count > 50)) or ((duration <= 1) and (dst_host_same_srv_rate > 0.718)) then
    Category(a) := DoS;
Else
    If ((count > 100) and ((srv_count <= 50) or (dst_host_diff_srv_rate > 0.59))) then
        Category(a) := Probe;
    End if;
End if;

30


Distinguishing R2L attacks from U2R attacks

If (Category(a) ∊ {U2R, R2L}) then
    If (((root_shell = 1) or (num_root > 0)) and (is_guest_login = 0)) then
        Category(a) := U2R;
    Else
        Category(a) := R2L;
    End if;
End if;
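The three expert rules (DoS/Probe versus R2L/U2R, then DoS versus Probe, then R2L versus U2R) can be chained into one function, sketched below with the thresholds given on the slides. The function name and the `None` result are assumptions: the DoS/Probe rule leaves some connections unassigned, and the sketch reports that case rather than guessing a category.

```python
def expert_category(c):
    """Assign an attack category to a false negative using the expert rules."""
    if c["count"] > 100 or c["duration"] <= 1:
        # Candidate categories: DoS or Probe
        if (c["count"] > 100 and c["srv_count"] > 50) or (
            c["duration"] <= 1 and c["dst_host_same_srv_rate"] > 0.718
        ):
            return "DoS"
        if c["count"] > 100 and (
            c["srv_count"] <= 50 or c["dst_host_diff_srv_rate"] > 0.59
        ):
            return "Probe"
        return None  # no rule fires for this connection
    # Candidate categories: R2L or U2R
    if (c["root_shell"] == 1 or c["num_root"] > 0) and c["is_guest_login"] == 0:
        return "U2R"
    return "R2L"


# Hypothetical connection used as a template for the examples
base = {
    "count": 5, "srv_count": 5, "duration": 50,
    "dst_host_same_srv_rate": 0.1, "dst_host_diff_srv_rate": 0.1,
    "root_shell": 0, "num_root": 0, "is_guest_login": 0,
}
print(expert_category(dict(base, count=200, srv_count=100)))
print(expert_category(dict(base, root_shell=1)))
```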

31

Results

→         Normal    DoS       R2L       U2R       Probe
Normal    97.70%    0.79%     1.06%     0.18%     0.27%
DoS       2.48%     96.85%    0.23%     0.05%     0.39%
R2L       64.86%    0.11%     29.92%    5.11%     0.01%
U2R       69.30%    0.00%     9.21%     20.61%    0.88%
Probe     19.40%    1.75%     3.58%     0.10%     75.18%

PCC = 93.18%

- Depends on the parameters set in the behavioral approach
- KDD is inconsistent

32

Part 2: Decision trees under uncertain observations

N. BEN AMOR, S. BENFERHAT, Z. ELOUADI

33

Introduction

Standard decision trees are appropriate when the attribute values of the objects to classify are precisely defined.

Problems in handling uncertainty in attributes:

Partial values

Missing values

34

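Example

[Figure: decision tree. Root: flag. flag = SF leads to protocol-type: udp gives N (path1); tcp leads to service: http gives P (path2), domain-u gives N (path3), private gives P (path4). flag = RSTO gives D (path5). flag = REJ leads to service: http gives N (path6), domain-u gives P (path7), private gives D (path8).]

protocol-type    service        flag
tcp 2            http 1         SF 1
udp 1            domain-u 4     REJ 3
                 private 1      RSTO 1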

35

Min/max operators

1. Compute the degree of each path: apply the minimum operator to the attribute values of each path.

2. Choose the most plausible path in the tree: apply the maximum operator to the path degrees.

3. The instance belongs to the leaf corresponding to this path.
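The min/max procedure can be sketched as follows on the eight paths of the example tree. The possibility degrees in `DEGREES` are hypothetical values in [0, 1], chosen only to be compatible with the path orderings stated later in the slides; they are not the values of the original table. The function and variable names are likewise illustrative.

```python
# Each path: (leaf class, list of (attribute, value) tests along the path)
PATHS = {
    "path1": ("N", [("flag", "SF"), ("protocol-type", "udp")]),
    "path2": ("P", [("flag", "SF"), ("protocol-type", "tcp"), ("service", "http")]),
    "path3": ("N", [("flag", "SF"), ("protocol-type", "tcp"), ("service", "domain-u")]),
    "path4": ("P", [("flag", "SF"), ("protocol-type", "tcp"), ("service", "private")]),
    "path5": ("D", [("flag", "RSTO")]),
    "path6": ("N", [("flag", "REJ"), ("service", "http")]),
    "path7": ("P", [("flag", "REJ"), ("service", "domain-u")]),
    "path8": ("D", [("flag", "REJ"), ("service", "private")]),
}

# Hypothetical possibility degrees of the uncertain instance
DEGREES = {
    "flag": {"SF": 1.0, "REJ": 1.0, "RSTO": 0.2},
    "protocol-type": {"tcp": 1.0, "udp": 0.5},
    "service": {"private": 1.0, "http": 0.75, "domain-u": 0.25},
}


def classify_min_max(paths, degrees):
    """Degree of a path = min over its tests; keep the leaves of the best paths."""
    path_degree = {
        name: min(degrees[attr][value] for attr, value in tests)
        for name, (_, tests) in paths.items()
    }
    best = max(path_degree.values())
    classes = {paths[name][0] for name, d in path_degree.items() if d == best}
    return classes, path_degree


classes, path_degree = classify_min_max(PATHS, DEGREES)
print(sorted(classes), path_degree)
```

With these degrees two paths share the maximal degree, so the max operator alone cannot separate their classes. The refinements presented next address this lack of selectivity.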

36

Example

[Figure: the same decision tree and table of values as in the previous example.]

37

Example

[Figure: the same decision tree, annotated with the values from the table.]

The instance belongs to P or D.

38

Different ways to classify objects with uncertain/missing attributes in decision trees

Min/max operators.

Min/leximax operators.

Leximin/leximax operators.

39

Min/leximax operators

A natural extension of the maximum operator.

Let (path1, …, pathn) be the set of different paths.

deg(pathi): the minimum of the possibility degrees of the attribute values in pathi.

With each class C, we associate a vector C = (C1, ..., Cn):

Ci = deg(pathi) if C is the leaf of pathi, and Ci = 0 otherwise.

The chosen class will be the min/leximax preferred class.
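A possible sketch of the min/leximax selection: each class vector is sorted in decreasing order and the sorted vectors are compared lexicographically, which is the usual definition of the leximax ordering. The path degrees below are hypothetical (they would normally come from the minimum operator), and the function names are illustrative.

```python
def leximax_key(vector):
    """Sorting in decreasing order turns leximax into a lexicographic comparison."""
    return sorted(vector, reverse=True)


def classify_min_leximax(path_degree, leaf_of):
    """Build one vector per class (0 for paths of other classes) and keep the
    leximax-preferred class(es)."""
    names = sorted(path_degree)
    classes = sorted(set(leaf_of.values()))
    vectors = {
        c: [path_degree[p] if leaf_of[p] == c else 0 for p in names]
        for c in classes
    }
    best = max(leximax_key(v) for v in vectors.values())
    winners = [c for c in classes if leximax_key(vectors[c]) == best]
    return winners, vectors


# Hypothetical path degrees and the leaf labels of the example tree
path_degree = {"path1": 0.5, "path2": 0.75, "path3": 0.25, "path4": 1.0,
               "path5": 0.2, "path6": 0.75, "path7": 0.25, "path8": 1.0}
leaf_of = {"path1": "N", "path2": "P", "path3": "N", "path4": "P",
           "path5": "D", "path6": "N", "path7": "P", "path8": "D"}

winners, vectors = classify_min_leximax(path_degree, leaf_of)
print(winners)
```

Where the plain maximum would leave P and D tied on their best path, leximax moves on to the second-best path of each class to break the tie.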

40

Example

[Figure: the same decision tree, annotated with the values from the table.]

N = (2, 4, 1)
D = (3, 1)
P = (1, 1, 4)

D and P are leximax preferred to N.
P is leximax preferred to D.

P is the class of the new connection.

41

Leximin/leximax operators (1)

Two steps:

Using the minimum operator to select a first set of classes.

Using the maximum or leximax operator to refine this set.

The minimum operator is not selective.

Replace the minimum operator by the leximin one and maintain the leximax in the second step.

42

Leximin/leximax operators (2)

Establish a total pre-order of all paths using leximin operator based on the gain ratio.

Select a first set of candidate classes corresponding to classes labeling best paths in the total pre-order of paths.

If this set contains more than one class, refine it by selecting its leximax-preferred class(es) using the leximin-leximax order.

43

Establishing a total leximin pre-order (1)

All paths should be described by the same attributes already defined in the training set.

Since paths are pruned, the idea is to assign degree 1 to the missing values.

Applying the leximin requires a pre-order between the different attributes.

The gain ratio criterion is used for this (it gives the highest discriminative power).

44

Establishing a total leximin pre-order (2)

Sort in decreasing order the path sets attached to each of the children of the root.

For each path set containing more than one path, repeat the same process recursively. The process ends when every induced path set contains only one path.

For equally ranked path sets, group the first (resp. second, third, etc.) elements of each of them. Elements belonging to the same group are equally preferred. Then re-order the paths, considering that the group of first elements is preferred to the group of second elements, and so on.
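The recursive procedure can be sketched as below. The tree encoding (a leaf is a path name, an inner node is a pair of attribute and children) and the function name `rank_paths` are assumptions, and the degrees are hypothetical values chosen to be compatible with the orderings of the worked example. The tree itself already encodes the attribute order (assumed to come from the gain ratio), so the sketch simply follows it from the root down.

```python
from itertools import zip_longest

# Leaf: path name. Inner node: (attribute, {value: subtree})
TREE = ("flag", {
    "SF": ("protocol-type", {
        "udp": "path1",
        "tcp": ("service", {"http": "path2", "domain-u": "path3", "private": "path4"}),
    }),
    "RSTO": "path5",
    "REJ": ("service", {"http": "path6", "domain-u": "path7", "private": "path8"}),
})

# Hypothetical possibility degrees of the instance to classify
DEGREES = {
    "flag": {"SF": 1.0, "REJ": 1.0, "RSTO": 0.2},
    "protocol-type": {"tcp": 1.0, "udp": 0.5},
    "service": {"private": 1.0, "http": 0.75, "domain-u": 0.25},
}


def rank_paths(tree, degrees):
    """Return groups of equally preferred paths, from best to worst."""
    if isinstance(tree, str):
        return [[tree]]
    attribute, children = tree
    # Rank each child's path set recursively, grouped by the child's degree
    by_degree = {}
    for value, subtree in children.items():
        by_degree.setdefault(degrees[attribute][value], []).append(
            rank_paths(subtree, degrees)
        )
    ranking = []
    for degree in sorted(by_degree, reverse=True):
        # Equally ranked path sets: merge their first, second, ... groups
        for groups in zip_longest(*by_degree[degree], fillvalue=[]):
            ranking.append(sorted(p for group in groups for p in group))
    return ranking


print(rank_paths(TREE, DEGREES))
```

With these degrees the sketch returns path4 and path8 first, then path2 and path6, then path3 and path7, then path1, then path5.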

45

Example (1)

[Figure: the same decision tree, annotated with the values from the table.]

path5 is the worst path set.

46

Example (2)

[Figure: the same decision tree, annotated with the values from the table.]

47

Example (3)

[Figure: the same decision tree, annotated with the values from the table.]

path4 >leximin path2 >leximin path3 >leximin path1

From the two path sets, path1 is the worst one.

48

Example (4)

[Figure: the same decision tree, annotated with the values from the table.]

path8 >leximin path6 >leximin path7

49

Example (5)

[Figure: the same decision tree, annotated with the values from the table.]

path5 is the worst path set.

path4 >leximin path2 >leximin path3 >leximin path1

path8 >leximin path6 >leximin path7

path8 =leximin path4 >leximin path2 =leximin path6 >leximin path3 =leximin path7 >leximin path1 >leximin path5

50

Select best class(es)

From the total pre-order, we will select a first set of candidate classes (C) corresponding to classes labeling best paths.

If C contains more than one class we will refine it by selecting its leximax-preferred class(es) using the leximin-leximax order.
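The selection step can be sketched as follows, taking the total pre-order of the worked example as input. Each class is described by the sorted ranks of its paths (0 for the best group) and the vectors are compared lexicographically. Padding shorter vectors with a rank worse than any real one is an assumption, mirroring the 0 entries of the leximax vectors. The function name is illustrative.

```python
def select_class(ranking, leaf_of):
    """Pick the class(es) of the best paths, refined by a leximax comparison
    of the ranks of all paths of each candidate class."""
    rank = {p: i for i, group in enumerate(ranking) for p in group}
    candidates = {leaf_of[p] for p in ranking[0]}
    paths_of = {c: [p for p in leaf_of if leaf_of[p] == c] for c in candidates}
    width = max(len(paths) for paths in paths_of.values())
    worst = len(ranking)  # assumption: padding rank, worse than any real rank

    def key(c):
        ranks = sorted(rank[p] for p in paths_of[c])
        return ranks + [worst] * (width - len(ranks))

    best = min(key(c) for c in candidates)  # lower rank means more preferred
    return sorted(c for c in candidates if key(c) == best)


# Total pre-order and leaf labels taken from the worked example
ranking = [["path4", "path8"], ["path2", "path6"], ["path3", "path7"],
           ["path1"], ["path5"]]
leaf_of = {"path1": "N", "path2": "P", "path3": "N", "path4": "P",
           "path5": "D", "path6": "N", "path7": "P", "path8": "D"}
print(select_class(ranking, leaf_of))
```

The candidate set is {P, D} because path4 and path8 share the top rank. P wins because its remaining paths rank higher than path5, the only other path labeled D.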

51

Example

C = {P, D}

P = (path4, path7, path2) >leximin-leximax D = (path8, path5)

since path4 =leximin path8 and path7 >leximin path5.

The connection will be classified as a probing attack.

We get a more precise class.

path8 =leximin path4 >leximin path2 =leximin path6 >leximin path3 =leximin path7 >leximin path1 >leximin path5
