
1

Pinpointing the Subsystems Responsible for Performance Deviations In a Load Test

Haroon Malik, Bram Adams & Ahmed E. Hassan
Software Analysis and Intelligence Lab (SAIL)

Queen’s University, Kingston, Canada

2

Large-scale systems need to satisfy performance constraints

3

TIME SPENT

ANALYSTS SPEND CONSIDERABLE TIME DEALING WITH PERFORMANCE BUGS

4

LOAD TESTING

5

CURRENT PRACTICE
1. Environment Setup
2. Load test execution
3. Load test analysis
4. Report generation


7

2. LOAD TEST EXECUTION

Components: Monitoring Tool, Load Generator-1, Load Generator-2, System Performance Repository


10

3. LOAD TEST ANALYSIS (MANUAL)

Load test verdict: PASS or FAIL

12

LARGE NUMBER OF PERFORMANCE COUNTERS

13

LIMITED TIME

14

LIMITED KNOWLEDGE

15

WE CAN HELP ANALYSTS:

Decide if a performance test passed or failed (CSMR 2010)

Identify the subsystems with performance violations (COMPSAC 2010)

Pinpoint the subsystems that are likely the cause of a performance violation (ISSRE 2010)

16

An automated methodology to PINPOINT the likely cause of performance deviations

17

METHODOLOGY STEPS
1. Data Preparation
2. Crafting Performance Signatures
3. Identifying Deviations
4. Pinpointing

18

1. DATA PREPARATION

PERFORMANCE COUNTERS ARE HIGHLY CORRELATED:
CPU, DISK (IOPS), NETWORK, MEMORY, TRANSACTIONS/SEC

19

2. CRAFTING PERFORMANCE SIGNATURES

Principal Component Analysis (PCA):
- Explains most of the counter data with minimal information loss
- Removes the noise in the counter data

Influential counters:
- Counter elimination: Norman cut-off criteria
- Counter ranking
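The PCA-based signature crafting above can be sketched in Python. The counter data is synthetic, and the 90% variance-explained cut-off and loading-based importance score are simple stand-ins for the paper's counter-elimination and ranking steps (which use the Norman cut-off criteria), not its exact procedure.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic performance-counter matrix: 200 observations x 5 counters.
# cpu drives tx_per_sec and disk_iops, so those counters are highly correlated.
cpu = rng.normal(50, 10, 200)
counters = np.column_stack([
    cpu,                                # CPU utilization
    cpu * 0.8 + rng.normal(0, 2, 200),  # transactions/sec (tracks CPU)
    cpu * 0.5 + rng.normal(0, 2, 200),  # disk IOPS (tracks CPU)
    rng.normal(30, 5, 200),             # memory (independent)
    rng.normal(10, 1, 200),             # network (independent)
])
names = ["cpu", "tx_per_sec", "disk_iops", "memory", "network"]

# PCA via eigen-decomposition of the correlation matrix of the
# standardized counters.
z = (counters - counters.mean(0)) / counters.std(0)
corr = np.corrcoef(z, rowvar=False)
eigvals, eigvecs = np.linalg.eigh(corr)
order = np.argsort(eigvals)[::-1]       # principal components, largest first
eigvals, eigvecs = eigvals[order], eigvecs[:, order]

# Keep enough components to explain ~90% of the variance (an assumed
# cut-off), then score each counter by its variance-weighted squared
# loadings on the retained components.
explained = np.cumsum(eigvals) / eigvals.sum()
k = int(np.searchsorted(explained, 0.90)) + 1
importance = (eigvecs[:, :k] ** 2 * eigvals[:k]).sum(axis=1)

# The performance signature: counters ordered by their importance score.
signature = [names[i] for i in np.argsort(importance)[::-1]]
print("components kept:", k)
print("counters ordered by importance:", signature)
```

In this sketch the three CPU-driven counters load heavily on the first principal component, which is why PCA can summarize them with little information loss.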

3. IDENTIFYING DEVIATIONS

Signature counters: Commits/Sec, Writes/Sec, CPU Utilization, Database Cache % Hit

[Table: per-subsystem comparison of the base-line signature against Load Test-1; deviation values such as 0.41, 0, and 0.01 mark each subsystem as a match or a deviation]
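A minimal sketch of this deviation check, assuming a mean-shift score per signature counter and a 10% match threshold; both the score and the threshold are hypothetical simplifications of the paper's comparison, and only the counter names come from the slide.

```python
import numpy as np

# Signature counters for one subsystem (names taken from the slide).
signature = ["Commits/Sec", "Writes/Sec", "CPU Utilization", "Database Cache % Hit"]

rng = np.random.default_rng(2)
baseline = {c: rng.normal(100, 5, 60) for c in signature}
load_test = {c: rng.normal(100, 5, 60) for c in signature}
load_test["CPU Utilization"] = rng.normal(160, 5, 60)  # injected deviation

# Deviation score: relative shift of the load-test mean from the baseline
# mean. A counter "matches" the baseline when the shift stays under a 10%
# threshold (an assumed threshold, for illustration only).
THRESHOLD = 0.10
for counter in signature:
    b, t = baseline[counter].mean(), load_test[counter].mean()
    dev = abs(t - b) / b
    verdict = "match" if dev < THRESHOLD else "DEVIATION"
    print(f"{counter:22s} dev={dev:.2f} {verdict}")
```

Only the counter with the injected shift crosses the threshold, mirroring how a single deviating signature counter flags its subsystem.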

4. PINPOINTING

[Diagram: subsystems SUB-A, SUB-B, SUB-C, and SUB-D linked by edges labelled with average pair-wise correlations (0.8, 0.7, 0.9, 0.7, 0.5, 0.6)]

22

4. PINPOINTING

Average pair-wise correlation under load:

SUB  Load
A    0.75
B    0.77
C    0.80
D    0.77

23

4. PINPOINTING

Average pair-wise correlation, load test vs. baseline:

SUB  Load  Baseline  Dev   %
A    0.75  0.87      0.12  13.0
B    0.77  0.82      0.05  6.01
C    0.80  0.94      0.14  14.8
D    0.77  0.88      0.11  12.5

24

4. PINPOINTING

Subsystems ranked by deviation from the baseline (SUB-C is pinpointed):

SUB  Load  Baseline  Dev   %
C    0.80  0.94      0.14  14.8
A    0.75  0.87      0.12  13.0
D    0.77  0.88      0.11  12.5
B    0.77  0.82      0.05  6.01
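The pinpointing computation above can be sketched as follows. The subsystem counter data is synthetic, and taking the mean absolute upper-triangle of each subsystem's counter correlation matrix is an assumed reading of "average pair-wise correlation", not necessarily the paper's exact formula.

```python
import numpy as np

def avg_pairwise_correlation(counters: np.ndarray) -> float:
    """Average absolute pair-wise correlation among a subsystem's counters."""
    corr = np.corrcoef(counters, rowvar=False)
    iu = np.triu_indices_from(corr, k=1)  # upper triangle, no diagonal
    return float(np.abs(corr[iu]).mean())

def pinpoint(baseline: dict, load: dict) -> list:
    """Rank subsystems by how far their load-test correlation deviates
    from the baseline; the top entry is the pinpointed subsystem."""
    rows = []
    for sub in baseline:
        b = avg_pairwise_correlation(baseline[sub])
        t = avg_pairwise_correlation(load[sub])
        dev = abs(b - t)
        rows.append((sub, round(t, 2), round(b, 2), round(dev, 2),
                     round(100 * dev / b, 1)))
    return sorted(rows, key=lambda r: r[3], reverse=True)

# Synthetic data: four subsystems, 100 samples x 3 counters each.
rng = np.random.default_rng(1)

def correlated_counters(strength: float) -> np.ndarray:
    base = rng.normal(0, 1, 100)
    noise = rng.normal(0, 1 - strength + 0.05, (100, 3))
    return base[:, None] * strength + noise

baseline = {s: correlated_counters(0.9) for s in "ABCD"}
load = {s: correlated_counters(0.9) for s in "ABD"}
load["C"] = correlated_counters(0.3)  # C's counters decouple under load

ranking = pinpoint(baseline, load)
print(ranking)  # subsystem C surfaces at the top of the ranking
```

A drop in a subsystem's internal counter correlation under load is the signal the methodology exploits: healthy subsystems keep their counters moving together, while a deviating one decouples.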

25

CASE STUDY

26

DELL DVD STORE

Components of the test environment:
- Load generators
- Database server
- Web Server (A), Web Server (B), Web Server (C)
- Performance monitoring tool
- Performance logs

27

EXPERIMENT-1: 4X LOAD ON WEB-1

       Base  4-X   Dev   %
Web-1  0.87  0.72  0.15  17.1
Web-2  0.88  0.82  0.05  6.46
Web-3  0.89  0.83  0.06  7.03
DB     0.78  0.73  0.05  6.92

EXPERIMENT-2: CPU STRESS ON WEB-1

       Base  CPU   Dev   %
Web-1  0.87  0.69  0.18  21.1
Web-2  0.88  0.80  0.08  9.29
Web-3  0.89  0.80  0.09  10.6
DB     0.78  0.73  0.05  7.24

EXPERIMENT-3: CPU STRESS ON DB

       Base  CPU   Dev   %
Web-1  0.87  0.83  0.03  4.28
Web-2  0.88  0.83  0.04  5.04
Web-3  0.89  0.84  0.05  6.14
DB     0.78  0.78  0.08  10.4

EXPERIMENT-4: MEMORY STRESS ON WEB-2

       Base  MEM   Dev   %
Web-1  0.87  0.81  0.06   7.42
Web-2  0.88  0.75  0.13  14.9
Web-3  0.89  0.81  0.08   9.49
DB     0.78  0.70  0.087 11.0

28

ENTERPRISE APPLICATION

[Figure: counter importance (y-axis, 0.4–1.0) per performance counter (x-axis, 1–161) for Test-A (base-line) and Tests B, C, D, E]

29

WHY ARE TEST-D & E DIFFERENT?

30

WHY ARE TEST-D & E DIFFERENT?

PINPOINTED: among the 6 subsystems of each test, the web server is likely the cause of the performance deviation in both tests.

Signature counters of the web server notably deviated from the baseline: Packets Outbound Discarded, Packets Sent/Sec, and Message Queue Length.

Root causes: a network problem (no connectivity to the database) and a web server under stress.

31

LIMITATIONS

Our methodology can only point to *a* subsystem that is likely the cause of a performance deviation in a load test.

Our methodology cannot be generalized to other domains such as network traffic and security monitoring.

32