SmartDedup: Optimizing Deduplication for Resource ......Virtualized Infrastructures, Systems, &...

24
Virtualized Infrastructures, Systems, & Applications SmartDedup: Optimizing Deduplication for Resource-constrained Devices Qirui Yang Runyu Jin Ming Zhao [email protected], [email protected], [email protected] Arizona State University http://visa.lab.asu.edu

Transcript of SmartDedup: Optimizing Deduplication for Resource ......Virtualized Infrastructures, Systems, &...

  • Virtualized Infrastructures, Systems, & Applications

    SmartDedup: Optimizing Deduplication for Resource-constrained Devices

    Qirui Yang Runyu Jin Ming [email protected], [email protected], [email protected]

    Arizona State University

    http://visa.lab.asu.edu

  • Resource Management on Edge and IoT

    Data-intensive workloadso Multitude of sensorso Data-rich applicationso Multitasking systems

    Limited on-device storageo Limited I/O performanceo Limited capacityo Limited endurance

    2

    • • •

    • • •

  • Deduplication to the Rescue?

    Well-known benefitso Eliminate redundant I/Os à improve performanceo Reduce flash writes à improve enduranceo Remove redundant data à improve utilization

    Widely used in the cloudo Backup storage, primary storage, cache storage, …

    But o Is there enough data duplication in device workloads?

    o How to exploit it using limited resources on the device?

    3SmartDedup

  • Smartphone Trace Analysis

    Real-world device workloads are indeed I/O intensive

    All workloads have a good level of duplicates

    4SmartDedup

    ID Daily I/Os(GB) Daily write

    (GB)# of

    months Duplication

    ratio (%)

    1 12.1 1.4 6 21.9

    2 9.1 2.1 3 45.3

    3 5.9 1 2.5 23.2

    4 16.5 2.4 5 47.5

    5 12.2 2.8 3 41.6

    6 10.6 2.4 2.1 28.3

    File system level traces collected from users from different countries

    Trace: http://visa.lab.asu.edu/traces

  • More In-depth Analysis

    5

    Top 3 duplicate contributors for the traces

    Trace1Trace2Trace3

    Trace4Trace5Trace6

    Five categories:

    Resource files (res)

    Database files (db)

    Executables (exe)

    Temporary files (tmp)

    Multimedia (media)

  • More In-depth Analysis

    Duplicate sources differ by devices

    80% of duplicates are from files that are not entirely duplicate

    System-level deduplication is necessary

    6

    Top 3 duplicate contributors for the traces

    Trace1Trace2Trace3

    Trace4Trace5Trace6

  • Challenges to Device Deduplication

    Limited memoryo Deduplication relies on the in-memory fingerprint store for quickly

    detecting duplicates

    Limited storage performance and enduranceo Deduplication incurs additional metadata operations

    Limited power and energy capacityo Fingerprinting is CPU intensive and draws quite a bit of power

    7SmartDedup

  • SmartDedup—A Smart Deduplication Solution for Smart Devices

    Cohesively designed two-level fingerprint storeso On-disk store complements the small in-memory store

    Synergistically integrated hybrid deduplicationo Out-of-line deduplicates data skipped by in-line

    Dynamically adapted deduplication o According to resource availability

    and workload characteristics

    8SmartDedup

  • In-memory Fingerprint Store

    Store only important fingerprints

    Organized by a prefix treeo Index groups of

    fingerprints — small memory footprint

    o Support the two-level fingerprint store

    9SmartDedup

    In-memoryFingerprint Store

    First 6 bits0 1 2 63…

    Second 6 bits

    0 1 2 63…Second 6 bits

    0 2 63…

    leaf

    prefix=0 prefix=129

    FPa

    FPb

    PBNa

    PBNbFPg PBNg

    In-memoryfingerprint group

    FPm

    FPn

    PBNm

    PBNn…

    FPq PBNq

    In-memoryfingerprint group

    leaf

    1

  • On-disk Fingerprint Store

    Maintain fingerprints that are not in the in-memory store

    Share the same index with in-memory storeo Save memoryo Facilitate the fingerprint

    migration

    10SmartDedup

    On-disk Fingerprint Store

    On-diskfingerprint groupPBNx

    FPh

    FPi

    PBNh

    PBNi…

    PBNyFPm

    FPn

    PBNm

    PBNn…

    FPs PBNs

    In-memoryFingerprint Store

    First 6 bits0 1 2 63…

    Second 6 bits

    0 1 2 63…Second 6 bits

    0 2 63…

    leaf

    prefix=0 prefix=129

    FPa

    FPb

    PBNa

    PBNbFPg PBNg

    In-memoryfingerprint group

    FPm

    FPn

    PBNm

    PBNn…

    FPq PBNq

    In-memoryfingerprint group

    PBNx

    FPk PBNk

    leaf

    1

  • Fingerprints Migration

    Fingerprints migrate between memory and disk on demand

    In-memory store keeps the most recently used fingerprints

    Fingerprints evicted by groups to save I/Os

    11SmartDedup

    On-disk Fingerprint Store

    On-diskfingerprint groupPBNx

    FPh

    FPi

    PBNh

    PBNi…

    PBNyFPm

    FPn

    PBNm

    PBNn…

    FPs PBNs

    In-memoryFingerprint Store

    First 6 bits0 1 2 63…

    Second 6 bits

    0 1 2 63…Second 6 bits

    0 2 63…

    leaf

    prefix=0 prefix=129

    In-memoryfingerprint group

    FPn

    PBNm

    PBNn…

    FPq PBNq

    FPm

    In-memoryfingerprint group

    PBNx

    FPk PBNk

    leaf

    1

    FPb

    PBNa

    PBNb…

    FPt PBNt

    FPa

  • Fingerprints Migration

    Fingerprints migrate between memory and disk on demand

    In-memory store keeps the most recently used fingerprints

    Fingerprints evicted by groups to save I/Os

    12SmartDedup

    On-disk Fingerprint Store

    On-diskfingerprint groupPBNx

    FPh

    FPi

    PBNh

    PBNi…

    PBNyFPm

    FPn

    PBNm

    PBNn…

    FPs PBNs

    In-memoryFingerprint Store

    First 6 bits0 1 2 63…

    Second 6 bits

    0 1 2 63…Second 6 bits

    0 2 63…

    leaf

    prefix=0 prefix=129

    In-memoryfingerprint group

    FPn

    PBNm

    PBNn…

    FPq PBNq

    FPm

    In-memoryfingerprint group

    PBNx

    Free Space

    FPk PBNk

    leaf

    1

    FPb

    PBNa

    PBNb…

    FPt PBNt

    FPa

  • Fingerprints Migration

    Fingerprints migrate between memory and disk on demand

    In-memory store keeps the most recently used fingerprints

    Fingerprints evicted by groups to save I/Os

    13SmartDedup

    On-disk Fingerprint Store

    On-diskfingerprint groupPBNx

    FPh

    FPi

    PBNh

    PBNi…

    PBNyFPm

    FPn

    PBNm

    PBNn…

    FPs PBNs

    In-memoryFingerprint Store

    First 6 bits0 1 2 63…

    Second 6 bits

    0 1 2 63…Second 6 bits

    0 2 63…

    leaf

    prefix=0 prefix=129

    In-memoryfingerprint group

    FPn

    PBNm

    PBNn…

    FPq PBNq

    FPm

    In-memoryfingerprint group

    PBNx

    Free SpaceFPk PBNk

    leaf

    1

    FPb

    PBNa

    PBNb…

    FPt PBNt

    FPa

  • Hybrid Deduplication

    In-line Deduplicationo Work in the I/O patho Immediately eliminate duplicate writes o Limited by the available resources on the device

    Out-of-line Deduplicationo Work in backgroundo Slowly and thoroughly eliminate duplicate data

    o Limited improvement on performance and endurance

    14SmartDedup

  • Shared Fingerprint Index

    Write Path

    15SmartDedup

    write

    Skipped Buffer

    In-memory FingerprintStore

    Grp #1

    Grp #2

    On-disk FingerprintStore

    Grp #1

    Grp #2

    …3

    1 Fingerprinting2

    Search in-memory store

    Miss; pass it to out-of-line

    4Search on-disk fingerprint store

    4Search on-disk store

    5Promote one fingerprint to in-memory store

    If in-memory store full, evict a group of fingerprints to disk

    6

  • Read Path

    Native read path cannot utilize duplicate data in the page cacheo Page cache is indexed by logical block numbers (LBNs)o Cannot avoid reads to different LBNs if the same data is already in the page

    cache

    16

    ReadLBNx

    ReadLBNy

    DATAxPBNx

    DATAxLBNx

    DATAxLBNy

    DiskPage CacheMiss

    Miss

    PBNx

    LBNx LBNyDuplicate Read

  • Optimized Read Path

    Page Cache Indexo Map from PBNs to the corresponding pages in page cacheo Use this index to find data for reads with different LBNs

    17SmartDedup

    ReadLBNx

    ReadLBNy

    DATAxPBNx

    DATAxLBNx

    DiskPage CacheMiss

    PBN-based IndexPBNx->DATAx

    PBNx

    LBNx LBNySave Read

  • Adaptive Deduplication

    Based on resource availabilityo Keep CPU and disk utilization under 100%o Proportional to available battery lifeo Completely disable deduplication in low battery state

    Based on the current duplication levelo Gradually reduce the rate if the duplication level is droppingo Quickly increase the rate if the duplication level is growing

    18SmartDedup

  • Evaluation

    19

    Prototypeso EXT4 and F2FS (4KB fixed-size chunking)

    Testing deviceso Nexus 5X and Raspberry Pi 3

    Benchmarkso FIO—intensive I/O benchmarko DEDISbench—

    workloads from real-world device imageso Real-world device traces

    Baselineso Native EXT4 and F2FSo Dmdedup: block-level deduplication, flexible metadata backends [OLS 14’]o CAFTL: deduplication for resource-constrained FTL layer [FAST 11’]

    Nexus 5X Raspberry Pi 3

    CPU Qualcomm Snapdragon 808

    Broadcom BCM2837

    RAM 2GB 1GB

    Storage 32GB eMMC 16GB SDHC UHS-1

    OS Android Nougat Raspbian Stretch Lite

    Kernel Linux 3.10 Linux 4.4

    File System

    EXT4 F2FS

  • FIO (on Nexus)

    20SmartDedup

    FIO Write FIO Read

    0 20 40 60 80

    100

    0 25 50 75 100Thr

    ough

    put (

    MB/

    s)

    Percentage of Duplicates

    EXT4SmartDedup

    DmdedupCAFTL

    0 30 60 90

    0 25 50 75 100Thro

    ughp

    ut (M

    B/s)

    Percentage of Duplicates

    200 300

    E.g., SmartDedup achieves 16.9% write speedup and 38.2% read speedup vs EXT4 with 25% of duplicates

  • FIO (Resource Usage)

    21

    Small CPU overhead (3.3%) vs. EXT4

    Save battery (49.2%) vs. F2FS

    Small memory footprint: 3.5MB

    CPU Load Battery usage

    0 200 400 600 800

    1000 1200 1400

    Mix 1 Mix 2Ba

    ttery

    Us

    age(

    W *

    s) 0 5

    10 15 20 25 30 35 40

    Mix 1 Mix 2

    CPU

    Load

    (%) EXT4

    EXT4 (SmartDedup)F2FS

    F2FS (SmartDedup)

    I/O(GB)

    Read/write ratio

    Duplication ratio (%)

    Mix 1 4 1 25

    Mix 2 6 2 25

  • Trace Replay (on Nexus)

    22

    0%

    20%

    40%

    60%

    80%

    Segment 1 Segment 2 Segment 3

    Deduplication ratioWrite speedupStorage savingWrite reduction

    ID Trace src

    Write(GB)

    Duplication ratio (%)

    Read/write ratio

    1 4 17.2 75.8 1.5

    2 6 12.4 47.9 2.2

    3 2 9.1 26.4 6.8

    SmartDedup achieveso Up to 51.1% write speedup

    o Up to 70.9% write reduction

  • Adaptive Deduplication (on Nexus)

    23

    Adapting to duplication level of current workload

    0

    200

    400

    600

    800

    0 40 80 120 160

    Po

    we

    r C

    on

    sum

    ptio

    n (

    mW

    )

    Time (s)

    EXT4SmartDedup (basic)

    SmartDedup (adaptive)

    SmartDedup (adaptive)o Saves 14% power

    overheado With only 8% loss

    in deduplication ratio

  • Conclusions

    Deduplication is important to smart devices—performance, utilization, endurance

    SmartDedup achieves significant performance and endurance improvement with low resource cost

    Trace: http://visa.lab.asu.edu/traces

    24SmartDedup

    Acknowledgementso Colleagues @ the ASU VISA Lab o National Science Foundation

    Welcome to SmartDedup’sposter tonight 6:30pm