One Trillion Edgescourse/DBMS/class/bigdata/giraph.pdf · to Apache Giraph ‣Code available as...

Post on 17-Aug-2020

6 views 0 download

Transcript of One Trillion Edgescourse/DBMS/class/bigdata/giraph.pdf · to Apache Giraph ‣Code available as...

Indian Institute of Science Bangalore, India

भारतीय विज्ञान संस्थान

बंगलौर, भारत

One Trillion Edges :

Graph Processing at Facebook Scale A v e r y C h i n g , S e r g e y E d u n o v , M a j a K a b i l j o ,

D i o n y s i o s L o g o t h e t i s , S a m b h a v i M u t h u k r i s h n a n

F a c e b o o k

P r e s e n t e d b y : S w a p n i l G a n d h i

2 1 s t N o v e m b e r 2 0 1 8

2

Konigsberg* Bridge Problem

Its negative resolution by Leonhard Euler in 1736 laid the foundations of graph theory.

* Located in Kingdom of Prussia (Now in Russia)

Graphs are common Web & Social Networks

‣ Web graph, Citation Networks, Twitter, Facebook, Internet

Knowledge networks & relationships ‣ Google’s Knowledge Graph, CMU’s NELL

Cybersecurity ‣ Telecom call logs, financial transactions, Malware

Internet of Things ‣ Transport, Power, Water networks

Bioinformatics ‣ Gene sequencing, Gene expression networks

3

Graph Algorithms

Traversals: Paths & flows between different parts of the graph

‣ Breadth First Search, Shortest path, Minimum Spanning Tree, Eulerian paths, Max-Cut

Clustering: Closeness between sets of vertices ‣ Community detection & Evolution, Connected

components, K-means clustering, Max Independent Set

Centrality: Relative importance of vertices ‣ PageRank, Betweenness Centrality

4

Graphs are Central to Analytics

Raw

Wikipedia

< / > < / > < / > XML

Hyperlinks PageRank Top 20 Pages

Title PR Text

Table

Title Body Topic Model

(LDA) Word Topics

Word Topic

Editor Graph

Community

Detection

User

Community

User Com.

Term-Doc

Graph

Discussion

Table

User Disc.

Community

Topic

Topic Com.

But Graphs can be challenging Shared memory algorithms don’t scale!

Do not fit naturally to Hadoop/MapReduce ‣ Multiple MR jobs (iterative MR)

‣ Topology & Data written to HDFS each time

‣ Tuple, rather than graph-centric, abstraction

Lot of work on parallel graph libraries for HPC ‣ Boost Graph Library, Graph500

‣ Storage & compute are (loosely) coupled, not fault tolerant

‣ But everyone does not have a supercomputer! • If in-case you do own a supercomputer, stick with HPC

algorithms 6

PageRank using MR

7 MapReduce : https://static.googleusercontent.com/media/research.google.com/en//archive/mapreduce-

osdi04.pdf

PageRank using MR MR will run for multiple iterations (Typically 30)

Mapper will ‣ Initially, load adjacency list and initialize default PR ‣ Subsequent iterations will load adjacency list and

new PR ‣ Emit two types of messages from Map

Reducer will ‣ Reconstruct the adjacency list for each vertex ‣ Update the PageRank values for the vertex based

on neighbour’s PR messages ‣ Write adjacency list and new PR values to HDFS, to

be used by next Map iteration

8 SQL v/s MapReduce : http://www.science.smith.edu/dftwiki/images/6/6a/ComparisonOfApproachesToLargeScaleDataAnalysis.p

df

Two Birds

9

Half-caf Double Expresso

Less Data movement over Network

Fault Tolerance

Credits : Google Images

One Stone

10

Pregel

Credits : Google Images

It’s a word play on the English proverb : “Kill two birds with one stone”

Pregel To overcome these challenges, Google came up with Pregel

11

Valiant’s BSP

12

Superstep 1

Superstep 2

P1 P2 P3 P4

Computation

Communication

Barrier Synchronization

Computation

“Often expensive and should be used as sparingly as possible”

Vertex State Machine

13

In superstep 0, every vertex is in the active state.

A vertex deactivates itself by voting to halt.

It can be reactivated by receiving an (external) message.

Algorithm termination is based on every vertex voting to halt.

Vertex Centric Programming

Vertex Centric Programming Model ‣ Logic written from perspective on a single vertex.

‣ Executed on all vertices.

Vertices know about ‣ Their own value(s) ‣ Their outgoing edges

14

15

6 6 2 6

6 6 6 6

6 6 6 6

3 6 2 1 Superstep 0

Superstep 1

Superstep 2

Superstep 3

Active

Voted

to Halt

Message

Finding Largest Value in a Graph using Pregel

Worker

Edges

Advantages

Makes distributed programming easy ‣ No locks, semaphores, race conditions ‣ Separates computing from communication phase

Vertex-level parallelization ‣ Bulk message passing for efficiency

Stateful (in-memory) ‣ Only messages & checkpoints hit disk

16

Lifecycle of a Pregel Program

17 Apache Giraph, Claudio Martella, Hadoop Summit, Amsterdam, April

2014

Applications

18

SSSP class ShortestPathVertex: public Vertex<int, int, int> {

void Compute(MessageIterator* msgs) { int mindist = IsSource(vertex_id()) ? 0 : INF; for ( ; !msgs->Done(); msgs->Next())

mindist = min(mindist, msgs->Value());

if (mindist < GetValue()) {

*MutableValue() = mindist; OutEdgeIterator iter = GetOutEdgeIterator();

for ( ; !iter.Done(); iter.Next())

SendMessageTo(iter.Target(),

mindist + iter.GetValue());

}

VoteToHalt();

}

};

19

In the 0th superstep, only source vertex will

update its value

SSSP (1/6)

20

A

B

C

D

E

H

F

G

1

2 4

1 1

3

2

5 2

2

Input Graph

Worker 1

Worker 2

Worker 3

Worker 4

SSSP (2/6)

21

0

1

2 4

1 1

3

2

5 2

2

Superstep 0

Worker 1

Worker 2

Worker 3

Worker 4

SSSP (3/6)

22

0

1

2

1

2 4

1 1

3

2

5 2

2

Superstep 1

Worker 1

Worker 2

Worker 3

Worker 4

SSSP (4/6)

23

0

1

2

3

4

6

3

1

2 4

1 1

3

2

5 2

2

Superstep 2

Worker 1

Worker 2

Worker 3

Worker 4

SSSP (5/6)

24

0

1

2

3

4

4

3

6

1

2 4

1 1

3

2

5 2

2

Superstep 3

Worker 1

Worker 2

Worker 3

Worker 4

SSSP (6/6)

25

0

1

2

3

4

4

3

6

1

2 4

1 1

3

2

5 2

2

Worker 1

Worker 2

Worker 3

Worker 4

Algorithm has converged

Apache Giraph

26

Platform Improvements (1/2)

Efficient Memory Management (MM) ‣ Vertex and Edge data stored using serialized byte array

‣ Better MM -> Less GC

Support for Multi-Threading ‣ Maximized resource utilization

‣ Linear speed-up for CPU bound applications like K-Means Clustering

27

Platform Improvements (2/2)

Flexible IO Format ‣ Reduces Pre-processing

‣ Allows Vertex and Edge data to be loaded from different sources

Sharded Aggregator ‣ Aggregator responsibilities are balanced across workers

‣ Different Aggregators can be assigned to different workers.

28

29

Refer Class-room Discussion

Beyond Pregel

Master Compute ‣ Allows centralized execution of computation

‣ Refer Class-room Discussion

Worker Phases ‣ Special methods which by-pass Pregel Model, but add

ease of usability

‣ Applicability is application specific

Computation Composability ‣ Decouples Vertex and Computation

‣ Existing Computation implementation can be decoupled for multiple applications

30

Superstep Splitting

Master runs same “Message Heavy” Superstep for fixed number of iterations

For an iteration: ‣ Vertex computation invoked if vertex passes hash

function H ‣ Message sent to destination only if destination passes

hash function H’

Applicable to computation where messages are not “aggregatable”. ‣ If they can be aggregated (commutative and associative)

then stick with Combiners

Example : Friends-of-Friends Computation

31

Key Take-aways

Usability, Performance and scalability improvement to Apache Giraph ‣ Code available as open-source to try out !

Memoir detailing Facebook’s experience of using Giraph for production applications

Headline Grabber :

“Scales to a Trillion Edge graph”

32