Paris AWS User Group | 5th Sep 2018 Cassandra AWS Lambda · PARIS AWS User Group Les AWS User Group...

AWS Lambda and Cassandra

Paris AWS User Group | 5th Sep 2018

Lyuben Todorov, Director of Consulting, EMEA


PARIS AWS User Group

AWS User Groups let AWS users connect and exchange with one another to answer questions, share ideas, and learn all about new services and best practices.

2000+ users

http://urlz.fr/7kpV

Me

• Lyuben Todorov

• Consulting Director Instaclustr EMEA

• Univ. of Dundee

• Distributed Programming / OSS

Social Media: /in/lyubent

Talk Overview

Cassandra + λ Scale POC

• λ and C* (Cassandra) introduction

• Why use λ and Instaclustr’s managed service

• High Level Setup of λ and C* in Instaclustr

• Technical Challenges of using λ

• Lessons Learned

What is λ

• Serverless

• Pay for execution time (free tier: 1M requests and 400k GB-seconds per month)

• Auto-scale

• Always Available

[Diagram labels: Server, App, Database, Operation]
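As a rough illustration of how execution time is metered, the sketch below converts a workload into GB-seconds and compares it with the free-tier figures quoted on the slide. The workload numbers (128 MB, 200 ms, 1M requests) and the class name are invented for the example, and any rounding of billed duration is ignored.

```java
/** Rough GB-second arithmetic for a λ workload (illustrative only). */
final class FreeTierEstimate {
    // Free-tier figures quoted on the slide.
    static final long FREE_REQUESTS = 1_000_000L;
    static final double FREE_GB_SECONDS = 400_000.0;

    /** GB-seconds = memory in GB x duration in seconds x number of invocations. */
    static double gbSeconds(int memoryMb, long avgDurationMs, long invocations) {
        double memoryGb = memoryMb / 1024.0;
        double seconds = avgDurationMs / 1000.0;
        return memoryGb * seconds * invocations;
    }

    /** True when both the request count and the GB-seconds stay within the free tier. */
    static boolean withinFreeTier(int memoryMb, long avgDurationMs, long invocations) {
        return invocations <= FREE_REQUESTS
                && gbSeconds(memoryMb, avgDurationMs, invocations) <= FREE_GB_SECONDS;
    }

    public static void main(String[] args) {
        // Hypothetical workload: 128 MB function, 200 ms average, 1M invocations.
        System.out.printf("GB-seconds used: %.0f%n", gbSeconds(128, 200, 1_000_000L));
        System.out.println("Within free tier: " + withinFreeTier(128, 200, 1_000_000L));
    }
}
```

Because memory is a direct multiplier, the same workload at 512 MB and 1 s per request would consume 500,000 GB-seconds and exceed the free allowance.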

What is λ

• Serverless (no need to share)

[Diagram: one λ per Operation; labels: User Event, Container Creation, Container Teardown]

λ Use-cases

• Peaking applications (bursty load)

• Event driven applications

• Short Code Execution Times

What is C*

• Highly Available Distributed Database

• No SPOF (p2p architecture)

• Open Source

• Tunable Consistency

[Diagram: CAP triangle (Consistent, Available, Partition Tolerant)]
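Tunable consistency comes down to how many replicas must acknowledge a read or a write. The sketch below shows the standard quorum arithmetic: a quorum is floor(RF / 2) + 1 replicas, and a read is guaranteed to overlap the latest write when R + W > RF. The class and method names are illustrative and not part of any Cassandra API.

```java
/** Quorum arithmetic behind tunable consistency (illustrative only). */
final class ConsistencyMath {
    /** Number of replicas in a quorum for a given replication factor. */
    static int quorum(int replicationFactor) {
        return replicationFactor / 2 + 1;
    }

    /** True when every read set overlaps every write set (R + W > RF). */
    static boolean readsSeeLatestWrite(int readReplicas, int writeReplicas, int replicationFactor) {
        return readReplicas + writeReplicas > replicationFactor;
    }

    public static void main(String[] args) {
        int rf = 3;
        int q = quorum(rf);
        System.out.println("QUORUM at RF=3 needs " + q + " replicas");
        System.out.println("QUORUM read + QUORUM write overlap: " + readsSeeLatestWrite(q, q, rf));
        System.out.println("ONE read + ONE write overlap: " + readsSeeLatestWrite(1, 1, rf));
    }
}
```

Lower levels such as ONE trade that overlap guarantee for latency and availability, which is the "tunable" part.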

C* Client

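• Relevant to lambda:

  • Node discovery – the client learns about the other nodes from its contact point (cluster membership itself is tracked via gossip)

  • Create a λ per DC and use a DC-aware client

  • Query with LOCAL consistency levels (e.g. LOCAL_QUORUM)

  • Be careful with client-side timestamps (due to cold start)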

Instaclustr Hosted Service

• Simple

• Auto scaling service

• 24/7 Support

• Access to Analytics

• Dashboard for Monitoring

• Security Plugins

How to set up λ

• Connect λ to Backend

• Deploy and test web app

Create λ VPC

Create subnet for VPC

Provision C* Cluster

VPC Peering Request – C* to λ

Update Route Table

Create & Deploy λ

Create VPC

VPC Subnet

Add Some Instant Awesome

Pick Your Cloud

Choose Node Capacity & Type

Scalable backend

Peering λ and C*’s VPCs

Lambda’s VPC needs to be connected to Instaclustr’s Cassandra VPC via the Instaclustr console.

Route Tables

• Add rule for the API Gateway

• Add rule for Instaclustr VPC
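VPC peering requires that the two VPCs' CIDR blocks do not overlap, and the route added for the Instaclustr VPC uses that VPC's CIDR block as its destination. A minimal sketch of both checks in plain Java follows; the CIDR values and the address are placeholders, not the ones used in the talk.

```java
/** IPv4 CIDR block helper for checking peering and routing assumptions (illustrative only). */
final class Cidr {
    final long start; // first address in the block, as an unsigned 32-bit value
    final long end;   // last address in the block

    Cidr(String cidr) {
        String[] parts = cidr.split("/");
        int prefix = Integer.parseInt(parts[1]);
        long mask = prefix == 0 ? 0L : (0xFFFFFFFFL << (32 - prefix)) & 0xFFFFFFFFL;
        this.start = toLong(parts[0]) & mask;
        this.end = this.start | (~mask & 0xFFFFFFFFL);
    }

    /** Converts dotted-quad IPv4 text to its numeric value. */
    static long toLong(String ipv4) {
        long value = 0;
        for (String octet : ipv4.split("\\.")) {
            value = (value << 8) | Integer.parseInt(octet);
        }
        return value;
    }

    /** Two blocks overlap when their address ranges intersect. */
    boolean overlaps(Cidr other) {
        return start <= other.end && other.start <= end;
    }

    /** True when the address falls inside this block, i.e. a route to this block would match it. */
    boolean contains(String ipv4) {
        long ip = toLong(ipv4);
        return start <= ip && ip <= end;
    }

    public static void main(String[] args) {
        Cidr lambdaVpc = new Cidr("10.0.0.0/16");    // placeholder CIDR
        Cidr cassandraVpc = new Cidr("10.1.0.0/16"); // placeholder CIDR
        System.out.println("Safe to peer: " + !lambdaVpc.overlaps(cassandraVpc));
        System.out.println("Matches peering route: " + cassandraVpc.contains("10.1.13.77"));
    }
}
```

Running a check like this before provisioning avoids discovering an overlap only when the peering request is rejected.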


Create λ in AWS

Deploy λ

Architecture

The App

• Processes web requests

• POST used for inserting an event

• GET used for fetching an event

• Cassandra Table (Model):

    CREATE TABLE event (
        id uuid,
        source text,
        type text,
        recorded timestamp,
        PRIMARY KEY (id)
    );

The App - API Gateway

• Two resources added (POST and GET)

The App - API Gateway

• POST /event/ writes an event to C*:

    session.execute("INSERT INTO ic.event (id, source, type, recorded) " +
                    "VALUES (now(), '10.1.13.77', 'Auth', toTimestamp(now()))");

• GET /event/{id} retrieves an event from C* by id (here id is the UUID taken from the request path):

    session.execute("SELECT * FROM ic.event WHERE id = ?", id);

The App - Code

• Java Application

• Request processed as stream

• Output as JSON

    public void handler(InputStream inputStream, OutputStream outputStream, Context context) {
        // IMPL.
        // Pass request to either GET or POST depending on context.
    }
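One way the stream handler could route a request is sketched below. It assumes the API Gateway proxy integration, whose event JSON carries an httpMethod field. The regex extraction stands in for a real JSON parser, the Lambda Context parameter is omitted so the example runs on the JDK alone, the response shape is simplified, and handleGet and handlePost are hypothetical placeholders for the Cassandra calls.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Routes a request to GET or POST handling based on its httpMethod field (illustrative only). */
final class EventRouter {
    private static final Pattern METHOD =
            Pattern.compile("\"httpMethod\"\\s*:\\s*\"([A-Z]+)\"");

    /** Reads the request from the stream, routes it, and writes the JSON response. */
    static void handler(InputStream in, OutputStream out) throws IOException {
        String request = new String(readAll(in), StandardCharsets.UTF_8);
        out.write(route(request).getBytes(StandardCharsets.UTF_8));
    }

    /** Picks the handler from the HTTP method found in the request JSON. */
    static String route(String requestJson) {
        Matcher m = METHOD.matcher(requestJson);
        String method = m.find() ? m.group(1) : "";
        switch (method) {
            case "GET":  return handleGet(requestJson);
            case "POST": return handlePost(requestJson);
            default:     return "{\"error\":\"unsupported method\"}";
        }
    }

    // Hypothetical placeholders: the real versions would query or insert into C*.
    static String handleGet(String requestJson)  { return "{\"action\":\"fetch\"}"; }
    static String handlePost(String requestJson) { return "{\"action\":\"insert\"}"; }

    /** Drains the stream into a byte array (works on Java 8, which lacks readAllBytes). */
    private static byte[] readAll(InputStream in) throws IOException {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        byte[] chunk = new byte[4096];
        int n;
        while ((n = in.read(chunk)) != -1) {
            buffer.write(chunk, 0, n);
        }
        return buffer.toByteArray();
    }
}
```

Keeping the routing in a pure function such as route makes it testable without deploying the λ.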

Challenges

• Application Scalability

• λ Warmup Time

• Reducing Memory Usage

• Connection Pooling

• Dependency Management

• Execution Environment Limits

Scaling Requests

• Load balancer can distribute requests

• Adds Complexity

• What if a backend changes?

Scaling Requests with λ

• Lambda scales app automatically

• Re-deploy only 1 thing on app update

Scaling Requests with λ

• Configure concurrent execution

• Write good app code!

Function Warmup Time

• Cold start is when λ has to initialise resources before it can execute a function: container, NIC, and other resources.

• Containers are torn down after 15 min of inactivity, so the next request hits a cold start

• λ Function avoids cold-start if constantly running

Function Warmup Time

[Chart: average request response time (sec) and parallel requests (hundreds) against time (min), over 7 minutes]

Function Warmup Time

• Cheat – ping the λ every 5-10 min: create a scheduled event rule in AWS that runs every 10 min.

• Monitor container changes
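One way to monitor container changes is to tag each container with an ID created during class initialisation, which runs once per container. This is a sketch rather than the approach used in the talk: the class name, fields, and log format are all assumptions.

```java
import java.util.UUID;
import java.util.concurrent.atomic.AtomicLong;

/** Tags each container with an ID so cold starts show up in the logs (illustrative only). */
final class ContainerTracker {
    // Static state is initialised once per container, i.e. on a cold start.
    static final String CONTAINER_ID = UUID.randomUUID().toString();
    private static final AtomicLong INVOCATIONS = new AtomicLong();

    /** Call at the top of the handler; returns a line suitable for logging. */
    static String recordInvocation() {
        long n = INVOCATIONS.incrementAndGet();
        boolean coldStart = (n == 1); // first call served by this container
        return "container=" + CONTAINER_ID + " invocation=" + n + " coldStart=" + coldStart;
    }
}
```

Logging the result of ContainerTracker.recordInvocation() on every request shows a new container ID whenever a container was replaced, which indicates whether the scheduled ping is keeping the function warm.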

Reduce Memory Usage

• 512 MB by default

• Way too much for a simple C* client

• CPU is proportional to memory allocated to app

Connection Pool Management

• Creating connections is expensive

• Connection pooling allows reuse

• λ is stateless and asynchronous in nature

Connection Pool Management

• Store session state outside of handler function’s scope

• Variables outside of handler remain initialised across λ calls

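    import com.amazonaws.services.lambda.runtime.Context;
    import java.util.Map;

    // The class declaration was cut off on the slide; the name here is a placeholder.
    public class Handler {
        // Keep the client wrapper outside of the handleRequest function so that it
        // stays initialised across invocations served by the same container.
        private CassandraClient client = new CassandraClient();

        public String handleRequest(Map<String, Object> input, Context context) {
            return "C* Version: " + client.getVersion();
        }
    }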
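The effect of keeping the client outside the handler can be seen with a stand-in that counts how often it is constructed. In this sketch ExpensiveClient is a hypothetical substitute for the CassandraClient wrapper, and the client is created lazily on first use rather than eagerly as in the snippet above.

```java
import java.util.concurrent.atomic.AtomicInteger;

/** Shows one client instance being reused across invocations (illustrative only). */
final class ReuseDemo {
    /** Stand-in for an expensive-to-create client such as a Cassandra session. */
    static final class ExpensiveClient {
        static final AtomicInteger CREATED = new AtomicInteger();

        ExpensiveClient() {
            CREATED.incrementAndGet(); // counts constructions for the demo
        }

        String getVersion() {
            return "stub";
        }
    }

    // Lives outside the handler: created once per container, then reused.
    private static ExpensiveClient client;

    private static synchronized ExpensiveClient client() {
        if (client == null) {
            client = new ExpensiveClient(); // cost paid only on the first call
        }
        return client;
    }

    static String handleRequest(String input) {
        return "C* Version: " + client().getVersion();
    }
}
```

Calling ReuseDemo.handleRequest several times constructs the client once; the trade-off of lazy creation is that the first request in a new container pays the connection cost.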

Dependency Management (Java ftw)

• Lean dependencies

• Smaller App

• Faster Deployment

• Less Downtime

    <dependency>
        <groupId>io.symphonia</groupId>
        <artifactId>lambda-logging</artifactId>
        <version>1.0.1</version>
    </dependency>

pom.xml

Jar size by logger:

• Log4J: 8.6 MB

• Symphonia: 8.1 MB

• No logger: 7.3 MB

Execution Environment Limits

• Limited to 512 MB of disk

• 3008 MB Memory Limit

• Max timeout – 5 mins.

• Max response payload – 6MB

• Event payload – 128 KB

Per λ invocation
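Because responses above the payload limit are rejected, it can help to check the size before returning. The sketch below uses the 6 MB figure from the slide; treating it as 6 × 1024 × 1024 bytes and the wording of the fallback message are assumptions.

```java
import java.nio.charset.StandardCharsets;

/** Guards a response against the payload limit before it is returned (illustrative only). */
final class PayloadGuard {
    // 6 MB response limit from the slide; the exact byte count is an assumption.
    static final int MAX_RESPONSE_BYTES = 6 * 1024 * 1024;

    /** True when the UTF-8 encoded response fits within the limit. */
    static boolean fits(String json) {
        return json.getBytes(StandardCharsets.UTF_8).length <= MAX_RESPONSE_BYTES;
    }

    /** Returns the response unchanged if it fits, otherwise a small error document. */
    static String guard(String json) {
        return fits(json) ? json : "{\"error\":\"response too large, narrow the query\"}";
    }
}
```

For a table scan such as an unbounded SELECT, a guard like this turns an opaque invocation failure into an explicit error the caller can act on.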

POC Benchmark

• Create client to send out periodically increasing requests

• Run for 7 min 30 sec

• Review Cassandra latency metric

Outcome

[Chart: 75th-percentile latency (μs) and requests against time (sec)]
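A 75th-percentile latency can be read as "three quarters of requests were at least this fast". The nearest-rank calculation below illustrates the idea on raw samples. Cassandra derives its percentiles from histograms rather than raw samples, so this is only an approximation of the metric, and the sample values are invented.

```java
import java.util.Arrays;

/** Nearest-rank percentile over raw latency samples (illustrative only). */
final class Percentile {
    /** Smallest sample such that at least p percent of samples are at or below it. */
    static long nearestRank(long[] samples, double p) {
        if (samples.length == 0) {
            throw new IllegalArgumentException("no samples");
        }
        long[] sorted = samples.clone();
        Arrays.sort(sorted);
        int rank = (int) Math.ceil(p / 100.0 * sorted.length);
        return sorted[Math.max(rank, 1) - 1];
    }

    public static void main(String[] args) {
        long[] latenciesMicros = {400, 250, 900, 310, 275, 1200, 330, 290}; // invented samples
        System.out.println("p75 = " + nearestRank(latenciesMicros, 75) + " μs");
    }
}
```

A percentile is less sensitive to a few slow outliers than the mean, which makes it a steadier signal while the request rate ramps up.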

Q & λ