Oxalide Academy : Workshop #3 Elastic Search

Workshop #3

Elasticsearch, an overview…

10 March 2016 – Edouard Fajnzilberg & Ludovic Piot

Events: the different Oxalide events

Oxalide events…

Apérotech

• Goal: presentation of a business or technical topic

• Open to all: 80 to 100 people

• Format: one evening per quarter, 6 pm to 9 pm

• Introduction of the topic by a partner

• Round table with customers and non-customers

• Informal discussion over drinks and a buffet

Workshop

• Goal: presentation of a technology

• Customers only: technical audience with laptops, 30 people

• Format: one morning per quarter, 9 am to 1 pm

• Presentation of the technology

• Tutorial on command-line configuration

Pizza’n’Tools

• Goal: presentation of a tool

• Customers only: 30 people

• Format: one evening per quarter, 6 pm to 9 pm

• Demonstration of the tool’s features

• Informal discussion over pizza

The speakers…

Edouard Fajnzilberg

CTO @ kernel42

Ludovic Piot

Consulting / Architecture / DevOps team @ Oxalide

@lpiot

Introduction

Hands-on #1: exploring a 3-node cluster

How does it work?

Ecosystem

Hands-on #2: exploring Marvel & Kibana

Questions & answers

Introduction

Main use cases

Introduction

instant full-text search

Google-style search

tolerant of spelling variants

fast search across thousands of records

search not limited to predefined fields

Introduction

search on a fixed criterion

search on an item from a dynamic list

search within a scope

sort the results

limit the number of results returned

paginate the results returned

get the total number of results

return composite results

Introduction

dataviz

dynamic browsing

analytics

data exploration

Introduction

Elasticsearch, why is it cool?

Main characteristics

results returned instantly

linear performance…

high availability

interaction through a REST API, JSON data

client libraries

open source

zero configuration

schema free: dynamic field mapping

based on Apache Lucene

plugins

Hands-on #1: exploring a 3-node cluster

Hands-on #1

The cluster

Hands-on #1

REST API

HTTP verb    Resource type   Example

GET          Documents       /twitter/tweet/AVNXnwSH24f3KF5HzrfR?pretty

PUT / POST   Documents       /twitter/tweet/AVNXnwSH24f3KF5HzrfR/_create
                             /twitter/tweet/AVNXnwSH24f3KF5HzrfR?version=1
                             /twitter/tweet/AVNXnwSH24f3KF5HzrfR?version=5&version_type=external

DELETE       Documents       /twitter/tweet/AVNXnwSH24f3KF5HzrfR

POST         Search          /twitter/tweet/_search
                             /twitter/_search
                             /_search

GET          Metadata        /twitter/_status
                             /_cluster/status | state | health | settings
                             /nodes | index/_stats
                             /_stats
                             /_search
                             /_cat

POST                         /_shutdown (removed in v2.x)

http://host:port/[index]/[type]/[_action/id] : remember where / what / which
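The where / what / which pattern above can be sketched as a small URL builder. This is only an illustration: the helper name `es_url` and the `localhost:9200` address are assumptions, not part of the workshop material.

```python
from urllib.parse import quote


def es_url(host, port, index=None, doc_type=None, tail=None):
    """Build http://host:port/[index]/[type]/[_action or id].

    Hypothetical helper: every path segment is optional, so the same
    function covers document, index-wide and cluster-wide endpoints.
    """
    parts = [p for p in (index, doc_type, tail) if p]
    path = "/".join(quote(p, safe="") for p in parts)
    return f"http://{host}:{port}/{path}"


# A single document, a search on one type, and a cluster-wide search.
print(es_url("localhost", 9200, "twitter", "tweet", "AVNXnwSH24f3KF5HzrfR"))
print(es_url("localhost", 9200, "twitter", "tweet", "_search"))
print(es_url("localhost", 9200, tail="_search"))
```

Leaving out the index and the type widens the scope of the request, which is why the search row of the table lists three progressively shorter paths.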

Hands-on #1

Search and JSON document

Query DSL (JSON)

{
  "query": {
    "filtered": {
      "query": { "match_all": {} },
      "filter": {
        "and": [
          { "range": { "b": { "from": 4, "to": "8" } } },
          { "term": { "a": "john" } }
        ]
      }
    }
  }
}

JSON document

{
  "name": "John Smith",
  "age": 42,
  "confirmed": true,
  "join_date": "2014-06-01",
  "home": {
    "lat": 51.5,
    "lon": 0.1
  },
  "accounts": [
    { "type": "facebook", "id": "johnsmith" },
    { "type": "twitter", "id": "johnsmith" }
  ]
}

Hands-on #1

Cluster configuration

Configuration file

$ cat …/config/elasticsearch.yml

# Use a descriptive name for your cluster:
cluster.name: elastic-wkshop

# Use a descriptive name for the node:
node.name: elastic-wkshop-1

# Path to directory where to store the data:
path.data: /es/data

# Path to log files:
path.logs: /es/logs

# Lock the memory on startup:
bootstrap.mlockall: true

# Set the bind address to a specific IP (IPv4 or IPv6):
network.host: 172.31.23.121

# Set a custom port for HTTP:
http.port: 9200

# Pass an initial list of hosts to perform discovery when new node is started:
discovery.zen.ping.unicast.hosts: ["elastic-wkshop-1", "elastic-wkshop-2", "elastic-wkshop-3"]

# Prevent the "split brain" by configuring the majority of nodes (total number of nodes / 2 + 1):
discovery.zen.minimum_master_nodes: 2

Startup script

$ cat …/bin/elasticsearch

ES_JAVA_OPTS="-Xms8192m -Xmx8192m"
ES_HEAP_SIZE="8g"
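The "split brain" comment in the configuration file gives the majority rule as total number of nodes / 2 + 1, using integer division. A minimal sketch of that arithmetic follows; the function name is an assumption made for illustration.

```python
def minimum_master_nodes(master_eligible_nodes: int) -> int:
    """Majority of master-eligible nodes: n / 2 + 1 with integer division."""
    if master_eligible_nodes < 1:
        raise ValueError("a cluster needs at least one master-eligible node")
    return master_eligible_nodes // 2 + 1


# The workshop cluster has 3 nodes, hence minimum_master_nodes: 2.
for nodes in (1, 2, 3, 5):
    print(nodes, "->", minimum_master_nodes(nodes))
```

With 2 nodes the majority is also 2, so losing either node blocks master election; this is one reason an odd number of master-eligible nodes is usually preferred.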

Terminology

Relational database                   Elasticsearch

database                              index

table                                 type

row                                   document

column                                field

schema                                mapping

tablespace / datafile / partition     primary shard

SQL                                   Query DSL

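How an inverted index works

Document 0: par ciel clair, les oiseaux chantent (under a clear sky, the birds sing)

Document 1: les oiseaux volent dans le ciel (the birds fly in the sky)

Document 2: l’avion bondit vers le ciel, tel un oiseau (the plane leaps towards the sky, like a bird)

Term       Document   Position

clair      0          1

oiseau

chanter    0          4

voler      1          2

avion      2          1

bondir     2          2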
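To make the table above concrete, here is a minimal sketch of a positional inverted index built from the three example sentences. It assumes a deliberately naive tokenizer (lowercase, split on non-letters) with no stop-word removal and no stemming, so the terms and positions differ from the slide, which shows normalised terms such as "oiseau" and "chanter".

```python
import re
from collections import defaultdict

DOCS = [
    "par ciel clair, les oiseaux chantent",
    "les oiseaux volent dans le ciel",
    "l'avion bondit vers le ciel, tel un oiseau",
]


def tokenize(text):
    """Naive tokenizer: lowercase, keep runs of letters only."""
    return re.findall(r"[^\W\d_]+", text.lower())


def build_inverted_index(docs):
    """Map each term to the list of (document number, position) pairs."""
    index = defaultdict(list)
    for doc_id, text in enumerate(docs):
        for position, term in enumerate(tokenize(text)):
            index[term].append((doc_id, position))
    return dict(index)


INDEX = build_inverted_index(DOCS)
print(INDEX["ciel"])   # [(0, 1), (1, 5), (2, 5)]
print(INDEX["avion"])  # [(2, 1)]
```

Looking up a word is then a single dictionary access, whatever the number of documents, which is where the speed of full-text search comes from.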

Search and indexing engine

document → cleanup → tokenize → stop words → transform

Because indexing applies these transformations, search must apply them too!
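The chain cleanup, tokenize, stop words, transform can be sketched as one function that is applied both to documents at indexing time and to the query text at search time. The stop-word list and the choice of lowercasing as the only transformation are assumptions made for this illustration; real analyzers are configurable and do more.

```python
import re

# Hypothetical, tiny stop-word list used only for this sketch.
STOP_WORDS = {"the", "a", "an", "in", "of"}


def analyze(text):
    """Apply the four steps in order and return the resulting terms."""
    text = re.sub(r"<[^>]+>", " ", text)                 # cleanup: strip markup
    tokens = re.findall(r"[^\W\d_]+", text)              # tokenize: runs of letters
    tokens = [t for t in tokens if t.lower() not in STOP_WORDS]  # stop words
    return [t.lower() for t in tokens]                   # transform: lowercase


indexed_terms = analyze("<b>The Birds</b> sing in the SKY")
query_terms = analyze("the sky")
print(indexed_terms)  # ['birds', 'sing', 'sky']
print(query_terms)    # ['sky']
print(all(term in indexed_terms for term in query_terms))  # True
```

If the query were matched without going through `analyze`, the raw word "SKY" or "the" would never be found among the indexed terms.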

Segments

one inverted index per field

segments are immutable

segments are merged continuously in the background

Distributed system

Cluster nodes

Primary shard

Replicas

Master nodes

Data nodes

Client nodes

Shard routing

Quorum
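Shard routing, listed above, is usually described as: shard = hash(routing value) modulo number of primary shards, where the routing value defaults to the document id. The sketch below assumes CRC32 as a stand-in hash purely for illustration; Elasticsearch uses its own hash function, so the shard numbers computed here will not match a real cluster.

```python
import zlib


def route_to_shard(routing: str, number_of_primary_shards: int) -> int:
    """Pick a primary shard for a routing value (stand-in hash: CRC32)."""
    if number_of_primary_shards < 1:
        raise ValueError("an index needs at least one primary shard")
    digest = zlib.crc32(routing.encode("utf-8"))
    return digest % number_of_primary_shards


# The same id always lands on the same shard as long as the shard count is fixed.
doc_id = "AVNXnwSH24f3KF5HzrfR"
print(route_to_shard(doc_id, 3))
print(route_to_shard(doc_id, 3) == route_to_shard(doc_id, 3))  # True
```

Because the shard count is part of the formula, changing the number of primary shards would send existing ids to different shards, which is why it is fixed when the index is created.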

Distributed system

Write path

immutable segments

filesystem cache

transaction logs

in-memory buffer

.del file for deletes/updates
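The elements listed above can be tied together in a toy model. This is a simplified sketch under stated assumptions, not how Lucene is implemented: the class `ToyShard` and its methods are invented for illustration, segments are plain tuples, and the filesystem cache and disk flush are left out.

```python
class ToyShard:
    """Toy write path: translog + in-memory buffer + immutable segments."""

    def __init__(self):
        self.translog = []    # every operation is logged for recovery
        self.buffer = []      # new documents, not searchable yet
        self.segments = []    # each segment is an immutable tuple of (id, text)
        self.deleted = set()  # (segment number, slot): plays the role of .del files

    def _mark_deleted(self, doc_id):
        # Segments are never modified: old copies are only flagged as deleted.
        self.buffer = [doc for doc in self.buffer if doc[0] != doc_id]
        for seg_no, segment in enumerate(self.segments):
            for slot, (seg_doc_id, _) in enumerate(segment):
                if seg_doc_id == doc_id:
                    self.deleted.add((seg_no, slot))

    def index(self, doc_id, text):
        # An update is a delete of the old copy followed by a new insert.
        self.translog.append(("index", doc_id, text))
        self._mark_deleted(doc_id)
        self.buffer.append((doc_id, text))

    def delete(self, doc_id):
        self.translog.append(("delete", doc_id))
        self._mark_deleted(doc_id)

    def refresh(self):
        # The buffer becomes a new immutable segment and turns searchable.
        if self.buffer:
            self.segments.append(tuple(self.buffer))
            self.buffer = []

    def merge(self):
        # Merging rewrites live documents into one segment and drops deleted ones.
        live = tuple(
            doc
            for seg_no, segment in enumerate(self.segments)
            for slot, doc in enumerate(segment)
            if (seg_no, slot) not in self.deleted
        )
        self.segments = [live] if live else []
        self.deleted.clear()

    def search(self, word):
        return [
            doc_id
            for seg_no, segment in enumerate(self.segments)
            for slot, (doc_id, text) in enumerate(segment)
            if (seg_no, slot) not in self.deleted and word in text.split()
        ]


shard = ToyShard()
shard.index("1", "hello world")
print(shard.search("hello"))   # [] : still in the buffer
shard.refresh()
print(shard.search("hello"))   # ['1']
shard.index("1", "goodbye world")
shard.refresh()
print(shard.search("hello"))   # [] : old copy is flagged as deleted
print(len(shard.segments))     # 2
shard.merge()
print(len(shard.segments))     # 1
```

The model shows why a freshly indexed document only becomes visible after a refresh, and why deleted documents keep occupying space until segments are merged.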

Mapping

Principles

PUT /[index]/_mapping

Default mapping: {"_default_": {}}

Within a single index, all fields with the same name MUST have the same mapping, even if they belong to different types

Example

{
  "twitter": {
    "mappings": {
      "tweet": {
        "properties": {
          "date": { "type": "date", "format": "yyyy-MM-dd" },
          "text": { "type": "string", "index": "analyzed" },
          "user_id": { "type": "long" }
        }
      }
    }
  }
}

Mapping

Dynamic mapping

Dynamic Field Mapping

Example

PUT /twitter
{
  "mappings": {
    "tweet": {
      "dynamic": "true|false|strict",
      "date_detection": false
    }
  }
}

Mapping

Dynamic mapping

Default Mapping

Dynamic Templates

Example

{
  "twitter": {
    "mappings": {
      "_default_": {
        "dynamic_templates": [
          {
            "strings": {
              "match_mapping_type": "string",
              "mapping": {
                "type": "string",
                "fields": {
                  "raw": { "type": "string", "index": "not_analyzed", "ignore_above": 256 }
                }
              }
            }
          }
        ]
      }
    }
  }
}
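Dynamic field mapping, the starting point of the dynamic mapping slides above, can be pictured as a function that guesses a field type from the first JSON value it sees. The sketch below is a simplification under stated assumptions: the function names are invented, date detection only recognises the `yyyy-MM-dd` form, and the type names follow the 2.x vocabulary used in these slides (`string`, `long`, `double`, ...).

```python
from datetime import datetime


def looks_like_date(value: str) -> bool:
    """Simplified date detection: only the yyyy-MM-dd format is recognised."""
    try:
        datetime.strptime(value, "%Y-%m-%d")
        return True
    except ValueError:
        return False


def infer_field_type(value, date_detection=True):
    """Guess a mapping type from a decoded JSON value (None: no field added)."""
    if value is None:
        return None
    if isinstance(value, bool):      # tested before int: bool is a subclass of int
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "double"
    if isinstance(value, str):
        return "date" if date_detection and looks_like_date(value) else "string"
    if isinstance(value, dict):
        return "object"
    if isinstance(value, list):      # arrays take the type of their first non-null item
        for item in value:
            if item is not None:
                return infer_field_type(item, date_detection)
        return None
    raise TypeError(f"unsupported JSON value: {value!r}")


# The sample JSON document from the hands-on.
DOC = {
    "name": "John Smith",
    "age": 42,
    "confirmed": True,
    "join_date": "2014-06-01",
    "home": {"lat": 51.5, "lon": 0.1},
    "accounts": [{"type": "facebook", "id": "johnsmith"}],
}
MAPPING = {field: infer_field_type(value) for field, value in DOC.items()}
print(MAPPING)
```

Calling `infer_field_type("2014-06-01", date_detection=False)` returns `"string"`, which mirrors the effect of the `"date_detection": false` setting shown earlier.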

Mapping

Dynamic Mapping

Index Template

Example

PUT /_template/template_twitter
{
  "template": "twitter-*",
  "settings": { "number_of_shards": 1 },
  "mappings": {
    "tweet": { [...] }
  }
}

Mapping

Update

A new field can be added

An existing field cannot be changed

A mapping cannot be deleted (2.x)

Solution

Create a new index and reindex everything: Scroll Query + Bulk API

Index alias:
● index_v1
● index_v2
● index_v3

index => index_v3

PUT /[index]/_alias/[alias]
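Once the new index has been filled, the alias has to move from the old index to the new one. Besides the single-alias PUT shown above, Elasticsearch exposes a `POST /_aliases` endpoint that takes a list of actions and applies them together. The sketch below only builds that request body; the helper name is an assumption and no request is sent.

```python
import json


def alias_swap_actions(alias, old_index, new_index):
    """Body for POST /_aliases: detach the alias from old_index, attach it to new_index."""
    return {
        "actions": [
            {"remove": {"index": old_index, "alias": alias}},
            {"add": {"index": new_index, "alias": alias}},
        ]
    }


# Move the alias "index" from index_v2 to index_v3, as in the slide.
body = alias_swap_actions("index", "index_v2", "index_v3")
print(json.dumps(body, indent=2))
```

Since applications only ever query the alias, they never see the intermediate state and need no change when a new index version is put in service.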

Aggregations

How to use them

POST /twitter/tweet/_search
{
  "query": [...],
  "aggregations": {
    "<aggregation_name>": {
      "<aggregation_type>": {
        <aggregation_body>
      }
      [, "aggregations": { [<sub_aggregation>]+ } ]?
    }
    [, "<aggregation_name_2>": { ... } ]*
  }
}

Aggregations

Buckets

Buckets ≈ GROUP BY

Buckets => doc_count

Buckets inside Buckets

Example

{
  [...],
  "aggregations": {
    "hashtags": {
      "buckets": [
        { "key": "IWD2016", "doc_count": 4 },
        { "key": "heforshe", "doc_count": 2 },
        { "key": "women", "doc_count": 2 }
      ]
    }
  }
}
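The "Buckets ≈ GROUP BY" idea can be sketched in a few lines: group documents by the value of a field and count the documents in each group. The tweets below are hypothetical data chosen so that the counts match the example response; the function is a toy equivalent of a terms aggregation, not the real implementation.

```python
from collections import Counter

# Hypothetical tweets, reduced to their hashtags.
TWEETS = [
    {"hashtags": ["IWD2016", "heforshe"]},
    {"hashtags": ["IWD2016", "women"]},
    {"hashtags": ["IWD2016", "heforshe"]},
    {"hashtags": ["IWD2016", "women"]},
]


def terms_buckets(docs, field):
    """Group documents by each value of `field`, most frequent bucket first."""
    counts = Counter()
    for doc in docs:
        # dict.fromkeys removes duplicates so a document counts once per value.
        for value in dict.fromkeys(doc.get(field, [])):
            counts[value] += 1
    return [{"key": key, "doc_count": n} for key, n in counts.most_common()]


BUCKETS = terms_buckets(TWEETS, "hashtags")
print(BUCKETS)
```

Each bucket is itself a set of documents, which is what allows further buckets or metrics to be computed inside it.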

Aggregations

Metrics

Metrics ≈ SUM/AVG/MIN/MAX

Metrics inside Buckets

Metrics inside Metrics

Example

{
  [...],
  "aggregations": {
    "user_follower_stats": {
      "count": 4871628,
      "min": 0,
      "max": 72529214,
      "avg": 5242.441252493007,
      "sum": 25539223594
    }
  }
}
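"Metrics ≈ SUM/AVG/MIN/MAX" means that a metric aggregation reduces the values of one numeric field to a few numbers. A toy equivalent of the stats shown above, computed over hypothetical follower counts (not the workshop data set), could look like this:

```python
def stats(values):
    """Toy stats metric: count, min, max, avg and sum of a list of numbers."""
    values = list(values)
    if not values:
        return {"count": 0, "min": None, "max": None, "avg": None, "sum": 0}
    total = sum(values)
    return {
        "count": len(values),
        "min": min(values),
        "max": max(values),
        "avg": total / len(values),
        "sum": total,
    }


# Hypothetical followers_count values for four users.
FOLLOWERS = [0, 10, 50, 140]
print(stats(FOLLOWERS))
# {'count': 4, 'min': 0, 'max': 140, 'avg': 50.0, 'sum': 200}
```

Applying the same function to the documents of each bucket gives "metrics inside buckets".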

Aggregations

Multiple

Example

Request:

{
  "aggregations": {
    "grades_stats": {
      "stats": { "field": "grades" }
    },
    "user_follower_stats": {
      "stats": { "field": "followers_count" }
    }
  }
}

Response:

{
  [...],
  "aggregations": {
    "grades_stats": { "count": 6, "min": 60, "max": 98, "avg": 78.5, "sum": 471 },
    "user_follower_stats": { "count": 456, "min": 0, "max": 9868, "avg": 78.5, "sum": 785786735 }
  }
}

Aggregations

Nestable

Example

Request:

{
  "aggregations": {
    "hashtag": {
      "terms": { "field": "hashtags" },
      "aggregations": {
        "retweeted": {
          "terms": { "field": "retweeted" }
        }
      }
    }
  }
}

Response:

"aggregations": {
  "hashtag": {
    "buckets": [
      {
        "key": "internationalwomensday",
        "doc_count": 3334427,
        "retweeted": {
          "buckets": [
            { "key": 0, "doc_count": 1334426 },
            { "key": 1, "doc_count": 2000001 }
          ]
        }
      }
    ]
  }
}

Aggregations

Sortable

Example

Request:

{
  "aggregations": {
    "hashtag": {
      "terms": {
        "field": "hashtags",
        "order": { "_term": "asc" }
      }
    }
  }
}

Response:

"aggregations": {
  "hashtag": {
    "buckets": [
      { "key": "a", "doc_count": 64987 },
      { "key": "b", "doc_count": 789 },
      { "key": "c", "doc_count": 236 }
    ]
  }
}

Aggregations types

Buckets: Date Histogram, Filter, IPv4 Range

Metrics: Cardinality, Min / Max, Geo Bounds

Aggregations

Building charts

{
  "aggs": {
    "price": {
      "histogram": { "field": "price", "interval": 20000 },
      "aggs": {
        "revenue": {
          "sum": { "field": "price" }
        }
      }
    }
  }
}

Pipeline aggregations

Principle

Apply aggregations to the results of other aggregations

“I want all the hashtags that are used by at least 50 different users”

{
  "aggs": {
    "hashtag": {
      "terms": { "field": "hashtags" },
      "aggs": {
        "unique_user_count": {
          "cardinality": { "field": "user.id" }
        },
        "min_unique_user_count": {
          "bucket_selector": {
            "buckets_path": { "uniqueUserCount": "unique_user_count" },
            "script": "uniqueUserCount >= 50"
          }
        }
      }
    }
  }
}
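The three stages of this request (bucket by hashtag, count distinct users per bucket, keep only the buckets that pass the threshold) can be mimicked in plain Python. The sketch uses hypothetical tweets and an exact set-based count, whereas the real cardinality aggregation returns an approximate value on large data sets.

```python
from collections import defaultdict

# Hypothetical tweets: who used which hashtags.
TWEETS = [
    {"user_id": 1, "hashtags": ["IWD2016"]},
    {"user_id": 2, "hashtags": ["IWD2016", "women"]},
    {"user_id": 2, "hashtags": ["women"]},
]


def hashtags_with_min_users(tweets, min_users):
    """Keep hashtags used by at least `min_users` distinct users."""
    users_by_hashtag = defaultdict(set)
    for tweet in tweets:                      # terms buckets on hashtags
        for tag in tweet["hashtags"]:
            users_by_hashtag[tag].add(tweet["user_id"])
    return {
        tag: len(users)                       # cardinality of user ids
        for tag, users in users_by_hashtag.items()
        if len(users) >= min_users            # bucket_selector filter
    }


# "women" is used twice but by a single user, so it is filtered out.
print(hashtags_with_min_users(TWEETS, 2))  # {'IWD2016': 2}
```

The filter runs on the output of the first two stages rather than on the documents, which is what distinguishes a pipeline aggregation from ordinary buckets and metrics.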

Ecosystem

Auto-completion

Syntax highlighting

Syntax validation

Request history

Chrome plugin

Kibana plugin

the IPython Notebook of Elasticsearch

Ecosystem

Logstash & Beats

Java-based ETL

plugin support

JSON-like configuration

input {
  twitter {
    consumer_key => "…"
    consumer_secret => "…"
    oauth_token => "…"
    oauth_token_secret => "…"
    full_tweet => true
    keywords => [ "journeedesdroitsdesfemmes", "journeedelafemme" ]
  }
}

filter {}

output {
  stdout { codec => dots }
  elasticsearch {
    hosts => [ "172.31.23.121" ]
    index => "twitter"
    document_type => "tweet"
    template_name => "tpl_twitter"
  }
}

Beats = Go framework

Ecosystem

Kibana & TimeLion

Ecosystem

Marvel

Kibana plugin

data consolidated into Elasticsearch indices

monitoring of the Elasticsearch cluster

metrics collection agent

subscription product

Ecosystem

supported by Elastic.co: Shield, Watcher

from the community: Inquisitor, Kopf, BigDesk, SegmentSpy

Hands-on #2: exploring Marvel & Kibana

Questions & answers

Or contact directly:

Maxime KURKDJIAN – Managing Partner

Tel: +33 1 75 77 16 58 / mku

Sébastien LUCAS – Managing Partner

Tel: +33 1 75 77 16 59 / slu@oxalide.com

Head office & NOC: 25 Boulevard de Strasbourg – 75010 Paris

Tel: +33 1 75 77 16 66

e-mail: commercial@oxalide.com
