Grenoble 2011 galtier

High-throughput transcriptomics for molecular evolution and population genetics

CBGP, March 2011

Nicolas Galtier

UMR 5554 - Institut des Sciences de l'Evolution - Montpellier

galtier@univ-montp2.fr

Molecular evolution in the 21st century

We have:

- an enormous amount of data (genomics)

- a robust theoretical framework (population genetics)

⇒ we should understand molecular variation patterns

Yet we do not really know:

- why some species evolve (much) faster than others, proteome-wise

- why GC-content varies between and across genomes

- by how much population size determines genetic diversity

- etc…

Molecular evolution in the 21st century

Why so many unsolved, basic questions?

- lacking theory

- biased sampling of species

PopPhyl goals

Injecting species biology/ecology into comparative genomics

Exploring the molecular diversity of nonmodel taxa

Testing predictions of the population genetic theory genome-wide

[Diagram: life history traits (body mass, generation time, abundance, mating system) → population genetic parameters (mutation rate, population size, selection, recombination) → genomic variation data (within species, between species)]


Some specific questions we want to address:

- Why are fast-evolving taxa fast? (mutation, selection)

- Are abundant species more polymorphic than scarce ones?

- Is selection less efficient in selfers than outcrossers?

- How does longevity influence mito vs nuclear DNA evolution?

- Who optimizes codon usage, who does gBGC, and why?

- Is the rate of selective sweeps higher in large populations?

- Target = transcriptome coding sequences

expression data

- Sampling scheme:

focal species (10 individuals)

outgroups (1 or 2 individuals)

- Next-Generation Sequencing technology

For each taxon:

- 5×10^5 reads of 400 bp (454, pooled individuals)

- 5×10^7 reads of 100 bp (Illumina, tagged individuals)

Species sampling

[Figure: metazoan phylogeny. Tip labels: Demosponges, Sponges, Cnidarians, Ctenophores, Rotifers, Acanthocephalans, Entoprocts, Platyhelminthes, Nemerteans, Annelids, Molluscs, Ectoprocts, Brachiopods, Chaetognaths, Tardigrades, Onychophorans, Arthropods, Loriciferans, Kinorhynchs, Priapulids, Nematodes, Hemichordates, Echinoderms, Cephalochordates, Urochordates, Vertebrates]

Why are tunicates fast-evolving, proteome-wise?

- higher mutation rate?

- more prevalent adaptive evolution?

- relaxed selective constraint on housekeeping genes?

Data analysis pipeline

[Diagram: transcriptome reads (Solexa) → reference transcriptome assembly → mapping → SNP calling → SNPs and genotypes → allele frequencies and πN, πS, dN, dS (using coding annotation)]

Assembling transcriptomes from NGS data: a benchmark in Ciona

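[Diagram: assembly strategies compared on 454 reads, Illumina (Solexa) reads, or both. Labels: Celera (454 reads); Cap3 (454 reads and Illumina reads, each with "c+s"); merge reads (454 + Illumina reads); merge contigs (454 + Illumina reads); F' = refine]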

de novo transcriptome assembly: quantitative assessment

    data set     method          contigs   mean lg   median lg   N50   assembly lg (Mb)   touched
A   Ciona_454    Celera          25,669    491       438         491   12.6               7616
B   Ciona_454    Mira            33,196    635       526         650   21.1               7951
C   Ciona_454    Cap3            24,515    671       540         713   16.5               7945
D   Ciona_illu   Abyss+Cap3      27,426    574       380         769   15.8               7704
E   Ciona_mix    merge reads     29,097    571       399         721   16.6               7982
F   Ciona_mix    merge contigs   27,956    726       529         891   20.3               8207
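The summary columns in the table above (contigs, mean lg, median lg, N50, assembly lg) can be recomputed from a list of contig lengths. The following is a minimal Python sketch, not the PopPhyl code: the function names and the toy lengths are hypothetical, and N50 is taken in its usual sense (the length L such that contigs of length at least L cover half of the assembly).

```python
from statistics import mean, median


def n50(lengths):
    """Length L such that contigs of length >= L cover at least half the assembly."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    raise ValueError("no contigs given")


def assembly_summary(lengths):
    """Summary statistics matching the columns of the benchmark table."""
    return {
        "contigs": len(lengths),
        "mean_lg": mean(lengths),
        "median_lg": median(lengths),
        "N50": n50(lengths),
        "assembly_lg_Mb": sum(lengths) / 1e6,
    }


# Hypothetical toy assembly, for illustration only.
toy_lengths = [200, 300, 400, 500, 600]
summary = assembly_summary(toy_lengths)
```

As a consistency check on the table itself, contigs × mean lg matches the reported assembly length (for row A, 25,669 × 491 bp is about 12.6 Mb).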

[Figure: panels for 454 contigs, Illumina contigs, and Mix contigs]

Assembling transcriptomes from NGS data: a benchmark using Ciona intestinalis

[Diagram: predicted contigs matched against the reference transcriptome, classified by number of contigs → number of reference transcripts]

- no hit

- 1→1: full or partial

- m→1: fragments or alleles

- 1→n: chimera

- m→n: multi
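The categories above could be assigned per contig roughly as follows. This is a hedged sketch only: the input format (a list of contig/reference hit pairs, e.g. from a similarity search) and the function name are assumptions, and the slides do not say how hits were defined or how the full vs partial and fragments vs alleles distinctions were made.

```python
from collections import defaultdict


def classify_contigs(contigs, hits):
    """Label each contig by its hit pattern against the reference transcriptome.

    contigs: iterable of contig names
    hits: iterable of (contig, reference) pairs (hypothetical input format)
    """
    refs_of = defaultdict(set)     # contig -> references it hits
    contigs_of = defaultdict(set)  # reference -> contigs hitting it
    for contig, ref in hits:
        refs_of[contig].add(ref)
        contigs_of[ref].add(contig)

    labels = {}
    for contig in contigs:
        refs = refs_of.get(contig, set())
        if not refs:
            labels[contig] = "no hit"
            continue
        # True if at least one of the references is also hit by another contig.
        shared = any(len(contigs_of[ref]) > 1 for ref in refs)
        if len(refs) == 1:
            labels[contig] = "m→1" if shared else "1→1"
        else:
            labels[contig] = "m→n" if shared else "1→n"
    return labels


# Hypothetical example.
example_hits = [
    ("c1", "r1"),
    ("c2", "r2"), ("c3", "r2"),
    ("c4", "r3"), ("c4", "r4"),
    ("c5", "r5"), ("c5", "r6"), ("c6", "r5"), ("c6", "r6"),
]
labels = classify_contigs(["c1", "c2", "c3", "c4", "c5", "c6", "c7"], example_hits)
```

The labelling here is per contig; a stricter variant would label whole connected groups of contigs and references, so that every member of an m→n cluster receives the same category.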

de novo transcriptome assembly: qualitative assessment

Average contig length varies between categories

Improving assemblies by filtering according to length + coverage

[Figure: number of contigs; legend: correct]
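A length + coverage filter of the kind mentioned above might look like the sketch below. The `Contig` record, the function name, and both threshold values are hypothetical; the slides do not give the cut-offs actually used.

```python
from dataclasses import dataclass


@dataclass
class Contig:
    name: str
    length: int      # contig length in bp
    coverage: float  # mean read depth over the contig


def filter_contigs(contigs, min_length=500, min_coverage=5.0):
    """Keep contigs that pass both thresholds (illustrative values, not from the talk)."""
    return [c for c in contigs if c.length >= min_length and c.coverage >= min_coverage]


# Hypothetical contigs, for illustration only.
candidates = [
    Contig("long_deep", 1200, 30.0),
    Contig("long_shallow", 1500, 2.0),
    Contig("short_deep", 250, 40.0),
]
kept = filter_contigs(candidates)
```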

de novo transcriptome assembly from NGS data: conclusions

- Illumina > 454 (454 still useful)

- correct cDNA predictions are a minority in typical assemblies

- existing programs differ substantially in performance (in PopPhyl we retain Cap3 and Abyss)

- contig length + coverage is a reasonable quality criterion

- somewhat variable across species

Data analysis pipeline

[Diagram: transcriptome reads (Solexa) → mapping → SNP calling → SNPs and genotypes → allele frequencies and πN, πS, dN, dS (using coding annotation)]

Calling SNPs and genotypes from transcriptome reads

read counts (A/C/G/T)

>contig1
pos   ind1        ind2        ind3
1     5/0/9/0     0/0/8/0     10/0/0/0
2     0/4/0/0     0/7/0/0     0/17/0/0
3     1/0/0/17    0/0/0/6     0/0/0/22
…
>contig2
pos   ind1        ind2        ind3
1     0/0/0/4     0/0/0/8     0/2/0/11
2     34/1/13/0   52/0/45/0   4/0/8/0
…

genotypes

>contig1
pos   ind1           ind2          ind3
1     5/0/9/0 AG     0/0/8/0 GG    6/0/0/0 AA
2     0/4/0/0 CC     0/7/0/0 CC    0/17/0/0 CC
3     1/0/0/17 TT    0/0/0/6 TT    0/0/0/5 TT
…
>contig2
pos   ind1           ind2          ind3
1     0/0/0/1 TT     0/0/0/8 TT    0/2/0/11 CT(90%)
2     14/1/9/0 AG    8/0/15/0 AG   12/0/0/0 AA
…

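Model M1: sequencing error ε

reads: A:1 C:0 G:6 T:0

genotype [GG]: $7 \cdot \frac{\varepsilon}{3} \cdot (1-\varepsilon)^{6}$

genotype [AG]: $7 \cdot \left(\frac{1}{2}-\frac{\varepsilon}{3}\right)^{7}$

Model M2: sequencing error ε and allelic bias α

reads: A:1 C:0 G:6 T:0

genotype [GG]: $7 \cdot \varepsilon \cdot (1-3\varepsilon)^{6}$

genotype [AG]: $7 \cdot \left[\frac{q'\,q''^{6}}{2} + \frac{q''\,q'^{6}}{2}\right]$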

A:0 C:3 G:0 T:16

A:4 C:0 G:1 T:0

A:0 C:19 G:2 T:0

A:8 C:0 G:2 T:1

A:0 C:3 G:12 T:0
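The M1 likelihoods shown earlier for the read counts A:1 C:0 G:6 T:0 can be reproduced numerically. The sketch below is an illustration under stated assumptions, not the caller used in the talk: it treats the four read counts as multinomial, gives each miscalled base probability ε/3, and picks the genotype by normalizing likelihoods under a flat prior over the ten diploid genotypes (the flat prior and all function names are assumptions). Model M2 is not covered, since q' and q'' are not defined in the slides.

```python
from itertools import combinations_with_replacement
from math import factorial, prod

BASES = "ACGT"


def read_probs(genotype, eps):
    """Probability of reading each base from a diploid genotype such as 'AG' under M1."""
    return {
        base: sum(0.5 * ((1 - eps) if base == allele else eps / 3) for allele in genotype)
        for base in BASES
    }


def likelihood(counts, genotype, eps):
    """Multinomial likelihood of the read counts given the genotype and error rate eps."""
    n = sum(counts.get(base, 0) for base in BASES)
    coeff = factorial(n)
    for base in BASES:
        coeff //= factorial(counts.get(base, 0))
    probs = read_probs(genotype, eps)
    return coeff * prod(probs[base] ** counts.get(base, 0) for base in BASES)


def call_genotype(counts, eps):
    """Best genotype and its normalized likelihood (flat prior: an assumption)."""
    genotypes = ["".join(pair) for pair in combinations_with_replacement(BASES, 2)]
    lik = {g: likelihood(counts, g, eps) for g in genotypes}
    best = max(lik, key=lik.get)
    return best, lik[best] / sum(lik.values())


counts = {"A": 1, "C": 0, "G": 6, "T": 0}
eps = 0.02  # close to the error rate estimated in the talk
lik_gg = likelihood(counts, "GG", eps)  # equals 7 * (eps/3) * (1-eps)**6
lik_ag = likelihood(counts, "AG", eps)  # equals 7 * (1/2 - eps/3)**7
best, support = call_genotype(counts, eps)
```

With these counts the two genotypes have likelihoods of the same order, which is the kind of ambiguous call that a support value such as "CT(90%)" flags in the genotype table.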

Population genomics of a fast-evolver

focal species: Ciona intestinalis B (8 individuals)

outgroup: Ciona intestinalis A (reference sequence)

1602 contigs (>10X in >5 individuals), of average length 138 codons

                 M1                     M2
error rate       0.021 [0.012-0.038]    0.020 [0.011-0.035]
allelic bias                            [0.08-0.5]
nb best model    70 (4.6%)              1532 (95.4%)
stop codons      77 (0.26%)             117 (0.39%)
F_IT             -0.017                 -0.054

average πS: 0.057 per site (a highly polymorphic species)

average πN: 0.0026 per site

πN/πS : 0.046 (strong level of purifying selection)

dN/dS : 0.103 (high impact of adaptive evolution)

estimated proportion of adaptive non-synonymous substitutions: 54%
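For orientation, the proportion of adaptive non-synonymous substitutions can be approximated from the ratios above with a simple McDonald-Kreitman-type estimator, α = 1 − (πN/πS)/(dN/dS). The slides do not state which estimator produced the 54% figure, so the sketch below is an assumption-laden illustration (the function name is hypothetical), not a reconstruction of the actual analysis.

```python
def adaptive_proportion(pi_n, pi_s, dn_over_ds):
    """Simple MK-type estimate: alpha = 1 - (piN/piS) / (dN/dS)."""
    return 1.0 - (pi_n / pi_s) / dn_over_ds


# Averages reported on the slide.
alpha = adaptive_proportion(pi_n=0.0026, pi_s=0.057, dn_over_ds=0.103)
```

With the slide's averages this gives a value a little above 0.55, in the same range as the reported 54%; the small difference is expected if the talk used per-gene data or a different estimator.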

Why are tunicates fast-evolving, proteome-wise?

- higher mutation rate? YES

- more prevalent adaptive evolution? YES

- relaxed selective constraint on housekeeping genes? NO

[Figure legend: adaptive, neutral, deleterious]

→ large Ne, large µ (per year)

Conclusions

- de novo population genomics from NGS transcriptome data is doable

- transcriptome assembly is probably the trickiest step

- major population genomic descriptors are robust to error models

- life history traits apparently impact molecular evolution to some extent

- long-lived, small population-sized species are the best choice for phylogenomics

Subprojects we have started

[Figure: taxa covered: VERTEBRATES, INSECTS, MOLLUSCS, UROCHORDATES, CNID., NEM., NEMATODES, ANNELIDS, CRUSTACEANS, SPONG.]

- selfers vs outcrossers in snails and nematodes

- long-lived vs short-lived in insects

- big vs small in amniotes

- phylogeny of turtles

- fast protein evolution in tunicates and nematodes

- extreme longevity

Thanks to:

Philippe Gayral Vincent Cahais Georgia Tsagkogeorga Marion Ballenghien Zef Melo Ferreira Ylenia Chiari Lucy Weinert

Sylvain Glémin Nico Bierne Khalid Belkhir Fred Delsuc Vincent Ranwez

Guillaume Dugas Sébastien Harispe Caroline Benoist
