alignement multiple.ppt [Mode de compatibilité]lecompte/cours/alignement_multiple2.pdf ·...


alignement multiple.ppt [Mode de compatibilité] lecompte/cours/alignement_multiple2.pdf · 2010-11-16

Multiple alignment

Methods used: optimal, progressive, iterative, combination of approaches

Estimating the quality of an alignment

Uses of multiple alignment

ASM2 O. Lecompte – IGBMC

Objective functions

              Column A   Column B   Column C
Sequence 1       N          N          N
Sequence 2       N          N          N
Sequence 3       N          N          N
Sequence 4       N          N          C
Sequence 5       N          C          C

N-N pairs        10         6          3
N-C pairs        0          4          6
C-C pairs        0          0          1

Blosum62
       N     C
N      6    -3
C     -3     9

Sum-of-pairs (Carrillo, Lipman, 1988): sum of the scores of all pairs of sequences

Column A: 10x6 = 60
Column B: 6x6 + 4x(-3) = 24
Column C: 3x6 + 6x(-3) + 9 = 9

norMD (Thompson et al, 2001)
- column-based scores
- normalization according to the sequences to align (number, length, similarity)
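The column scores above can be reproduced with a few lines of Python. This is a minimal sketch, not code from the course: the function names and the `BLOSUM62_EXCERPT` dictionary are illustrative assumptions, only the N/C scores shown on the slide are included, and gaps are not handled.

```python
from itertools import combinations

# Substitution scores copied from the Blosum62 excerpt on the slide (N and C only).
BLOSUM62_EXCERPT = {
    ("N", "N"): 6,
    ("N", "C"): -3,
    ("C", "N"): -3,
    ("C", "C"): 9,
}

def sum_of_pairs_column(column, scores=BLOSUM62_EXCERPT):
    """Sum the substitution scores over every unordered pair of residues in one column."""
    return sum(scores[(a, b)] for a, b in combinations(column, 2))

def sum_of_pairs(alignment, scores=BLOSUM62_EXCERPT):
    """Per-column sum-of-pairs scores for a gap-free alignment of equal-length strings."""
    return [sum_of_pairs_column(col, scores) for col in zip(*alignment)]

# The five sequences of the slide, one string per row (columns A, B, C).
alignment = ["NNN", "NNN", "NNN", "NNC", "NCC"]
column_scores = sum_of_pairs(alignment)
print(column_scores)       # [60, 24, 9]
print(sum(column_scores))  # 93
```

With five sequences each column contributes 10 pairs, which is why the score drops quickly as soon as one or two residues differ.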


Objective functions: norMD

[Figure: multiple alignment of Archaeal/Eukaryotic GluRS + GlnRS and Bacterial GluRS sequences, labelled 1gln and 1exd, with the ‘HIGH’ and ‘KMSKS’ motifs and H8 marked; score plots (axis ticks 0.5 and 1.0) for window length = 8 and window length = 40.]

PipeAlign http://bips.u-strasbg.fr/PipeAlign/
Plewniak et al., Nucleic Acids Res. 2003

Alignment construction
Query sequence → BlastP search → Ballast → Anchors (LMS: local maximum segments) → DbClustal → Alignment

Alignment refinement
RASCAL: rapid scanning and correction of multiple sequence alignments. Thompson et al., Bioinformatics 2003
LEON: multiple aLignment Evaluation Of Neighbours. Thompson et al., NAR 2004

Clustering
Secator: Wicker et al., Mol. Biol. Evol. 2001
DPC: Wicker et al., NAR 2002


Multiple alignment

Methods used: optimal, progressive, iterative, combination of approaches
Estimating the quality of an alignment
Using the multiple alignment

Using the multiple alignment
validation/correction of protein sequences


[Figure: N-terminal alignment of hydrogenase expression/formation proteins from Bacteria and Archaea; annotations: initiator codon, MRIRLEHGAGGEL]

Example: protein of the TFIIH transcription complex

[Figure annotation: intron/exon boundary]


Example of a splicing error in a steroid receptor of Caenorhabditis elegans
SW: NH35_CAEEL
SW: NH35_CAEEL after splicing correction

http://bips.u-strasbg.fr/vALId/

Validation of protein sequence quality based on multiple ALIgnment data

Bianchetti et al., J. Bioinform. Comput. Biol. 2005


[Figure labels: deletion, insertion, region validated by an mRNA]
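The deletion and insertion labels above can be illustrated with a toy check on an alignment. The following is a minimal sketch, not the vALId algorithm: the function name `find_indels`, the `majority` and `min_len` parameters, and the toy sequences are all assumptions made for illustration. It flags runs of columns where one sequence disagrees with most of the other sequences about gap versus residue.

```python
def find_indels(alignment, target, min_len=3, majority=0.8):
    """Return (kind, start, end) runs where `target` differs from most other rows.

    deletion:  target has a gap where >= `majority` of the others have a residue
    insertion: target has a residue where >= `majority` of the others have a gap
    `end` is exclusive; runs shorter than `min_len` columns are ignored.
    """
    others = [seq for name, seq in alignment.items() if name != target]
    row = alignment[target]

    # Label every column of the target sequence.
    labels = []
    for col, ch in enumerate(row):
        gap_frac = sum(s[col] == "-" for s in others) / len(others)
        if ch == "-" and 1 - gap_frac >= majority:
            labels.append("deletion")
        elif ch != "-" and gap_frac >= majority:
            labels.append("insertion")
        else:
            labels.append(None)

    # Merge consecutive columns carrying the same label into runs.
    runs = []
    start = 0
    for col in range(1, len(labels) + 1):
        if col == len(labels) or labels[col] != labels[start]:
            if labels[start] is not None and col - start >= min_len:
                runs.append((labels[start], start, col))
            start = col
    return runs


# Hypothetical toy alignment (all rows have the same length).
toy_alignment = {
    "seqA": "MKTLLV----AGHPE",
    "seqB": "MKTLLV----AGHPE",
    "seqC": "MKSLLV----AGHPD",
    "seqD": "MKTLLVQQQQAG---",
}

print(find_indels(toy_alignment, "seqD"))
```

On this toy input, `seqD` is reported with one insertion (columns 6 to 9) and one deletion (columns 12 to 14), while the other sequences yield no flagged run. A flagged region is only a candidate error; as the slide indicates, independent evidence such as an mRNA is what validates it.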

Using the multiple alignment

validation/correction of protein sequences
domain organization
important for predicting function, structure…


[Figure: Ordered Alignment Analysis of TyrRS. Sequence groups: Euc, Arc, Bac. Annotated regions: Motif I, Motif II, N-terminal extension, C-terminal extension, S4 domain, EMAP domain. Scale bar: 10 aa]

Using the multiple alignment

validation/correction of protein sequences
domain organization
important for predicting function, structure…
conservation within the family
residues conserved in all sequences => structural or functional importance: characteristic motif
residues conserved specifically in a subgroup => discriminating residues
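The two kinds of conservation listed above can be sketched on a toy alignment. This is an illustrative sketch under simple assumptions: strict identity is used as the conservation criterion, and the function names, group names and sequences are hypothetical. A real analysis would usually tolerate conservative substitutions.

```python
def conserved_columns(seqs):
    """Map column index -> residue for columns where all sequences
    share the same non-gap residue."""
    return {
        i: col[0]
        for i, col in enumerate(zip(*seqs))
        if col[0] != "-" and len(set(col)) == 1
    }


def discriminating_columns(groups, subgroup):
    """Columns conserved inside `subgroup` whose residue never appears
    at that position in any sequence outside the subgroup."""
    inside = groups[subgroup]
    outside = [s for name, seqs in groups.items() if name != subgroup for s in seqs]
    return {
        i: res
        for i, res in conserved_columns(inside).items()
        if all(s[i] != res for s in outside)
    }


# Hypothetical toy family split into two subgroups.
toy_groups = {
    "Bac": ["GHKDP", "GHRDP", "GHKDP"],
    "Arc": ["GYKEP", "GYRAP"],
}
all_seqs = [s for seqs in toy_groups.values() for s in seqs]

print(conserved_columns(all_seqs))                # family-wide conservation
print(discriminating_columns(toy_groups, "Bac"))  # subgroup-specific residues
```

Here columns 0 and 4 (G and P) are conserved across the whole family, whereas H at column 1 and D at column 3 are conserved only in the hypothetical "Bac" subgroup and are therefore candidates for discriminating residues.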


Conservation analysis

[Figure: Ordered Alignment Analysis of TyrRS. Sequence groups: Euc, Arc, Bac. Annotated regions: Motif I, Motif II, N-terminal extension, C-terminal extension, S4 domain, EMAP domain. Scale bar: 10 aa]


Using the multiple alignment

validation/correction of protein sequences
domain organization
important for predicting function, structure…
conservation within the family
residues conserved in all sequences => structural or functional importance: characteristic motif
residues conserved specifically in a subgroup => discriminating residues
construction of motifs and profiles
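As an illustration of how a profile and a motif can be derived from alignment columns, here is a minimal sketch. It assumes a profile is simply the per-column residue frequencies (gaps ignored, no pseudocounts or sequence weighting) and that a motif is the consensus string with "x" at variable positions. The function names, the 0.8 threshold and the sequences are hypothetical.

```python
from collections import Counter


def column_profile(seqs):
    """Per-column residue frequencies, ignoring gaps.
    Returns one dict {residue: frequency} per alignment column."""
    profile = []
    for col in zip(*seqs):
        counts = Counter(res for res in col if res != "-")
        total = sum(counts.values())
        profile.append(
            {res: n / total for res, n in counts.items()} if total else {}
        )
    return profile


def consensus(profile, threshold=0.8):
    """Consensus motif: the most frequent residue where its frequency
    reaches `threshold`, otherwise 'x'."""
    out = []
    for freqs in profile:
        if freqs:
            res, freq = max(freqs.items(), key=lambda kv: kv[1])
            out.append(res if freq >= threshold else "x")
        else:
            out.append("x")
    return "".join(out)


# Hypothetical toy alignment.
toy_seqs = ["GHKDP", "GHRDP", "GHKDP", "GYK-P"]
toy_profile = column_profile(toy_seqs)

print(toy_profile[1])          # frequencies at the second column
print(consensus(toy_profile))  # motif at the default threshold
```

With the default threshold the toy motif is `GxxDP`; lowering the threshold to 0.7 gives `GHKDP`. The profile keeps the full frequency information that the motif string discards, which is why profiles are generally more sensitive for detecting distant family members.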