Post on 06-May-2022
UMR168, Institut Curie, Section RechercheDynamique de l’ARN et Systèmes Biomoléculaires
Hervé ISAMBERT
Les Réseaux Biologiqueset leur Evolution
Dynamique de l’ARN et Systèmes Biomoléculaires
Duplicationof genes
Deletionof Links
Iteration
4
2 1
4’4
3
2’ 21
3
Evolution des réseaux biologiques
5’ 3’
A
GCAGG
AGGACG
A
A
GCAUCCUCCUACG
AAGGUAGGGGGAUGG
AAA
AC C
GUCC
CCCUGC
C
A
AG
C
AGGAGGA
CG
AAGCA
UCCUCCU
ACGA
AGGUAGGGGGAUGG
AAA
ACCGUCCCCCUGCC
A
5’ 3’
cis / transcontrol
Autoassemblage ARNCommutateurs ARN
100 nm
I. Switches et Réseaux ARN
II. Réseaux Biologiques
Hervé ISAMBERT, UMR168, Institut Curie
PPI Networks: Evlampiev & Isambert, BMC Systems Biology 1:49 (2007)
Hervé Isambert CNRS UMR168, Institut CurieRNA dynamics and Biomolecular Systems Lab
Tree of Life(from tolweb site)
Marco Cosentino Lagomarsino
Kirill Evlampiev
Lagomarsino, Jona, Bassetti & Isambert,
Evlampiev & Isambert, arxiv.org/abs/q−bio.MN/0611070
Transcription Networks:
Evolution and Topology of Large Biological Networks
PNAS, 104, 5516−5520 (2007)
I. Intro: Des Génomes aux Réseaux
III. Propriétés des Réseaux Biologiques
II. Différents Types de Réseaux Biologiques
IV.Evolution des Réseaux: des Gènes aux Organismes
V. Modèles d’Evolution des Réseaux
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
III. Propriétés des Réseaux Biologiques
II. Différents Types de Réseaux Biologiques
IV.Evolution des Réseaux: des Gènes aux Organismes
V. Modèles d’Evolution des Réseaux
I. Intro: Des Génomes aux Réseaux
within some species familiesEven multimodal distribution
5 decades (8 including viruses)
Genome size distributionwhat was known before sequencing
Sparrow 1976
Genome Sequencing Statistics
NCBI, 21 Jan 2008
Human 20,000−25,000
Arabidopsis 29,000
Gallus 20,000−23,000
Paramecium 39,000
Wheat 75,000 ?(allohexaploid)
6,000
19,000
13,000
Ten Years of Genome Sequencing:
Tetraodon 22,000
1 cell
100,000 cells
1,000 cells
Evolution through Duplication−Divergence of Genes and Genomes
few genes... same genes...
Biological Networks
protein1 protein2
interaction
Combinatorial Gene Expression
Detailed interaction logic (AND/OR/NOT)
Combinatorics
Network−like schematic representations :
without logic !!
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
III. Propriétés des Réseaux Biologiques
IV.Evolution des Réseaux: des Gènes aux Organismes
V. Modèles d’Evolution des Réseaux
I. Intro: Des Génomes aux Réseaux
II. Différents Types de Réseaux Biologiques
Biological Networks
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
I. Intro: Des Génomes aux Réseaux
II. Différents Types de Réseaux Biologiques
IV.Evolution des Réseaux: des Gènes aux Organismes
V. Modèles d’Evolution des Réseaux
III. Propriétés des Réseaux Biologiques
Topology of Biological Networks
Yeast Protein−Protein Interaction Network
Yeast Protein−Protein Interaction Network
Yeast Protein−Protein Interaction Network
Exponential versus Scale−free Network Topology
Modularity of Protein Interaction Networks
1 10 100
0,001
0,01
0,1
1
pk,in
pk,out
IN and OUT Degree distributions
B Feed Forward Loop
Bi−FanC
A Autoregulatory Loop
�� ����
���� ����
�� ��
��
����
"Over−represented" Motifs
Shen−Orr (2002) andRegulonDB (2005)
Structures of Regulatory Networks
Coherent and Incoherent Feedforward Loops
Speed up genetic response
Mangan et al, JMB 2006
BUT... such motifs are NOT conserved!
Propriétes Dynamiques des Modules de transcription
Low
High
Repressor
Repressor
Low
High
Activator
Activator
Demand Rule in Gene Regulation
Savageau PNAS (1977)
Demand Rule in Gene Regulation
Savageau PNAS (1977)
HighRepressor
ActivatorRepressor Low
Activator
Low
High
Very Preliminary Conclusion
Biological Networks are NOT Random Graphs !
but is it for FUNCTIONAL reason ??
or is it because they CANNOT be randomby CONSTRUCTION ??
but is it for FUNCTIONAL reason ??
or is it because they CANNOT be randomby CONSTRUCTION ??
Then what sense does it make to compare them to randomized graphs ??
Very Preliminary Conclusion
Biological Networks are NOT Random Graphs !
... except in the light of evolution""Nothing in biology makes sense...
but is it for FUNCTIONAL reason ??
or is it because they CANNOT be randomby CONSTRUCTION ??
Then what sense does it make to compare them to randomized graphs ??
Very Preliminary Conclusion
Theodosius Dobzhansky (1973)
Biological Networks are NOT Random Graphs !
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
I. Intro: Des Génomes aux Réseaux
III. Propriétés des Réseaux Biologiques
II. Différents Types de Réseaux Biologiques
V. Modèles d’Evolution des Réseaux
IV.Evolution des Réseaux: des Gènes aux Organismes
The Ribosome
The Ribosome
34 Euk − Arc only Homologous Ribosomal Proteins 33 Euk − Arc − Bac Universal Ribosomal Proteins
The Ribosome
Ribosomal Protein S8 in Archaea and Bacteria
Universal RibosomalProteins bwn Eukaryotes,
Archaea and Bacteria
Remaining HomologousProteins bwn Eukaryotesand Archaea
Homologies between RNA Polymerases
ArchaeaBacteria
Eukaryote (RNAPol I, II & III)
TATA Box Asymmetry and RNAPol II and III Direction
But... TATA Binding Protein (TBP)is essentially symmetric
TBP / BRF (Pol III) EukaryoteTBP / TFIIB (Pol II) Eukaryote
TBP / TFB Archaea
Transcription Direction given by TFIIB (TFB) and BRF
BidirectionalTranscription inGiardia lamblia
Teodorovic et al NAR 2007
BRF but no TFIIB ??
Independent Aquatic Placental Mammals
Hippopotame
Tapir / Rhinoceros
Seals
Mechanisms of Genome Evolution
Long suspected... recently proved!
shuffling of protein domains
Implies important genetic modifications!!
Mechanisms of Genome Evolution
shuffling of protein domains
Kellis et al. 2004Whole Genome Duplication in Yeast Genome
Kellis et al. 2004Whole Genome Duplication in Yeast Genome
−20−30
arch
eaba
cter
ia>1
0,00
0,00
0 ??
∗−2,500 / 4,000 ??
choa
nofla
gella
tes
bird
s 1
0,00
0
rept
iles
8,0
00di
nosa
urs
dino
saur
s,
m
amm
als
5,5
00x 2x 2
jelly
fish,
mol
lusc
a, s
pide
rs,
crab
s,in
sect
s, 9
00,0
00
flow
erin
g27
5,00
0
x 2
x 2
S. cerevisiaeK. lactis
−150
−250
−300
−350
−450
−500
viru
ses
eukaryotes
−1,000
Paramecium
x 2
othe
rs (
prot
ists
) 30
,000
MYA
x 2
x 2
A. thalianaP. trichocarpa
tuni
cate
s 5
00
bony fish 24,000
x 2
Ciona
x 2
Carp
X. laevis
H. sapiens
x 2
vertebrates 52,000
amph
ibia
6,
000
tetrapods 28,000
chordates
x 2
N. viridulusNadis B. napus
x 2
othe
r 5
5,00
0x 2
x 2
plan
ts
330
,000
x 2
animals 1,200,000
TetraodonX. tropicalisT. barreraefu
ngi
100
,000
x 2
Whole Genome Duplications promote:
Whole Genome Duplications in Evolution
−Speciation events−Shuffling of protein domains
−20−30
arch
eaba
cter
ia>1
0,00
0,00
0 ??
∗−2,500 / 4,000 ??
choa
nofla
gella
tes
bird
s 1
0,00
0
rept
iles
8,0
00di
nosa
urs
dino
saur
s,
m
amm
als
5,5
00x 2x 2
jelly
fish,
mol
lusc
a, s
pide
rs,
crab
s,in
sect
s, 9
00,0
00
flow
erin
g27
5,00
0
x 2
x 2
S. cerevisiaeK. lactis
−150
−250
−300
−350
−450
−500
viru
ses
eukaryotes
−1,000
Paramecium
x 2
othe
rs (
prot
ists
) 30
,000
MYA
x 2
x 2
A. thalianaP. trichocarpa
tuni
cate
s 5
00
bony fish 24,000
x 2
Ciona
x 2
Carp
X. laevis
H. sapiens
x 2
vertebrates 52,000
amph
ibia
6,
000
tetrapods 28,000
chordates
x 2
N. viridulusNadis B. napus
x 2
othe
r 5
5,00
0x 2
x 2
plan
ts
330
,000
x 2
animals 1,200,000
TetraodonX. tropicalisT. barreraefu
ngi
100
,000
x 2
Whole Genome Duplications promote:
Whole Genome Duplications in Evolution
−Speciation events−Shuffling of protein domains
Evolution by Duplication Divergence
Evolution by Duplication Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI networkYeast 6000 genes : 4000 in PPI network
PPI network
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI networkYeast 6000 genes : 4000 in PPI network
PPI network
NetworkDuplicationunder WGD
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
Divergence
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
"Rewiring"
1 10
1e-06
1e-04
0.01
1Node degree
Nod
e d
egre
e d
istri
butio
n
Number of shared partners
Nod
e p
air
dist
ribut
ion
with
sha
red
par
tner
s
WGD dupli > 20 x random pairsWGD dupli > 1,000 x random pairs
k=1k=10
Proba to share k+ partners :
WGD duplirandom
random pairs
degree distributions
WGD dupli pairs
Evidence of Genome Duplication on PPI network
500 duplicates from WGD : 250 with both duplicates in available PPI network
PPI network
Yeast 6000 genes : 4000 in PPI network
SharedPartners
Les Réseaux Biologiques et leur Evolution
Hervé ISAMBERT, Institut Curie
I. Intro: Des Génomes aux Réseaux
III. Propriétés des Réseaux Biologiques
II. Différents Types de Réseaux Biologiques
IV.Evolution des Réseaux: des Gènes aux Organismes
V. Modèles d’Evolution des Réseaux
time−lineargrowth
Evolution of PPI Networks
Evlampiev, Isambert q−bio.MN/06011070
growthexponentialEvlampiev, Isambert BMC Syst. Bio. (2007)
time−lineargrowth
Evolution of PPI Networks
Evlampiev, Isambert q−bio.MN/06011070
growthexponentialEvlampiev, Isambert BMC Syst. Bio. (2007)
single
Symmetric divergence
new
dupli
Duplication
Partial Genome
interaction
proteins
n=1
General Duplication−Divergence Model of PPI Network Evolution
4’
41 3
2
2’1−q q
1 2 3 4Genome
P−P Interaction Network
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev1
3
2’ 2
4’4
1
4
2
3
General Duplication−Divergence Model of PPI Network Evolution
γon
γsn
soγ
γss
γ
γoo
nn
n=1
Duplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
1 2 3 4
K. Evlampiev
Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Duplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Duplication
Partial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
1
4’4
3
2’ 21
4
2
3
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
with respect to p(k)
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
Overview of the Evolutionary Regimes
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Γik
iioγ inγΓi
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
o
ns
kq
i
1−q
i=s,o,n
= (1−q) + q( + )isγNode Degree Growth Rate
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
Exponential Dynamics of Genome and PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Γik
i
Growing NetworkVanishing Network
ioγ inγΓi
sΓ nΓoΓ
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
o
ns
kq
i
1−q
i=s,o,n
Γ <1Γ >1
= (1−q) + q( + )isγ
Γ = (1−q) + q + q1
Node Degree Growth Rate
Network Link Growth Rate
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
Exponential Dynamics of Genome and PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Γik
i
Growing NetworkVanishing Network
ioγ inγΓi
sΓ nΓoΓ
sΓ oΓ= (1−q) + qMNetwork Conservation Index
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
o
ns
kq
i
1−q
i=s,o,n
<1>1 Conserved Network
Nonconserved Network
MM
Γ <1Γ >1
= (1−q) + q( + )isγ
Γ = (1−q) + q + q
2
1
Node Degree Growth Rate
Network Link Growth Rate
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
Exponential Dynamics of Genome and PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Γik
i
Growing NetworkVanishing Network
ioγ inγΓi
sΓ nΓoΓ
sΓ oΓ= (1−q) + qMNetwork Conservation Index
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
o
ns
kq
i
1−q
i=s,o,n
<1>1 Conserved Network
Nonconserved Network
MM
Γ <1Γ >1
= (1−q) + q( + )isγ
Γ = (1−q) + q + q
2
1
Node Degree Growth Rate
Network Link Growth Rate
n=1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
Exponential Dynamics of Genome and PPI Network Evolution
1
4’4
3
2’ 21
4
2
3
Deletion
of Links
δ=1−γ
γon
γsn
soγ
γss
γ
γoo
nn
General Duplication−Divergence Model of PPI Network Evolution
Node Degree Growth Rate
Network Conservation Index
GDD Model + Fluctuations
Network Link Growth Rate
Γik
i
Growing NetworkVanishing Network
ioγ inγΓi
sΓ oΓ= (1−q) + qM
sΓ nΓoΓ
oΓ (n)(n)
Duplication
IterationPartial Genome
new
oldsingle
interaction
proteins
Asymmetric divergence
o
ns
kq
i
1−q
i=s,o,n
<1>1 Conserved Network
Nonconserved Network
MM
Γ <1Γ >1
= (1−q) + q( + )isγ
Γ = (1−q) + q + q
R
sΓΠ (n)(n)
n=1M = [(1−q ) + q ]
2
1
1 2 3 4Genome
P−P Interaction Network
4’
41 3
2
2’1−q q
x (1+q)
GDD Model of PPI Network Evolution
K. Evlampiev
Exponential Dynamics of Genome and PPI Network Evolution
[ ]
1
4’4
3
2’ 21
4
2
3
ΓM < 1 < Γ1 < M <
1011
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n oo
s
s
s
o
oo
oo
so
so
o
s s
n
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
67
89
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
1011
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n oo
s
s
s
o
oo
oo
so
so
o
s s
n
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
67
89
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
1011
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
n o
s
s
s
o
oo
oo
s
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
67
89
duplication
Vanishing
M < < 1
ΓM < 1 < Γ1 < M <
1011
1
Expanding PPI Networks
Non−Conserved Networks Conserved Networks M
Γ
Γ
1
Network Conservation (M) and Expansion ( )Γ
2
0
1
3
4
5
67
89
duplication
Vanishing
M < < 1
<N >/<N >=(n)(n+1) ∆(n) h( )=α∆ = +q Γ(1−q) +q Γα α α
Γs o n
Asymptotic "Characteristic Equation"Network Node Growth Rate
<1{α}0<
αh( )
p ~ 1/k α+1k
Γ
Γ
i
i
h( )=α +q Γ(1−q) +q Γα α α
Γs o n
α∗>1
<1{α}0<
Asymptotic Topology of PPI Networks
α10
h(1)
Dense
h(1)
Exponential Degree Distribution
h(0)
α∗>1
Scale−free Degree DistributionM’=max( ) > 1
kp ~ exp(− k)M’=max( ) < 1 µ
Convex Characteristic Function
<N >/<N >=(n)(n+1) ∆(n) h( )=α∆ = +q Γ(1−q) +q Γα α α
Γs o n
Asymptotic "Characteristic Equation"Network Node Growth Rate
<1{α}0<
αh( )
p ~ 1/k α+1k
Γ
Γ
i
i
h( )=α +q Γ(1−q) +q Γα α α
Γs o n
α∗>1
<1{α}0<
oΓiΓ (n)
Conserved Networks are also Scale−freenecessary (M’ > M > 1)
GDD Model + Fluctuations = max( )Πn=1
R R
sΓΠ (n)(n)
n=1
(n)(n)> [(1−q ) + q ] =M’ M
Scale−free
Exponential
[
Asymptotic Topology of PPI Networks
α10
h(1)
Dense
h(1)
Exponential Degree Distribution
h(0)
α∗>1
Scale−free Degree DistributionM’=max( ) > 1
kp ~ exp(− k)M’=max( ) < 1 µ
Convex Characteristic Function
= max( )iΓi=s,o,n
M’Asymptotic Network TopologyM’ >1M’ <1
3
]
γ ss
soγ
γsn
γoo γon
γnn
of Linkswith proba
Deletion
1−γ1−µ
proteins
k k+1
Duplication−Divergence Models Including Self−links
interaction
µnµs
µonµo
Iteration
Duplication
Partial Genome
Self−links leads to time−linear (not exponential) connectivity expansionk k Γ
Self−links do not affect evolutionary regimes butare essential contributors to certain network motifs
x (1+q)
1 2’ 21 2
3 3
4
4’
41
While Proteins are typically ConservedNetwork Motifs are NOT Conserved under
(if M>1)
Duplication−Divergence Evolution
Conservation of Network Motifs
M1
γγ
γ
γγ
γ
γγγ 23
Motif conservation index
s/o
s/o
s/o
This is in broad agreement with empirical observations !
Kirill Evlampiev
0.01 0.1 10
0.4
0.8
1.2
0.01 0.1 10
0.4
0.8
1.2
0.01 0.1 10
0.4
0.8
1.2
L/<L> or N/<N> L/<L> or N/<N> L/<L> or N/<N>
20% Growth Rate 50% Growth Rate ~80% Growth Rate
Linear regime N ~ L NL regime N < L
Distribution of Network Sizes
1 1.5 2 2.5 3-1
-0.5
0
0.5
1
networks
Conservedscale−free
interaction
protein
interactionsproteins
oΓ
oΓPhase Diagram of Asymptotic Networks
networks
networks
Conservedscale−free
exponential Dense networksNon−conserved
Network growth rate
Div
erge
nce
asy
mm
etry
Stationary Non−stationary
proteinIteration
DuplicationGenome
InteractionDeletion
newγ
γold
γcross
Protein Network Evolution under Duplication
+
234
o−
Γ
Divergenceasymmetry
Networkeffective growth
Γn
Γ n
Γn
Asymmetric Divergence of Duplicates is RequiredWhole Genome Duplication Limit (q=1)
PPI Network Evolution in terms of protein domains
x 2x 4
1
4
2
1
3’3
4’4
22’
1 1.5 2 2.5 3-1
-0.5
0
0.5
1
networks
Conservedscale−free
interaction
protein
interactionsproteins
oΓ
oΓPhase Diagram of Asymptotic Networks
networks
networks
Conservedscale−free
exponential Dense networksNon−conserved
Network growth rate
Div
erge
nce
asy
mm
etry
Stationary Non−stationary
proteinIteration
DuplicationGenome
InteractionDeletion
newγ
γold
γcross
Protein Network Evolution under Duplication
+
234
o−
ΓOrigin of Divergence Asymmetry ?
"Spontaneous Symmetry Breaking" between duplicated binding sites
Divergenceasymmetry
Networkeffective growth
Γn
Γ n
Γn
Asymmetric Divergence of Duplicates is RequiredWhole Genome Duplication Limit (q=1)
PPI Network Evolution in terms of protein domains
New
Old
Protein
x 2x 4
Protein binding partner 1
Protein binding partner 2
Protein binding partner 3
1
4
2
1
3’3
4’4
22’
1 1.5 2 2.5 3-1
-0.5
0
0.5
1
networks
Conservedscale−free
interaction
protein
interactionsproteins
oΓ
oΓPhase Diagram of Asymptotic Networks
networks
networks
Conservedscale−free
exponential Dense networksNon−conserved
Network growth rate
Div
erge
nce
asy
mm
etry
Stationary Non−stationary
proteinIteration
DuplicationGenome
InteractionDeletion
newγ
γold
γcross
Protein Network Evolution under Duplication
+
234
o−
ΓOrigin of Divergence Asymmetry ?
"Spontaneous Symmetry Breaking" between duplicated binding sites
Divergenceasymmetry
Networkeffective growth
Γn
Γ n
Γn
Asymmetric Divergence of Duplicates is RequiredWhole Genome Duplication Limit (q=1)
PPI Network Evolution in terms of protein domains
New
Old
Gene Duplicates
x 2x 4
Protein binding partner 1
Protein binding partner 2
Protein binding partner 3
1
4
2
1
3’3
4’4
22’
1 1.5 2 2.5 3-1
-0.5
0
0.5
1
networks
Conservedscale−free
interaction
protein
interactionsproteins
oΓ
oΓPhase Diagram of Asymptotic Networks
networks
networks
Conservedscale−free
exponential Dense networksNon−conserved
Network growth rate
Div
erge
nce
asy
mm
etry
Stationary Non−stationary
proteinIteration
DuplicationGenome
InteractionDeletion
newγ
γold
γcross
Protein Network Evolution under Duplication
+
234
o−
ΓOrigin of Divergence Asymmetry ?
"Spontaneous Symmetry Breaking" between duplicated binding sites
Divergenceasymmetry
Networkeffective growth
Γn
Γ n
Γn
Old
New
Asymmetric Divergence of Duplicates is RequiredWhole Genome Duplication Limit (q=1)
PPI Network Evolution in terms of protein domains
x 2x 4
Protein binding partner 1
Protein binding partner 2
Protein binding partner 3
1
4
2
1
3’3
4’4
22’
1 1.5 2 2.5 3-1
-0.5
0
0.5
1
networks
Conservedscale−free
interaction
protein
interactionsproteins
oΓ
oΓPhase Diagram of Asymptotic Networks
networks
networks
Conservedscale−free
exponential Dense networksNon−conserved
Network growth rate
Div
erge
nce
asy
mm
etry
Stationary Non−stationary
proteinIteration
DuplicationGenome
InteractionDeletion
newγ
γold
γcross
Protein Network Evolution under Duplication
+
234
o−
ΓOrigin of Divergence Asymmetry ?
"Spontaneous Symmetry Breaking" between duplicated binding sites
PPI Network Evolution in terms of protein domains
Divergenceasymmetry
Networkeffective growth
Γn
Γ n
Γn
Old
New
Asymmetric Divergence of Duplicates is RequiredWhole Genome Duplication Limit (q=1)
x 2x 4
Protein binding partner 1
Protein binding partner 2
Protein binding partner 3
1
4
2
1
3’3
4’4
22’
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
domain1
domain2
Redefining PPI Networks as Domain Interaction Networks
XOR
XOR
protein1=domain1
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
protein2=
domain2A + domain2B
Redefining PPI Networks as Domain Interaction Networks
XOR
XOR
protein1=domain1
Genome and PPI Network Evolutionunder Protein Domain Shuffling
Domain−Domain Interaction Network
Indirect interaction within complexes
Direct interaction bwn protein domains
Multidomain proteins (random shuffling)
Adding some "combinatorial logic" through multidomain proteins
Redefining PPI Networks as Domain Interaction Networks
protein2=
AND
domain2A + domain2BXOR
XOR
1 10 1001e-6
1e-4
0.01
1
kpkk 2 k.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
γon γoo=0.1 ( =1 γnn )=0γon
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours k.g
kDistribution
λ=1.5 prot. bind. domains per protein
Comparison with Yeast Direct Interaction Data
Topology of direct interaction network
A two−parameter Whole Genome Duplication Modelwith Random Protein Domain Shuffling λ
1 10 1001e-6
1e-4
0.01
1
kpkk 2 k.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
γon γoo=0.1 ( =1 γnn )=0γon
Γ = Γ + Γ = 1 + 2 = 20%γono n
PPI Network growth rate
Seasquirt−Human (2WGD): 25%Seasquirt−Fugu (3WGD): 11%
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours k.g
kDistribution
λ=1.5 prot. bind. domains per protein
Comparison with Yeast Direct Interaction Data
Topology of direct interaction network
A two−parameter Whole Genome Duplication Modelwith Random Protein Domain Shuffling λ
1 10 1001e-6
1e-4
0.01
1
1 10 1001e-6
1e-4
0.01
1
kpkk 2 k.g
kp
kk 2 k.g
kp
(+/− 2σ)WGD Model
(+/− 2σ)
Yeast direct int. data
Yeast direct int. dataWGD Model
(+/− 2σ)WGD Model
γon γoo=0.1 ( =1 γnn )=0γonwith Random Protein Domain Shuffling λ
A two−parameter Whole Genome Duplication Model
Topology of direct interaction network Topology of direct & indirect interaction network
Comparison with Yeast Direct + Indirect Interaction Data
Node Degree
Protein direct connectivity
Average Connectivityof Neighbours
Direct & indirect "matrix" connectivity
k.gk
Distribution
(+/− 2σ)WGD Model
Yeast dir. & indir. int. data
Yeast dir. & indir. int. data
λ=1.5 prot. bind. domains per protein
1 10 1001e-6
1e-4
0.01
1
1 10 1001e-3
0.01
0.1
1
1 10 1001e-3
0.01
0.1
1
kp
γnn
k.gk
Node Degree Distribution Average Connectivity of Neighbours
Fit of Yeast Data with a one−parameter WGD Model γ(onγ oo=1 )=0=0.26
kp
Protein direct connectivity
kk 2 k.g
Protein direct connectivity
(+/− 2σ)
phys. interaction dataYeast two−hybrid and
WGD Model
Comparison with Yeast Direct Interaction Data
Yeast two−hybrid and
WGD Model (+/− 2σ)
phys. interaction data
Hierarchy and Feedback in the Evolution
of Bacterial Transcription Networks
Cosentino−Lagomarsino, Jona, Bassetti & Isambert,PNAS, 104, 5516 (2007) arxiv.org/abs/q.bio/0701035
Transcription in Bacteria
Regulatory Interactions
?
Operons : 648TFs : 147ARs : 85 (58% TFs)Reg Int : 1170
Operons : 423TFs : 117ARs : 59 (50% TFs)Reg Int : 578
Shen−Orr Dataset (2002)
RegulonDB 5.5 (2006)
0 10 20 30 40 50 6000.010.020.030.040.050.06 Randomized
E.colifnr
arcA thrS_infC_rpmI_rplT_pheMST_ihfA
gadAX
gadW
dusB_fis
hns marRAB
idnDOTR
gntRKU
rob
Hierarchy with Mostly Self−Regulatory Feedback
No Feedback Core
Feedback Core SizesSmall Feedback Core
0 5 10 15 20 25 300
0.05
0.1
0.15
0.2
E. coli
RandomizedE.coli
0 5 10 15 20 25 300
0.05
0.1
0.15
0.2
E. coli
RandomizedE.coli
0 10 20 30 400
0.05
0.1
0.15
0.2RandomizedE.coli
0 5 10 15 20 250
0.05
0.1
0.15
0.2 RandomizedE.coli
RegulonDB 5.5 (2006)
0 10 20 30 400
0.05
0.1
0.15 RandomizedSubtilis
0 10 20 30 400
0.05
0.1
RandomizedSubtilis
Escherichia coliShen Orr et al (2002)
Num
ber
of L
ayer
sou
tsid
e c
ore
Long
est S
horte
st P
aths
Hierarchy with few Transcriptional Layers
Bacillus subtilis
Transcription Factor Families
Babu et al., NAR 2006
Babu et al., NAR 2006
TF Family Expansion by Duplication
A’a’pAap A’ap
Aa’p
Evolutionary Decoupling of Duplicated ARs
AR Duplication
Crosstalks+
Crosstalks+
A’a’pAap A’ap
Aa’p
Evolutionary Decoupling of Duplicated ARs
AR Duplication
Crosstalks+
rpoE_rseABC rpoH
crp cytR dnaA
lysR
lysA
tdcABCDEFG
crp fnr himA tdcAR
rpoE_rseABC (rpsU_dnaG_)rpoD
lexA_dinF
cspA
caiF flhDC fliAZY nhaA osmC rcsAB
stpA
lrp
hns
exuR
exuT uxaCA
uxuABR
uidRABC
A’a’pAap A’ap
Aa’p
A few partial decoupling
Evolutionary Decoupling of Duplicated ARs
AR Duplication
Crosstalks+
crp fnrC
GAT
C
GAT
GCTA
GCAT
CGAT
CGAT
CAGT
CTAG
GCAT
GCAT
CGAT
G
C
AT
C
G
TA
GCTA
GCAT
GATC
GTCA
TGCA
CGTA
CGAT
GCAT
-1 -0.5 0 0.5 1specificity
00.5
11.5
22.5
33.5
norm
aliz
ed h
its
autoregulatory sites
CRP
-1 -0.5 0 0.5 1specificity
0
1
2
3
4
norm
aliz
ed h
its
autoregulatory sites
FNR
CGTA
CGTA
CGAT
CGAT
CAGT
CAGT
ACGT
CTAG
GCTA
C
GAT
C
T
A
T
CTGA
A
CGAT
AGCT
GTAC
GTCA
GTAC
GCTA
GCAT
GCAT
rpoE_rseABC rpoH
crp cytR dnaA
lysR
lysA
tdcABCDEFG
crp fnr himA tdcAR
rpoE_rseABC (rpsU_dnaG_)rpoD
lexA_dinF
cspA
caiF flhDC fliAZY nhaA osmC rcsAB
stpA
lrp
hns
exuR
exuT uxaCA
uxuABR
uidRABC
A’a’pAap A’ap
Aa’p
Most fully decoupledA few partial decoupling
Evolutionary Decoupling of Duplicated ARs
crp / fnr example
AR Duplication
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino, Jona, Bassetti & Isambert, PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
> 50% of TF ~ 10% of TF
Long
est S
horte
st P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino, Jona, Bassetti & Isambert, PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Long
est S
horte
st P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino, Jona, Bassetti & Isambert, PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Long
est S
horte
st P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino, Jona, Bassetti & Isambert, PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Adaptative Selection ? Random drift ?
Long
est S
horte
st P
aths
Bacteria versus Eukaryote Transcriptional Hierarchy
Cosentino−Lagomarsino, Jona, Bassetti & Isambert, PNAS 2007 Sellerio et al., submitted 2008
S. cerevisiae has long transcriptional cascades
while bacteria have short ones
S. cerevisiaeEscherichia coli Bacillus subtilis
Adaptative Selection ? Random drift ?
PopulationDynamics
Long
est S
horte
st P
aths