Genome organisation

Genome organization in Eukaryotes. Arun Viswanathan, IInd Sem, M.Sc. BMB. © Arun Viswanathan

Transcript of Genome organisation

Page 1: Genome organisation

Genome organization in Eukaryotes

Arun Viswanathan
IInd Sem, M.Sc. BMB

© Arun Viswanathan

Page 2: Genome organisation

40 km of wire in a tennis ball!
• Each cell has approximately 2 meters of DNA.
• The nucleus is only about 6 µm in diameter.
• In eukaryotes, DNA occurs in a highly condensed form during cell division, as chromosomes.
• 3.2 × 10^9 nucleotides are packed into 24 different chromosomes.
• DNA is highly negatively charged. How can it wind upon itself without repulsion?
• How are genomic processes like replication and transcription possible in such tightly wound structures?
• The estimated disentanglement time for the transition from interphase to metaphase chromosomes of size 100 Mb is on the order of 500 years.
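The "2 meters of DNA" figure can be checked with a short back-of-the-envelope calculation. The sketch below assumes B-form DNA with a rise of 0.34 nm per base pair (a standard textbook value, not stated on the slide) and a diploid cell; the genome size and nucleus diameter are taken from the bullets above.

```python
# Back-of-the-envelope estimate of DNA length per human cell.
BP_PER_HAPLOID_GENOME = 3.2e9   # from the slide
RISE_PER_BP_NM = 0.34           # assumption: B-form DNA, 0.34 nm per base pair
NUCLEUS_DIAMETER_UM = 6.0       # from the slide

# Contour length of one haploid genome, in meters.
haploid_length_m = BP_PER_HAPLOID_GENOME * RISE_PER_BP_NM * 1e-9

# A diploid cell carries two copies of the genome.
diploid_length_m = 2 * haploid_length_m

# How many nucleus diameters long the DNA is if stretched out.
ratio = diploid_length_m / (NUCLEUS_DIAMETER_UM * 1e-6)

print(f"haploid genome: {haploid_length_m:.2f} m")
print(f"diploid cell:   {diploid_length_m:.2f} m")
print(f"DNA length / nucleus diameter: {ratio:.0f}")
```

The diploid value comes out slightly above 2 m, which matches the rounded figure on the slide.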


Page 3: Genome organisation

Genome organization

Page 4: Genome organisation

Chemical composition of chromatin

• DNA (20-40%): the most important chemical constituent of chromatin.
• RNA (5-10%): associated with chromatin as rRNA, mRNA and tRNA.
• Proteins (55-60%):
• Histones: very basic proteins; they constitute about 60% of total protein, in an almost 1:1 ratio with DNA. Five types: H1, H2A, H2B, H3 and H4.
• Non-histones: about 20% of total chromatin protein. They include:
• Nucleosome assembly proteins (NAP), other histone chaperones and chromatin remodeling complexes.
• Structural and contractile proteins (actin, α- and β-tubulin, myosin), which function during chromosome condensation and in the movement of chromosomes.
• All enzymes and cofactors involved in replication, transcription and its regulation.


Page 5: Genome organisation

Beads on a String
• The beads-on-a-string structure is the primary level of DNA packaging.
• It is often called the 11 nm fibre.
• The diameter of the "beads" is 11 nm.
• The beads are nucleosomes: DNA wrapped around proteins called histones.

© Lehninger Principles of Biochemistry, David L. Nelson, Michael M. Cox, Fifth Edition, W.H. Freeman and Company, New York

Page 6: Genome organisation

Beads on a String
• Histones are highly basic (positively charged).
• They are rich in the basic amino acids arginine and lysine.
• Five major classes: H1, H2A, H2B, H3, H4.
• The amino acid sequences of H3 and H4 are highly conserved.
• Histones and DNA, assembled with the help of NAP, form a condensed structure called the nucleosome. It is the fundamental structural unit of chromatin.
• The highly basic nature of histones, aside from facilitating DNA-histone interactions, contributes to their water solubility.
• H1 is present in half the amount of the other four histones.


Content of basic amino acids

Histone   Molecular weight   Number of AA residues   Lys %   Arg %   Total %
H1        21,130             223                     29.5    11.3    40.8
H2A       13,960             129                     10.9    19.3    30.2
H2B       13,774             125                     16.0    16.4    32.4
H3        15,273             135                     19.6    13.3    32.9
H4        11,236             102                     10.8    13.7    24.5

Page 7: Genome organisation

Histones

A 147 bp segment of DNA wraps around the histone octamer 1.65 times. Nucleosome core particles are separated from each other by linker DNA, which ranges from a few nucleotides up to about 80. The term nucleosome refers to a nucleosome core particle plus one adjacent linker DNA. On average, nucleosomes repeat at intervals of about 200 nucleotides.

A diploid human cell contains about 30 million nucleosomes!
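The figure of about 30 million nucleosomes per diploid cell follows from the numbers already given. The sketch below is a rough estimate, assuming a haploid genome of 3.2 × 10^9 bp (from the earlier slide) and one nucleosome per 200 bp repeat; the variable names are illustrative only.

```python
# Rough count of nucleosomes in a diploid human cell.
HAPLOID_GENOME_BP = 3.2e9       # from the earlier slide
NUCLEOSOME_REPEAT_BP = 200      # average repeat length, from this slide
CORE_DNA_BP = 147               # DNA wrapped around one histone octamer
TURNS_AROUND_OCTAMER = 1.65     # from this slide

diploid_bp = 2 * HAPLOID_GENOME_BP
nucleosomes_per_cell = diploid_bp / NUCLEOSOME_REPEAT_BP

# Average linker length implied by the repeat length.
average_linker_bp = NUCLEOSOME_REPEAT_BP - CORE_DNA_BP

# Base pairs per superhelical turn around the octamer.
bp_per_turn = CORE_DNA_BP / TURNS_AROUND_OCTAMER

print(f"nucleosomes per diploid cell: {nucleosomes_per_cell:.2e}")
print(f"average linker length: {average_linker_bp} bp")
print(f"DNA per turn around the octamer: {bp_per_turn:.0f} bp")
```

The implied average linker of 53 bp sits inside the "a few nucleotides up to about 80" range quoted above.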

Page 8: Genome organisation

Nucleosomal Assembly

Histones are predominantly basic proteins but also contain hydrophobic and acidic patches. They repel each other at physiological pH and form non-nucleosomal aggregates with DNA. Histone chaperones prevent these nonspecific interactions and can direct the productive assembly and disassembly of nucleosomes by facilitating histone deposition and exchange.

Page 9: Genome organisation

Histone-DNA interactions

1. Electrostatic interactions: helix dipoles from α-helices in H2B, H3 and H4 cause a net positive charge to accumulate at the point of interaction with negatively charged phosphate groups on DNA.
2. Hydrogen bonds: between the DNA backbone and the amide groups on the main chain of histone proteins.
3. Non-polar interactions: between the histones and sugars on DNA.
4. Salt bridges and hydrogen bonds: between side chains of basic amino acids (especially Lys and Arg) and phosphate oxygens on DNA.
5. Non-specific minor groove insertions: of the H3 and H2B N-terminal tails into two minor grooves each on the DNA molecule.


Page 10: Genome organisation

Histone tail, Histone code & Epigenetics

• The histone core octamer has eight N-terminal tail domains.
• These tail domains are heavily modified.
• These modifications include: acetylation, methylation, ubiquitylation, phosphorylation, sumoylation, ribosylation and citrullination.


The idea that multiple dynamic modifications regulate gene transcription in a systematic and reproducible way is called the histone code. Histone modification states can be heritable, but the mechanisms of heritability are not well understood. They are proposed to work in a manner similar to DNA methylation: a previously modified histone may have an inherent tendency to direct the same modification again. This is one of the ways in which epigenetics works.

Page 11: Genome organisation

The 30 nm fiber
• With the help of H1, the 11 nm fiber compacts to form the more compact 30 nm fiber. H1 is primarily in contact with 15-20 bp of linker DNA and helps in contracting the linker DNA. H1 is therefore often called the "linker histone".
• Different models exist to explain the structure of the 30 nm fiber. The solenoid model and the zigzag model are the two main ones.
• However, recent studies indicate that intermediate 30 nm fibers contain both solenoid and zigzag conformations, and suggest that the regular structures observed in in vitro experiments might be isolation artifacts caused by the strictly cationic low-salt environment or by chemical cross-linking (e.g., glutaraldehyde fixation).


Page 12: Genome organisation

The 30 nm fiber
• In the one-start solenoid model, bent linker DNA sequentially connects the nucleosome cores, creating a structure in which nucleosomes follow each other along the same helical path. The nucleosomes follow a consecutive numbering pattern (viz. 1, 2, 3…).
• It is uncertain whether H1 promotes a solenoid fiber.


Page 13: Genome organisation

The 30nm fiber

In the two-start zigzag model, straight linker DNA connects two opposing nucleosome cores, creating opposing rows of nucleosomes that form a so-called "two-start" helix. In the zigzag model, alternate nucleosomes become interacting partners (viz. 1, 3, 2, 4…).
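The difference between the two numbering patterns can be made concrete with a small sketch. It is only a bookkeeping illustration, assuming nucleosomes are numbered 1..n along the DNA; the function names are hypothetical and no geometry is modeled.

```python
def solenoid_neighbors(n):
    """One-start model: nucleosome i sits next to i+1 on a single helical path."""
    return [(i, i + 1) for i in range(1, n)]


def zigzag_stacks(n):
    """Two-start model: odd and even nucleosomes form two opposing rows."""
    odd_row = list(range(1, n + 1, 2))
    even_row = list(range(2, n + 1, 2))
    return odd_row, even_row


def zigzag_neighbors(n):
    """Two-start model: nucleosome i interacts with i+2 (alternate nucleosomes)."""
    return [(i, i + 2) for i in range(1, n - 1)]


print(solenoid_neighbors(4))   # consecutive partners
print(zigzag_stacks(6))        # the two rows of the two-start helix
print(zigzag_neighbors(4))     # alternate partners
```

In the one-start case every interacting pair is consecutive along the DNA, whereas in the two-start case each nucleosome pairs with the one two positions away, with the linker crossing between the rows.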


Page 14: Genome organisation

‘One-start’ Helix (Solenoid)

Page 15: Genome organisation

‘Two-start’ Helix (Zigzag)

Page 16: Genome organisation

Intermediate 30 nm fibers

Four proposed structures of the 30 nm chromatin filament for DNA repeat lengths per nucleosome ranging from 177 to 207 bp. Linker DNA is shown in yellow and nucleosomal DNA in pink.

Page 17: Genome organisation

Higher chromatin organizations (Metaphase Chromosome)
• Very little is known about the higher chromosomal levels of genome organization.
• However, for histone genes it has been shown that the 30 nm fiber supercoils into six loops attached to a protein structure called the nuclear scaffold (NS).
• Even though the actual composition of the NS is not known, Topo II has been shown to be a major component and is needed for the attachment of the supercoiled 30 nm fiber to the NS.
• Several cancer chemotherapeutic drugs are Topo II inhibitors, which allow strand breakage through this mechanism.
• Further levels of hierarchy have also been proposed.


Page 18: Genome organisation

Higher chromatin organizations (Metaphase Chromosome)

Page 19: Genome organisation

Higher chromatin organizations (Metaphase Chromosome)

Page 20: Genome organisation

Higher chromatin organizations (Interphase Chromosome)
• Determining how the interphase chromosome is packed was a great challenge for biologists, since the available imaging techniques failed to produce an image of chromosomes in the interphase nucleus that explained their organization.
• Two main models:
• Chromosome territory model, proposed by Carl Rabl in 1885. According to this model, the DNA of each chromosome occupies a defined volume of the nucleus and only overlaps with its immediate neighbors.
• "Spaghetti" model: the DNA fiber of multiple chromosomes meanders through the nucleus in a largely random fashion, and the chromosomes are therefore intermingled and entangled with each other.


Page 21: Genome organisation

Higher chromatin organizations (Interphase Chromosome)
• The key experiment to distinguish between the two models was carried out in the early 1980s by Thomas Cremer, a German cell biologist, and his physicist brother, Christoph Cremer.
• The Cremer brothers found experimental evidence that strongly supported the chromosome territory model.


Page 22: Genome organisation


Chromosome Territory (CT)
• During interphase, each chromosome occupies a spatially limited, roughly elliptical domain known as a chromosome territory (CT).
• Each CT comprises higher-order chromatin units of ~1 Mb each, built up from smaller loop domains.
• CTs are known to be arranged radially within the nucleus.
• This arrangement is both cell-type and tissue-type specific and is also evolutionarily conserved.
• The radial organization of CTs was shown to correlate with their gene density and size. Gene-rich chromosomes occupy interior positions, whereas larger, gene-poor chromosomes tend to be located around the periphery.
• CTs are also dynamic structures, with genes able to relocate from the periphery towards the interior once they have been "switched on".
• CTs may exist either as discrete units without intermingling or may overlap with each other.

Page 23: Genome organisation

Chromosome Territory (CT): Recurrent Clusters

A) Chromosome territories (green) in liver cell nuclei (blue). B) Visualization of multiple chromosomes reveals spatial patterns of organization. Chromosomes 12 (red), 14 (blue), and 15 (green) form a triplet cluster in mouse lymphocytes.
Part A: © 2004 Parada, L. A. et al. Tissue-specific spatial organization of genomes. Genome Biology 5:R44, doi:10.1186/gb-2004-5-7-r44.
Part B: © 2002 Cell Press/Elsevier Inc. Parada, L. A. et al. Conservation of relative chromosome positioning in normal and cancer cells. Current Biology 12, 1692–1697 (2002).

Page 24: Genome organisation

Chromosome Territory (CT)
• Large regions of chromosomal identity between different species have been maintained throughout evolution. These regions of identity maintain their positions in different species (Tanabe et al., 2002).
• CTs can reposition in disease, which might provide novel insights into disease mechanisms and into why genes are incorrectly expressed in disease.
• Scientists have manipulated the localization of chromosomes and seen some changes in gene expression as a result, suggesting a possible mechanism for the connection between CTs and disease (Finlan et al., 2008).
• No proteins have been identified that either anchor chromosomes in the nucleus or link multiple chromosomes to each other to establish chromosome clusters.


Page 25: Genome organisation

Chromosome Territory (CT): Movement of CT

[Figure: position of a gene within the nucleus when the gene is OFF versus ON]

Page 26: Genome organisation

Chromosome Territory (CT): FISH of human interphase nucleus

[Figure: scale bar 10 µm]

Page 27: Genome organisation

Other domains in the nucleus
• Transcription factories
– Transcription is spatially organized into discernible nuclear structures in which multiple RNA polymerases and active genes dynamically localize; these nuclear bodies are termed transcription factories.


Page 28: Genome organisation

Molecular Models of looping

• Random loop model
o With loops at all scales > 150 bp
• Multi-loop model
o Explains the 120 kbp rosette structure
• Random walk/giant loop (RW-GL) model
o The basic feature of the RW-GL model is the existence of 1-3 Mbp loops along a randomly oriented backbone


Looping brings regulatory elements into spatial proximity, which explains how they can act over distances of tens of kbp. This has been demonstrated in the β-globin genes.

Page 29: Genome organisation

Sequential organization

Page 30: Genome organisation

Sequential organization

Page 31: Genome organisation

Tandem repeats

• Tandem repeats occur in DNA when a pattern of two or more nucleotides is repeated and the repetitions are adjacent to each other.
• On density gradient centrifugation they form a band of different density from bulk DNA, hence the name "satellite".

Microsatellite DNA
• Unit - 2-4 bp (most 2).
• Repeat - on the order of 10-100 times.
• Location - generally euchromatic.
• Examples - most useful marker for population-level studies.

Minisatellite DNA
• Unit - 15-400 bp (average about 20).
• Repeat - generally 20-50 times (1000-5000 bp long).
• Location - generally euchromatic.
• Examples - DNA fingerprints. Tandemly repeated but often in dispersed clusters. Also called VNTRs (variable number tandem repeats).
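The microsatellite definition above (a 2-4 bp unit repeated roughly 10 or more times, with the copies adjacent) maps naturally onto a regular expression with a backreference. The following is a minimal sketch using Python's standard `re` module; the function name and default thresholds are illustrative assumptions, and real repeat finders handle imperfect repeats that this does not.

```python
import re


def find_microsatellites(seq, min_unit=2, max_unit=4, min_repeats=10):
    """Return (start, unit, copies) for perfect tandem repeats in seq.

    The unit is matched lazily, so the shortest repeating unit is reported
    (e.g. 'CA' rather than 'CACA').
    """
    # ([ACGT]{2,4}?) captures the unit; \1{9,} requires at least 9 further
    # adjacent copies, i.e. min_repeats copies in total.
    pattern = re.compile(
        r"([ACGT]{%d,%d}?)\1{%d,}" % (min_unit, max_unit, min_repeats - 1)
    )
    hits = []
    for match in pattern.finditer(seq.upper()):
        unit = match.group(1)
        copies = len(match.group(0)) // len(unit)
        hits.append((match.start(), unit, copies))
    return hits


# Hypothetical test sequence: a (CA)12 repeat embedded in flanking DNA.
example = "GGTT" + "CA" * 12 + "TTGACC"
print(find_microsatellites(example))
```

Because only exact copies are matched, a single point mutation inside a repeat splits it into two shorter runs, which may then fall below `min_repeats`.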


Page 32: Genome organisation

Interspersed Repetitive DNA
• Interspersed repetitive DNA accounts for 25–40% of mammalian DNA.
• The repeats are scattered randomly throughout the genome.
• The units are 100–1000 base pairs long.
• Copies are similar but not identical to each other.
• Interspersed repetitive elements are not stably integrated in the genome; they move from place to place.
• They can sometimes disrupt functional genes.

These are:
• Retrotransposons (class I transposable elements; "copy and paste"): they copy themselves to RNA and then back to DNA (using reverse transcriptase) to integrate into the genome.
• Transposons (class II TEs; "cut and paste"): they use transposases, which make a staggered (sticky) cut.
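The practical difference between "copy and paste" and "cut and paste" is what happens to the copy number. The toy sketch below treats a genome as a plain list of labels; it is only a bookkeeping illustration under that assumption, and it does not model the RNA intermediate, reverse transcriptase or transposase. All names are hypothetical.

```python
def copy_and_paste(genome, element, target):
    """Class I style: the original stays in place, a new copy is inserted."""
    new_genome = list(genome)
    new_genome.insert(target, element)
    return new_genome


def cut_and_paste(genome, element, target):
    """Class II style: the element is excised, then reinserted.

    The target index refers to the genome after excision.
    """
    new_genome = list(genome)
    new_genome.remove(element)          # excision of the first occurrence
    new_genome.insert(target, element)  # reintegration at the new site
    return new_genome


genome = ["geneA", "TE", "geneB", "geneC"]
after_copy = copy_and_paste(genome, "TE", 3)
after_cut = cut_and_paste(genome, "TE", 3)

print(after_copy, after_copy.count("TE"))  # copy number goes up
print(after_cut, after_cut.count("TE"))    # copy number is unchanged
```

This is one way to see why retrotransposons can come to make up a large fraction of a genome while cut-and-paste elements do not multiply by the move itself.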

Page 33: Genome organisation

Interspersed Repetitive DNA
• Retrotransposons are:
• Long terminal repeat (LTR) elements: any transposon flanked by long terminal repeats (also called retrovirus-like elements). None are active in humans; some are mobile in mice.
• Long interspersed nuclear elements (LINEs), which encode RT, and short interspersed nuclear elements (SINEs), which use the RT from LINEs.
• Example: Alu, a non-autonomous element about 350 base pairs long, recognized by the restriction enzyme AluI.


Page 34: Genome organisation


Linking sequential organization and genome organization

Gene-rich regions have been visualized with a fluorescent probe that hybridizes to the Alu interspersed repeat, which is present in more than a million copies in the human genome. For unknown reasons, these sequences cluster in chromosomal regions rich in genes (GREEN). In this picture, regions depleted of these sequences are RED, while average regions are YELLOW. The gene-rich regions are seen to be depleted in the DNA near the nuclear envelope.

A. Bolzer et al., PLoS Biol. 3:826-842, 2005. [Figure: scale bar 5 µm]

Page 35: Genome organisation

Thank you