Bio 108 Lec 4

download Bio 108 Lec 4

of 30

Transcript of Bio 108 Lec 4

  • 7/31/2019 Bio 108 Lec 4

    1/30

    The Complexity of EukaryoticGenomes

    -genomes of most eukaryotes are larger

    and more complex than those of prokaryotes

    - However, the genome size of manyeukaryotes does not appear to be

    related to genetic complexity.

    - Much of the complexity of eukaryotic

    genomes thus results from the abundance of

    several different types of noncoding

    sequences, which constitute most of the DNA

    of higher eukaryotic cells.

    Figure 4.1. Genome size The range of sizes of

    the genomes of representative groups of

    organisms are shown on a logarithmic scale.

  • 7/31/2019 Bio 108 Lec 4

    2/30

    Introns and Exons

    -some of the noncoding DNA in eukaryotes is accounted for by long DNA sequences

    that lie between genes (spacer sequences)

    -large amounts of noncoding DNA are also found within most eukaryotic genes. Such

    genes have a split structure in which segments of coding sequence (called exons)

    are separated by noncoding sequences (intervening sequences, or introns)

    Figure 4.2. The structure of eukaryotic genesThe introns are then removed by splicing to form the mature mRNA.

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886
  • 7/31/2019 Bio 108 Lec 4

    3/30

    -On average, introns are estimated to account for about ten times more DNA than exons in

    the genes of higher eukaryotes. (eg.human gene that encodes the blood clotting protein

    factor VIII. This gene spans approximately 186 kb of DNA and is divided into 26 exons. The

    mRNA is only about 9 kb long, so the gene contains introns totaling more than 175 kb).

    -introns are clearly not required for gene function in eukaryotic cells.

    -most introns have no known cellular function, although a few have been found to encode

    functional RNAs or proteins.

    -generally thought to represent remnants of sequences that were important earlier inevolution.

    -introns may have helped accelerate evolution by facilitating recombination between

    protein-coding regions (exons) of different genesa process known as exon shuffling.

    (recombination between introns of different genes would result in new

    genes containing novel combinations of protein-coding sequences)

  • 7/31/2019 Bio 108 Lec 4

    4/30

    Table 9-1. Classification of Eukaryotic DNA

    Protein-coding genes

    Solitary genesDuplicated and diverged genes (functional gene

    families and nonfunctional pseudogenes)

    Tandemly repeated genes encoding rRNA, 5S rRNA,

    tRNA, and histones

    Repetitious DNASimple-sequence DNA

    Moderately repeated DNA (mobile DNA elements)

    Transposons

    Viral retrotransposons

    Long interspersed elements (LINES; nonviral

    retrotransposons)

    Short interspersed elements (SINES; nonviral

    retrotransposons)

    Unclassified spacer DNA

  • 7/31/2019 Bio 108 Lec 4

    5/30

    Gene Families and Pseudogenes

    -another factor contributing to the large size of eukaryotic genomes is

    that some genes are repeated many times.

    -many eukaryotic genes are present in multiple copies, called genefamilies while most prokaryotic genes are represented only once in the genome

    Figure 4.5. Globin gene families

    Members of the human - and -

    globin gene families are clustered on

    chromosomes 16 and 11, respectively.

    Each family contains genes that are

    specifically expressed in embryonic,

    fetal, and adult tissues, in addition tononfunctional gene copies

    (pseudogenes).

  • 7/31/2019 Bio 108 Lec 4

    6/30

    Repetitive DNA Sequences

    -3 classes based on the number of times nucleotide sequence is

    repeated within the population of DNA fragments

    a. Highly Repeated DNA Sequences

    - sequences present in at least 105 copies per genome (1-10% of total

    DNA)

    - typically short and present in clusters in which the given sequence

    repeats itself over and over again w/out interruption

    - the sequence is said to be present in tandem (sequence arranged inend-to-end manner)

    1. Satellite DNAs

    - short sequences about 5 to a few hundred base pairs in

    length(form very large clusters each containing up to several million

    base pairs of DNA)- base composition of the said DNA segments is different

    from the bulk of the DNA that fragments contg. the sequence that

    can be separated into a distinct satellite band during density

    gradient centrifugation

    - localization of satellite DNA within centromeres of

    chromosomes

  • 7/31/2019 Bio 108 Lec 4

    7/30

    -precise function of satellite DNA remains a mystery

    2. Minisatellite DNAs

    -sequences range from about 12 to 100 base pairs in length and are

    found in clusters contg as many as 3000 repeats.

    -occupy considerably shorter stretches of the genome than the

    satellite sequences

    -tend to be unstable, and the number of copies of a particular

    sequence often increase or decrease from one generation to the next (the

    length of a particular minisatellite locus is highly variable in the population

    even among the same members of the same family

    -used to identify individuals in criminal orpaternity cases thru the technique of DNA

    fingerprinting

    Use of PCR in forensic science.

    Hypervariable microsatellite sequences (such as

    variable number of tandem

    repeats, or VNTR) are identified by PCR.

  • 7/31/2019 Bio 108 Lec 4

    8/30

    3. Microsatellite DNAs

    -shortest sequences (1 to 5 base pairs long) and are typically

    present in small clusters of about 10 to 40 base pairs in length

    -scattered quite evenly thru the DNA more than 100,000different loci are present in the human genome

    b. Moderately Repeated DNA Sequence

    - moderately repeated fraction of the genomes of plants and animals

    - vary from about 20 to more than 80% of the total DNA, depending onthe organism

    - sequences that are repeated within the genome anywhere from a few

    times to tens of thousands of times.

    - sequences that code for known gene products, either RNAs or proteins,

    and those that lack a coding function

    1. Repeated DNA Sequences with Coding Functions

    -includes genes that code for ribosomal RNA as well as those

    that code for histones (impt. chromosomal proteins)

    -the repeated sequences that code for each of these products

    are typically identical to one another and located in tandem array

  • 7/31/2019 Bio 108 Lec 4

    9/30

    2. Repeated DNAs that Lack Coding Functions

    -the bulk of the moderately repeated DNA fraction does not

    encode any type of product

    -repetitive DNA sequences are scattered throughout thegenome rather than being clustered as tandem repeats (i.e. interspersed

    thruout the genome as individual elements).

    -can be grouped into two classes:

    a. SINEs (short intesrpersed elements)

    -major SINEs in mammalian genomes are

    Alusequences, so-called because they usually contain a single

    site for the restriction endonucleaseAluI.

    -Alu sequences are approximately 300 base pairs long,

    and about a million such sequences are dispersed throughout

    the genome, accounting for nearly 10% of the total cellular

    DNA.-AlthoughAlu sequences are transcribed into RNA,

    they do not encode proteins and their function is unknown.

    b. LINEs (long interspersed elements)

    -major human LINEs (which belong to the LINE 1, or L1,

    family) are about 6000 base pairs long and repeatapproximately 50,000 times in the genome.

  • 7/31/2019 Bio 108 Lec 4

    10/30

    -L1 sequences are transcribed and at least some encode

    proteins, but likeAlu sequences, they have no known function in cell

    physiology.

    -bothAlu and L1 sequences are examples oftransposable

    elements, which are capable of moving to different sites in genomicDNA.

    - some of these sequences may help regulate gene

    expression, but mostAlu and L1 sequences appear not to make a

    useful contribution to the cell.

    -they may have played important evolutionary roles bycontributing to the generation of genetic diversity.

    3. Nonrepeated DNA Sequences

    - are present in a single copy in the genome which includes genes that exhibit

    Mendelian patterns of inheritance

    -always localize to a particular site on a particular chromosome-contains by far the greatest amount of genetic information

    -include DNA sequences that code for virtually all proteins other thanhistones

    -even if these sequenes are not present in multiple copies, genes that code for

    polypeptides are usually members of a family of related genes(this is true for globins,

    actins, myosins, collagens, tubulins, integrins and most other proteins in a eukaryoticcell.

  • 7/31/2019 Bio 108 Lec 4

    11/30

    The Number of Genes in Eukaryotic Cells

    Organism Genome size (Mb)a Protein-coding

    sequence (percent)b

    Number ofgenesb

    E. coli 4.6 90 4,288

    S. cerevisiae 12 70 5,885

    C. elegans 97 25 19,099

    Drosophila 180 13 13,600

    Human 3,000 3 *100,000

    a Mb = millions of base pairs.b Determined from theE. coli, S. cerevisiae, C. elegans, andDrosophila genomic sequences

    and estimated for the human genome.

    Table 4.1. The Numbers of Genes in Cellular Genomes

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886
  • 7/31/2019 Bio 108 Lec 4

    12/30

    *Nature431, 931-945 (21 October 2004) | doi:10.1038/nature03001Finishing theeuchromatic sequence of the human genome

    Latest information:

    -2.85 billion nucleotides interrupted by only 341 gaps.

    -encode only 20,00025,000 protein-coding genes.

  • 7/31/2019 Bio 108 Lec 4

    13/30

    Chromosomes and Chromatin

    Organism Genome size (Mb)a Chromosome numbera

    Yeast (Saccharomyces cerevisiae) 12 16

    Slime mold (Dictyostelium) 70 7

    Arabidopsis thaliana 130 5

    Corn 5,000 10

    Onion 15,000 8

    Lily 50,000 12

    Nematode (Caenorhabditis elegans) 97 6

    Fruit fly (Drosophila) 180 4

    Toad (Xenopus laevis) 3,000 18

    Lungfish 50,000 17

    Chicken 1,200 39

    Mouse 3,000 20

    Cow 3,000 30

    Dog 3,000 39

    Human 3,000 23

    a Both genome size and chromosome number are for haploid cells. Mb = millions of base pairs.

    Table 4.2. Chromosome Numbers of Eukaryotic Cells

  • 7/31/2019 Bio 108 Lec 4

    14/30

    -genomes of prokaryotes are contained in single chromosomes, which are

    usually circular DNA molecules.

    -the genomes of eukaryotes are composed of multiple chromosomes, each

    containing a linear molecule of DNA.

    -DNA of eukaryotic cells is tightly bound to small basic proteins (histones) that

    package the DNA in an orderly way in the cell nucleus(total extended length of

    DNA in a human cell is nearly 2 m, but this DNA must fit into a nucleus with a

    diameter of only 5 to 10 m).

    Chromatin

    -complexes between eukaryotic DNA and proteins

    -major proteins of chromatin are the histones

    small proteins containing a highproportion of basic amino acids (arginine and lysine) that facilitate

    binding to the negatively charged DNA molecule.

    - five major types of histonescalled

    H1, H2A, H2B, H3, and H4which are very similar among different species of

    eukaryotes

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886
  • 7/31/2019 Bio 108 Lec 4

    15/30

    Histonea Molecular weight Number ofamino

    acids

    Percentage Lysine +

    Arginine

    H1 22,500 244 30.8H2A 13,960 129 20.2

    H2B 13,774 125 22.4

    H3 15,273 135 22.9

    H4 11,236 102 24.5

    a Data are for rabbit (H1) and bovine histones.

    Table 4.3. The Major Histone Proteins

    -histones are extremely abundant proteins in eukaryotic cells; together, their

    mass is approximately equal to that of the cell's DNA.

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886
  • 7/31/2019 Bio 108 Lec 4

    16/30

    -not found in eubacteria (e.g., E. coli), but Archaebacteria, do contain histones that

    package their DNAs in structures similar to eukaryotic chromatin.

    -the basic structural unit of chromatin, the nucleosome, consisting of DNA (146-bp

    segment of DNA )wrapped around a histone core.

    Figure 4.8. The organization of chromatin in nucleosomes

    (A) The DNA is wrapped around histones in nucleosome core

    particles and sealed by histone H1. Nonhistone proteins

    bind to the linker DNA between nucleosome core particles.

    The linker DNA between the nucleosome core particles is

    preferentially sensitive, so limited digestion of chromatin

    yields fragments corresponding to multiples of 200 basepairs.

    Chromosomal DNA and Its Packaging in

    the Chromatin Fiber

    S l id

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.glossary.2886
  • 7/31/2019 Bio 108 Lec 4

    17/30

    Figure 4.10. Chromatin fibers The packaging of

    DNA into nucleosomes yields a chromatin fiber

    approximately 10 nm in diameter. The chromatin is

    further condensed by coiling into a 30-nm fiber,

    containing about six nucleosomes per turn.

    Solenoid structure

    Figure 4.9. Structure of a

    chromatosome (A) The nucleosome

    core particle consists of 146 base

    pairs of DNA wrapped 1.65 turns

    around a histone octamer consisting

    of two molecules each of H2A, H2B,

    H3, and H4. A chromatosome

    contains two full turns of DNA (166

    base pairs) locked in place by onemolecule of H1.

  • 7/31/2019 Bio 108 Lec 4

    18/30

    Figure 4.11. Model for the packing of chromatin and the

    chromosome scaffold in metaphase chromosomes. In

    interphase chromosomes, long stretches of 30-nm chromatin

    loop out from extended scaffolds. In metaphase

    chromosomes, the scaffold is folded into a helix and further

    packed into a highly compacted structure, whose precise

    geometry has not been determined.

    http://upload.wikimedia.org/wikipedia/commons/4/4b/Chromatin_Structures.png
  • 7/31/2019 Bio 108 Lec 4

    19/30

    The chromatin in chromosomal regions that are not being transcribed exists

    predominantly in the condensed, 30-nm fiber form. The regions of chromatin actively

    being transcribed are thought to assume the extended beads-on-a-string form

    Euchromatin -Decondensed, transcriptionally active interphase chromatin.Most of the euchromatin in interphase nuclei appears to be in the form of 30-nm fibers,

    organized into large loops containing approximately 50 to 100 kb of DNA. About 10% of

    the euchromatin, containing the genes that are actively transcribed, is in a more

    decondensed state (the 10-nm conformation) that allows transcription

    Heterochromatin-Condensed, transcriptionally inactive chromatin. Heterochromatinis transcriptionally inactive and contains highly repeated DNA sequences, such as

    those present at centromeres and telomeres.

    As cells enter mitosis, their chromosomes become highly condensed so that they can

    be distributed to daughter cells. The loops of 30-nm chromatin fibers are thought to

    fold upon themselves further to form the compact metaphase chromosomes of mitotic

    cells, in which the DNA has been condensed nearly 10,000-fold (Figure 4.11). Such

    condensed chromatin can no longer be used as a template for RNA synthesis, so

    transcription ceases during mitosis. Electron micrographs indicate that the DNA in

    metaphase chromosomes is organized into large loops attached to a protein scaffold

    Metaphase chromosomes are so highly condensed that their morphology can bestudied using the light microscope

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.figgrp.626http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=cooper.figgrp.626
  • 7/31/2019 Bio 108 Lec 4

    20/30

    However, euchromatin is interrupted by stretches ofheterochromatin, in which 30-nm fibers

    are subjected to additional levels of packing that usually render it resistant to gene

    expression. Heterochromatin is commonly found around centromeres and near telomeres,

    but it is also present at other positions on chromosomes.

    Figure 4-21. A comparison of

    extended interphase chromatin with

    the chromatin in a mitotic

    chromosome. (A) An electron

    micrograph showing an enormoustangle of chromatin spilling out of a

    lysed interphase nucleus. (B) A

    scanning electron micrograph of a

    mitotic chromosome: a condensed

    duplicated chromosome in which the

    two new chromosomes are stilllinked together . The constricted

    region indicates the position of the

    centromere. Note the difference in

    scales.

    E h DNA M l l Th t F Li Ch M t C t i C t

  • 7/31/2019 Bio 108 Lec 4

    21/30

    Each DNA Molecule That Forms a Linear Chromosome Must Contain a Centromere,

    Two Telomeres, and Replication Origins Figure 4-22. The three DNAsequences required to produce a

    eucaryotic chromosome that can be

    replicated and then segregated at

    mitosis. Each chromosome hasmultiple origins of replication, one

    centromere, and two telomeres.

    Shown here is the sequence of events

    a typical chromosome follows during

    the cell cycle. The DNA replicates in

    interphase beginning at the origins of

    replication and proceeding

    bidirectionally from the origins across

    the chromosome. In M phase, the

    centromere attaches the duplicated

    chromosomes to the mitotic spindle

    so that one copy is distributed to each

    daughter cell during mitosis. The

    centromere also helps to hold the

    duplicated chromosomes together

    until they are ready to be moved

    apart. The telomeres form special

    caps at each chromosome end.

  • 7/31/2019 Bio 108 Lec 4

    22/30

    Figure 4.17. Assay of a centromere

    in yeast Both plasmids shown

    contain a selectable marker (LEU2)

    and DNA sequences that serve as

    origins of replication in yeast (ARS,

    which stands for autonomously

    replicating sequence). However,

    plasmid I lacks a centromere and is

    therefore frequently lost as a result

    of missegregation during mitosis. In

    contrast, the presence of a

    centromere (CEN) in plasmid II

    ensures its regular transmission todaughter cells.

  • 7/31/2019 Bio 108 Lec 4

    23/30

    replication origin- Location on a DNA molecule at which duplication of the DNA begins.

    Centromere- Constricted region of a mitotic chromosome that holds sister chromatids

    together. It is also the site on the DNA where the kinetochore forms that captures

    microtubules from the mitotic spindle.

    Telomere-End of a chromosome, associated with a characteristic DNA sequence that isreplicated in a special way. Counteracts the tendency of the chromosome otherwise to

    shorten with each round of replication. (From Greek telos, end.)

    -perform another function: the repeated telomere DNA sequences, together

    with the regions adjoining them, form structures that protect the end of the chromosome

    from being recognized by the cell as a broken DNA molecule in need of repair.

    Figure 4.19. Structure of a telomere

    Telomere DNA loops back on itself to

    form a circular structure that protects

    the ends of chromosomes

  • 7/31/2019 Bio 108 Lec 4

    24/30

    Figure 4-11. The banding patterns of

    human chromosomes. Chromosomes 122

    are numbered in approximate order of size.

    A typical human somatic (non-germ line)

    cell contains two of each of thesechromosomes, plus two sex

    chromosomestwo X chromosomes in a

    female, one X and one Y chromosome in a

    male. The chromosomes used to make

    these maps were stained at an early stage

    in mitosis, when the chromosomes areincompletely compacted. The horizontal

    green line represents the position of the

    centromere, which appears as a

    constriction on mitotic chromosomes; the

    knobs on chromosomes 13, 14, 15, 21, and

    22 indicate the positions of genes that code

    for the large ribosomal RNAs (discussed in

    Chapter 6). These patterns are obtained by

    staining chromosomes with Giemsa stain,

    and they can be observed under the light

    microscope.

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.chapter.972http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.chapter.972
  • 7/31/2019 Bio 108 Lec 4

    25/30

    Karyotype-Full set of chromosomes of a cell arranged with respect to size, shape, and number.

    Figure 1-30. The genome ofE. coli.(A) A

    cluster ofE. colicells. (B) A diagram of the E.

    coligenome of 4,639,221 nucleotide pairs (for

    E. colistrain K-12). The diagram is circular

    because the DNA ofE. coli, like that of otherprocaryotes, forms a single, closed loop.

    Protein-coding genes are shown as yellowor

    orange bars, depending on the DNA strand

    from which they are transcribed; genes

    encoding only RNA molecules are indicated by

    green arrows. Some genes are transcribedfrom one strand of the DNA double helix (in a

    clockwise direction in this diagram), others

    from the other strand (counterclockwise).

    The Sequences of Complete Genomes

    Figure 4-13 The genome of S

  • 7/31/2019 Bio 108 Lec 4

    26/30

    Figure 4-13. The genome of S.

    cerevisiae (budding yeast). (A) The

    genome is distributed over 16

    chromosomes, and its complete

    nucleotide sequence was determined

    by a cooperative effort involvingscientists working in many different

    locations, as indicated (gray, Canada;

    orange, European Union; yellow,

    United Kingdom; blue, Japan; light

    green, St Louis, Missouri; dark green,

    Stanford, California). The constrictionpresent on each chromosome

    represents the position of its

    centromere . (B) A small region of

    chromosome 11, highlighted in redin

    part A, is magnified to show the high

    density of genes characteristic of thisspecies. As indicated by orange, some

    genes are transcribed from the lower

    strand ,while others are transcribed

    from the upper strand. There are

    about 6000 genes in the complete

    genome, which is 12,147,813nucleotide airs lon .

  • 7/31/2019 Bio 108 Lec 4

    27/30

    Figure 4-16. Scale of the human genome. Ifeach nucleotide pair is drawn as 1 mm as in

    (A), then the human genome would extend

    3200 km (approximately 2000 miles), far

    enough to stretch across the center of Africa,

    the site of our human origins (red line in B). At

    this scale, there would be, on average, aprotein-coding gene every 300 m. An average

    gene would extend for 30 m, but the coding

    sequences in this gene would add up to only

    just over a meter.

    The Human Genome

  • 7/31/2019 Bio 108 Lec 4

    28/30

    Figure 4-15. The organization of genes on a human

    chromosome. (A) Chromosome 22, one of the smallest

    human chromosomes, contains 48 106 nucleotide pairs

    and makes up approximately 1.5% of the entire human

    genome. Most of the left arm of chromosome 22 consists

    of short repeated sequences of DNA that are packaged ina particularly compact form of chromatin

    (heterochromatin), which is discussed later in this chapter.

    (B) A tenfold expansion of a portion of chromosome 22,

    with about 40 genes indicated. Those in dark brown are

    known genes and those in light brown are predicted

    genes. (C) An expanded portion of (B) shows the entire

    length of several genes. (D) The intron-exon arrangement

    of a typical gene is shown after a further tenfoldexpansion. Each exon (red) codes for a portion of the

    protein, while the DNA sequence of the introns (gray) is

    relatively unimportant. The entire human genome (3.2

    109 nucleotide pairs) is distributed over 22 autosomes and

    2 sex chromosomes (see Figures 4-10 and 4-11). The term

    human genome sequence refers to the complete

    nucleotide sequence of DNA in these 24 chromosomes.Being diploid, a human somatic cell therefore contains

    roughly twice this amount of DNA. Humans differ from

    one another by an average of one nucleotide in every

    thousand, and a wide variety of humans contributed DNA

    for the genome sequencing project. The published human

    genome sequence is therefore a composite of many

    individual sequences.

    Table 4-1. Vital Statistics of Human Chromosome 22 and the Entire Human Genome

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.610http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.611http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.611http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.611http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.611http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.610http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.610http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.figgrp.610
  • 7/31/2019 Bio 108 Lec 4

    29/30

    CHROMOSOME 22 HUMAN GENOME

    DNA length 48 106 nucleotide pairs* 3.2 109

    Number of genes approximately 700 approximately 30,000Smallest protein-coding gene 1000 nucleotide pairs not analyzed

    Largest gene 583,000 nucleotide pairs 2.4 106 nucleotide pairs

    Mean gene size 19,000 nucleotide pairs 27,000 nucleotide pairs

    Smallest number of exons per gene 1 1

    Largest number of exons per gene 54 178

    Mean number of exons per gene 5.4 8.8

    Smallest exon size 8 nucleotide pairs not analyzed

    Largest exon size 7600 nucleotide pairs 17,106 nucleotide pairs

    Mean exon size 266 nucleotide pairs 145 nucleotide pairs

    Number of pseudogenes** more than 134 not analyzed

    Percentage of DNA sequence in exons

    (protein coding sequences)

    3% 1.5%

    Percentage of DNA in high-copy

    repetitive elements

    42% approximately 50%

    Percentage of total human genome 1.5% 100%

    * The nucleotide sequence of 33.8 106 nucleotides is known; the rest of the chromosome consists primarily of very short

    repeated sequences that do not code for proteins or RNA.**

    A pseudogene is a nucleotide sequence of DNA closely resembling that of a functional gene, but containing numerousdeletion mutations that prevent its proper expression. Most pseudogenes arise from the duplication of a functional gene

  • 7/31/2019 Bio 108 Lec 4

    30/30

    Figure 4-17. Representation of the nucleotide sequence content of the human genome.

    LINES, SINES, retroviral-like elements, and DNA-only transposons are all mobile genetic

    elements that have multiplied in our genome by replicating themselves and inserting the

    new copies in different positions. Mobile genetic elements are discussed in Chapter 5. Simple

    sequence repeats are short nucleotide sequences (less than 14 nucleotide pairs) that are

    repeated again and again for long stretches. Segmental duplications are large blocks of the

    genome (1000200,000 nucleotide pairs) that are present at two or more locations in the

    genome. Over half of the unique sequence consists of genes and the remainder is probably

    regulatory DNA. Most of the DNA present in heterochromatin, a specialized type of

    chromatin (discussed later in this chapter) that contains relatively few genes has not yetb d

    http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.chapter.747http://www.ncbi.nlm.nih.gov/books/bv.fcgi?rid=mboc4.chapter.747