Comparison of genomic DNA with messenger RNA reveals the intron–exon structure of genes.

In annotating an entire genome, researchers typically draw on information beyond the genome sequence itself. This information may include the sequences of messenger RNA (mRNA) molecules isolated from various tissues or from various stages of the organism's development. Recall from Chapter 3 that mRNA molecules undergo processing and are therefore usually simpler than the DNA sequences from which they are transcribed: introns are removed and exons are spliced together. The resulting mature mRNA contains an open reading frame (ORF), a long sequence of codons uninterrupted by a stop codon. The ORF is the region of the mRNA that is actually translated into protein.
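To make the idea of an ORF concrete, the following minimal sketch scans an mRNA sequence in all three reading frames and reports the longest stretch that begins with the start codon AUG and runs, uninterrupted, to the first stop codon. The function name `longest_orf` and the toy sequence are invented for illustration, and real annotation software applies additional criteria, so treat this as one simple way to express the definition rather than a standard method.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}
START_CODON = "AUG"


def longest_orf(mrna):
    """Return (start, end, sequence) of the longest AUG-to-stop ORF.

    Positions are 0-based and `end` is exclusive. If no complete ORF is
    found, the function returns (0, 0, "").
    """
    # Accept DNA-style input as well by converting T to U.
    mrna = mrna.upper().replace("T", "U")
    best = (0, 0, "")
    for frame in range(3):
        start = None  # position of the AUG that opened the current ORF
        for i in range(frame, len(mrna) - 2, 3):
            codon = mrna[i:i + 3]
            if start is None and codon == START_CODON:
                start = i
            elif start is not None and codon in STOP_CODONS:
                end = i + 3  # include the stop codon itself
                if end - start > len(best[2]):
                    best = (start, end, mrna[start:end])
                start = None  # look for another ORF further along
    return best


# Toy mRNA (invented): AUG GCC AAA UGA sits in the third reading frame.
print(longest_orf("GGAUGGCCAAAUGAUU"))  # (2, 14, 'AUGGCCAAAUGA')
```

The sketch counts the stop codon as part of the reported sequence; the translated region proper ends just before it.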

One aspect of genome annotation is the determination of which portions of the genome sequence correspond to sequences in mRNA transcripts. An example is shown in Fig. 13.5, which compares the DNA and mRNA for the beta (β) chain of hemoglobin, the oxygen-carrying protein in red blood cells. Note that the genomic DNA contains some sequences that are present in the mRNA, which correspond to exons, and some sequences that are absent from the mRNA, which correspond to introns. Comparison of mRNA with genomic DNA therefore reveals the intron–exon structure of protein-coding genes. In fact, some of the first introns found in cellular genes were discovered by comparing β-globin mRNA with genomic DNA.
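The comparison of mRNA with genomic DNA can be sketched in a few lines of code. The example below assumes that the mRNA matches the genomic DNA exactly, apart from U in place of T, and it greedily takes the longest stretch of the remaining mRNA that can be found downstream of the previous exon. The names `map_exons` and `introns_between` and the toy sequences are invented for illustration; they are not the real β-globin sequences. A greedy match can misplace an exon boundary by a base or two when the first bases of an intron happen to match the first bases of the next exon, so real alignment programs also make use of the sequences at splice sites (most introns begin with GT and end with AG).

```python
def map_exons(genomic, mrna, min_len=4):
    """Return exon intervals (start, end) in `genomic`, 0-based, end exclusive.

    Greedy sketch: repeatedly find the longest exact match between the
    still-unmatched part of the mRNA and the genomic DNA downstream of
    the previous exon.
    """
    cdna = mrna.upper().replace("U", "T")  # compare in DNA letters
    genomic = genomic.upper()
    exons = []
    g_pos = 0  # next exon must start at or after this genomic position
    m_pos = 0  # first mRNA position not yet assigned to an exon
    while m_pos < len(cdna):
        best_start, best_len = -1, 0
        for start in range(g_pos, len(genomic)):
            length = 0
            while (start + length < len(genomic)
                   and m_pos + length < len(cdna)
                   and genomic[start + length] == cdna[m_pos + length]):
                length += 1
            if length > best_len:
                best_start, best_len = start, length
        if best_len < min_len:
            raise ValueError(
                f"no match of at least {min_len} bases at mRNA position {m_pos}")
        exons.append((best_start, best_start + best_len))
        g_pos = best_start + best_len
        m_pos += best_len
    return exons


def introns_between(exons):
    """Introns are the genomic gaps between consecutive exons."""
    return [(prev_end, next_start)
            for (_, prev_end), (next_start, _) in zip(exons, exons[1:])]


# Toy gene (invented): three exons separated by two introns.
GENOMIC = ("CC" + "ATGGTGCAC" + "GTAAGTTTTCAG" + "CTGACTCCT"
           + "GTGAGTCCCTAG" + "CCAGAGTAA" + "GG")
MRNA = "AUGGUGCAC" + "CUGACUCCU" + "CCAGAGUAA"

exons = map_exons(GENOMIC, MRNA)
print(exons)                   # [(2, 11), (23, 32), (44, 53)]
print(introns_between(exons))  # [(11, 23), (32, 44)]
```

The genomic stretches that the mRNA skips over, here positions 11 to 23 and 32 to 44, are the introns; the matched stretches are the exons.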

FIG. 13.5 Identification of exons and introns by comparison of genomic DNA with mRNA sequence.