Genomes of Many Organisms Contain Nonfunctional DNA

Comparisons of the total chromosomal DNA per cell in various species first suggested that much of the DNA in certain organisms does not encode functional RNA or have any apparent regulatory function. For example, yeasts, fruit flies, chickens, and humans have successively more DNA in their haploid chromosome sets (12.5, 180, 1300, and 3300 Mb, respectively), in keeping with what we perceive to be the increasing complexity of these organisms. Yet the vertebrates with the greatest amount of DNA per cell are amphibians, which are surely less complex than humans in their structure and behavior. Even more surprising, the unicellular protozoan Amoeba dubia has 200 times more DNA per cell than humans. Many plant species also have considerably more DNA per cell than humans have; tulips, for example, have 10 times as much DNA per cell as humans. The DNA content per cell also varies considerably between closely related species. All insects or all amphibians would appear to be similarly complex, but the amount of haploid DNA in species within each of these phylogenetic classes varies by a factor of 100.
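The comparison above can be made concrete with a little arithmetic. The following is a minimal sketch that uses only the figures quoted in this paragraph; the sizes for tulip and Amoeba dubia are derived from the stated multiples of the human value, on the assumption that "DNA per cell" scales like the haploid genome size, so they are rough estimates rather than measured values. The variable names are illustrative.

```python
# Haploid genome sizes in megabases (Mb), as quoted in the paragraph above.
haploid_mb = {
    "yeast": 12.5,
    "fruit fly": 180,
    "chicken": 1300,
    "human": 3300,
}

# Multiples of the human value stated in the text (not measured sizes).
fold_vs_human = {"tulip": 10, "Amoeba dubia": 200}

human_mb = haploid_mb["human"]

# Assumption: derive approximate sizes for organisms given only as multiples of human.
for name, fold in fold_vs_human.items():
    haploid_mb[name] = fold * human_mb

# Express every genome as a fraction or multiple of the human genome.
relative_to_human = {name: mb / human_mb for name, mb in haploid_mb.items()}

# Print from smallest to largest genome.
for name, mb in sorted(haploid_mb.items(), key=lambda item: item[1]):
    print(f"{name:>13}: {mb:>9,.1f} Mb  ({relative_to_human[name]:.4g}x human)")
```

Sorted this way, the list places a unicellular protozoan and a flowering plant well above humans, which is the mismatch between DNA content and apparent complexity that the paragraph describes.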

Sequencing and identification of exons in chromosomal DNA have provided direct evidence that the genomes of higher eukaryotes contain large amounts of noncoding DNA. For instance, only a small portion of the human β-globin gene cluster, which is about 80 kb long, encodes protein (see Figure 8-4a). In contrast, a typical 80-kb stretch of DNA from the yeast S. cerevisiae, a single-celled eukaryote, contains many closely spaced protein-coding sequences without introns and proportionally far less noncoding DNA (see Figure 8-4b). Moreover, the introns in globin genes are considerably shorter than those in most human genes. Globin proteins make up about 50 percent of the total protein in developing red blood cells (erythroid progenitors), and the globin genes are expressed at maximum rates (i.e., a new RNA polymerase initiates transcription as soon as the previous polymerase has transcribed far enough from the promoter to allow it to do so). Consequently, there has been selective pressure on globin genes for small introns that are compatible with the required high rate of globin mRNA transcription and processing. The vast majority of human genes, however, are expressed at much lower levels, requiring production of one encoded mRNA only once every tens of minutes to hours. As a result, there has been little selective pressure to reduce the sizes of introns in most human genes.

The density of genes varies among regions of human chromosomal DNA, from “gene-rich” regions, where a few hundred base pairs separate transcription units, to large gene-poor “gene deserts,” where intergenic regions are a few million base pairs long. Of the 96 percent of human genomic DNA that has been sequenced, only about 2.9 percent corresponds to exons, and only about 1.2 percent encodes proteins. (The fraction of the genome that corresponds to exons is much larger than the fraction that encodes proteins because many protein-coding genes include exons for long 3′ untranslated regions and because there are many exons in nonprotein-coding lncRNAs; see Chapter 9.) We learned in the previous section that the intron sequences of most human genes are significantly longer than the exon sequences. Approximately 55 percent of human genomic DNA is thought to be transcribed into pre-mRNAs, pre-lncRNAs, or other nonprotein-coding RNAs in one cell or another, but some 95 percent of this sequence is intronic and is thus removed by RNA splicing. The remaining 45 percent of human DNA constitutes noncoding DNA between genes as well as the regions of repeated DNA sequences that make up the centromeres and telomeres of the human chromosomes. Consequently, about 97 percent of human DNA does not encode proteins, functional noncoding RNAs, or potentially functional lncRNAs.
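The percentages in this paragraph can be checked against one another with a short calculation. This is a rough sketch that uses only the rounded figures quoted above and treats them as fractions of the whole genome, which is a simplifying assumption because the text reports them for the sequenced 96 percent. The variable names are illustrative.

```python
# Rounded fractions quoted in the paragraph above.
transcribed = 0.55              # transcribed in one cell type or another
intronic_share = 0.95           # share of transcribed sequence that is intronic
exon_fraction_reported = 0.029  # genome fraction corresponding to exons
protein_coding_reported = 0.012 # genome fraction that encodes protein

# Genome fraction that is transcribed but then removed by splicing.
intronic = transcribed * intronic_share

# Exonic fraction implied by the two transcription figures.
exonic_estimate = transcribed * (1 - intronic_share)

# Remainder: intergenic DNA plus centromeric and telomeric repeats.
untranscribed = 1 - transcribed

# Fraction outside exons, which the text rounds to about 97 percent.
non_exonic = 1 - exon_fraction_reported

# Exonic sequence that does not encode protein (UTRs, lncRNA exons).
noncoding_exonic = exon_fraction_reported - protein_coding_reported

print(f"intronic:          {intronic:.2%}")
print(f"exonic (implied):  {exonic_estimate:.2%}")
print(f"untranscribed:     {untranscribed:.2%}")
print(f"non-exonic:        {non_exonic:.2%}")
print(f"noncoding exonic:  {noncoding_exonic:.2%}")
```

The implied exonic fraction (55 percent times 5 percent, or 2.75 percent) is close to the 2.9 percent quoted for exons. The small gap is expected because "some 95 percent" and "approximately 55 percent" are themselves rounded.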

Different selective pressures may account, at least in part, for the remarkable difference in the amount of nonfunctional DNA in different organisms. For example, many microorganisms must compete with other species of microorganisms in the same environment for limited amounts of available nutrients, and metabolic economy is thus a critical characteristic for these organisms. Because synthesis of nonfunctional (i.e., noncoding) DNA requires time, nutrients, and energy, presumably there was selective pressure to lose nonfunctional DNA during the evolution of rapidly growing microorganisms such as the yeast S. cerevisiae. On the other hand, natural selection in vertebrates depends largely on their behavior. The energy invested in DNA synthesis is trivial compared with the metabolic energy required for the movement of muscles and the function of the nervous system; thus there may have been little selective pressure on vertebrates to eliminate nonfunctional DNA. Furthermore, the replication time of cells in most vertebrates and plants is much longer than in rapidly growing microorganisms, so there may have been little selective pressure to eliminate nonfunctional DNA in order to permit rapid cellular replication.