Chapter Introduction

RNA: Transcription and Processing

CHAPTER 8

LEARNING OUTCOMES

After completing this chapter, you will be able to

  • Describe how the structure of RNA differs from that of DNA.

  • Differentiate among the different classes of RNA in a cell.

  • Explain the function of promoters and the features necessary to start transcription.

  • Diagram the steps in RNA processing from its transcription to its transport out of the nucleus.

  • Appraise why the discovery of self-splicing introns is considered to be so important.

  • Describe the different types of noncoding RNAs (ncRNAs).

RNA polymerase in action. A very small RNA polymerase (blue), made by the bacteriophage T7, transcribes DNA into a strand of RNA (red). The enzyme separates the DNA double helix (yellow, orange), exposing the template strand to be copied into RNA.
[David S. Goodsell, Scripps Research Institute.]

OUTLINE

8.1 RNA

8.2 Transcription

8.3 Transcription in eukaryotes

8.4 Intron removal and exon splicing

8.5 Small functional RNAs that regulate and protect the eukaryotic genome

Using their newly acquired knowledge of the DNA sequences of entire genomes, scientists have been able to determine the approximate number of genes in several organisms, both simple and complex. At first there were no surprises: the bacterium Escherichia coli has about 4400 genes, the unicellular eukaryote yeast Saccharomyces cerevisiae has about 6300 genes, and the multicellular fruit fly Drosophila melanogaster has about 13,600 genes. Scientists assumed that more complex organisms would require more genes, and so early estimates were that our genome would have 100,000 genes. At a conference focused on genome research in 2000, scientists started an informal betting pool called GeneSweep that would be won by the person who most closely predicted the actual number of genes in the human genome. The entries ranged from ~26,000 to ~150,000 genes.

With the release of the first draft sequence, a winner was announced. Surprisingly, the winner was the entrant with the very lowest estimate, 25,947 genes. How could Homo sapiens, with their complex brains and sophisticated immune systems, have only twice as many genes as the roundworm Caenorhabditis elegans and approximately the same number of genes as the first sequenced plant genome, the mustard weed Arabidopsis thaliana? Part of the answer to this question has to do with a remarkable discovery made in the late 1970s. At that time, the proteins of many eukaryotes were found to be encoded in DNA not as continuous stretches (as they are in bacteria and yeast) but in pieces. Thus, the genes of higher eukaryotes are usually composed of pieces called exons (for expressed region) that encode parts of proteins and pieces called introns (for intervening region) that separate exons. As you will learn in this chapter, an RNA copy containing both exons and introns is synthesized from a gene. A biological machine (called a spliceosome) removes the introns and joins the exons (in a process called RNA splicing) to produce a mature RNA that contains the continuous information needed to synthesize a protein.
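The splicing step described above can be pictured as a simple string operation: cut out the introns and join the exons in order. The following is only a minimal sketch under stated assumptions. The toy transcript, the exon coordinates, and the function name `splice` are all invented for illustration, and the code says nothing about how a spliceosome actually recognizes intron boundaries.

```python
def splice(pre_mrna: str, exons: list[tuple[int, int]]) -> str:
    """Join exon segments in order.

    Each exon is a (start, end) pair of 0-based, half-open coordinates
    into the unspliced transcript (an assumed convention for this sketch).
    """
    return "".join(pre_mrna[start:end] for start, end in exons)


# Hypothetical toy transcript: uppercase = exon, lowercase = intron.
# The sequences are made up and do not come from any real gene.
pre_mrna = "AUGGCC" + "guaaguuuucag" + "UUUGAA" + "guacgcuucag" + "CCCUAA"

# Exon coordinates chosen to match the toy transcript above.
exons = [(0, 6), (18, 24), (35, 41)]

mature = splice(pre_mrna, exons)
print(mature)
```

Passing a different subset of exon coordinates (for example, omitting the middle exon) yields a different mature RNA from the same transcript, which is the idea behind alternative splicing.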

What do exons and introns have to do with the low human gene count? For now, suffice it to say that the RNA transcribed from a gene can be spliced in alternative ways. Although we have only about 21,000 genes, these genes encode more than 100,000 proteins, thanks to the process of alternative splicing of RNA.

Even more surprising is the finding that only a small fraction of the genome actually codes for proteins (a little more than 2 percent for most complex multicellular organisms). The content of genomes will be the subject of future chapters. For now it is important to note that, although only a small proportion of the DNA codes for protein, most of the genome still encodes RNA. The story of this aptly named non-protein-coding RNA (ncRNA) is a work in progress. That story will be introduced in this chapter and developed in succeeding chapters.

In this chapter, we see the first steps in the transfer of information from genes to gene products. Within the DNA sequence of any organism’s genome is encoded information specifying each of the gene products that the organism can make. These DNA sequences also contain information specifying when, where, and how much of the product is made. To utilize the information, an RNA copy of the gene must be synthesized in a process called transcription.

The transfer of information from gene to gene product takes place in several steps. The first step, which is the focus of this chapter, is to copy (transcribe) the information into a strand of RNA with the use of DNA as a template. In prokaryotes, the information in protein-encoding RNA is almost immediately converted into an amino acid chain (protein) by a process called translation. This second step is the focus of Chapter 9. In eukaryotes, transcription and translation are spatially separated: transcription takes place in the nucleus and translation in the cytoplasm. However, before RNAs are ready to be transported into the cytoplasm for translation or other uses, they undergo extensive processing, including the removal of introns and the addition of a special 5′ cap and a 3′ tail of adenine nucleotides. One fully processed type of RNA, called messenger RNA (mRNA), is the intermediary in the synthesis of proteins. In addition, in both prokaryotes and eukaryotes, there are other types of RNAs that are never translated. These ncRNAs perform many essential roles.

The functions of DNA and RNA during transcription are based on two principles:

  1. Complementarity of bases determines the sequence of the RNA transcript. Through the matching of complementary bases, the information encoded in the DNA passes into RNA, and protein complexes associated with ncRNAs are guided to specific regions in target RNAs to regulate the expression of those RNAs.

  2. Certain proteins recognize particular base sequences in DNA and RNA. These nucleic-acid-binding proteins bind to these sequences and act on them.
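The first principle, complementarity, can be illustrated with a short sketch: each base on the DNA template strand specifies the complementary base in the RNA. This is a simplified illustration, not a model of RNA polymerase. The function name `transcribe`, the example sequence, and the convention of writing the template strand 3′ to 5′ (so that the RNA reads 5′ to 3′ in the same left-to-right order) are assumptions made for this example.

```python
# DNA template base -> complementary RNA base (RNA uses U in place of T).
COMPLEMENT = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5: str) -> str:
    """Build an RNA strand complementary to a DNA template strand.

    Assumes the template is written 3'->5', so the returned RNA
    reads 5'->3' from left to right.
    """
    return "".join(COMPLEMENT[base] for base in template_3to5)


template = "TACCGGAAA"  # hypothetical template sequence
rna = transcribe(template)
print(rna)
```

A dictionary lookup is used so that any character other than A, T, G, or C raises a `KeyError` rather than passing silently into the transcript.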

We will see these principles at work throughout the detailed discussions of transcription and translation that follow in this chapter and in chapters to come.

KEY CONCEPT

The transactions of DNA and RNA take place through the matching of complementary bases and the binding of various proteins to specific sites on the DNA or RNA.