Eukaryotic Precursor mRNAs Are Processed to Form Functional mRNAs
FIGURE 5-14 Structure of the 5′ methylated cap. The distinguishing chemical features of the 5′ methylated cap on eukaryotic mRNA are (1) the 5′→5′ linkage of 7-methylguanylate to the initial nucleotide of the mRNA molecule and (2) the methyl group on the 2′ hydroxyl of the ribose of the first nucleotide (base 1). Both of these features occur in all animal cells and in cells of higher plants; yeasts lack the methyl group on nucleotide 1. The ribose of the second nucleotide (base 2) is also methylated in vertebrates. See A. J. Shatkin, 1976, Cell 9:645.
In bacterial cells, which have no nuclei, translation of an mRNA into protein can begin at the 5′ end of the mRNA even while the 3′ end is still being synthesized by RNA polymerase. In other words, transcription and translation occur concurrently in bacteria. In eukaryotic cells, however, the site of RNA synthesis—the nucleus—is separated from the site of translation—the cytoplasm. Furthermore, the primary transcripts of protein-coding genes are precursor mRNAs (pre-mRNAs) that must undergo several modifications, collectively termed RNA processing, to yield a functional mRNA (see Figure 5-1, step 2). This mRNA then must be exported to the cytoplasm before it can be translated into protein. Thus transcription and translation cannot occur concurrently in eukaryotic cells.
All eukaryotic pre-mRNAs are initially modified at the two ends, and these modifications are retained in mRNAs. As the 5′ end of a nascent RNA chain emerges from the surface of RNA polymerase, it is immediately acted on by several enzymes that together synthesize the 5′ cap, a 7-methylguanylate that is connected to the terminal nucleotide of the RNA by an unusual 5′,5′ triphosphate linkage (Figure 5-14). The cap protects an mRNA from enzymatic degradation and assists in its export to the cytoplasm. The cap is also bound by a protein factor required to begin translation in the cytoplasm.
Processing at the 3′ end of a pre-mRNA involves cleavage by an endonuclease to yield a free 3′-hydroxyl group, to which a string of adenylic acid residues is added one at a time by an enzyme called poly(A) polymerase. The resulting poly(A) tail contains 100–250 bases, being shorter in yeasts and invertebrates than in vertebrates. Poly(A) polymerase is part of a complex of proteins that can locate and cleave a transcript at a specific site and then add the correct number of A residues, in a process that does not require a template. As discussed further in Section 5.4 and in Chapter 10, the poly(A) tail has important functions both in translation of mRNA and in stabilizing pre-mRNAs in the nucleus and fully processed mRNAs in the nucleus and cytoplasm.
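The two steps of 3′ end processing described above, cleavage followed by untemplated addition of A residues, can be sketched as simple string operations. The Python below is only an illustration under stated assumptions: the function name `cleave_and_polyadenylate`, the toy transcript, and the cleavage position are hypothetical, and the default tail length of 200 is merely one value within the 100–250 range given in the text. In the cell the cleavage site is located by a protein complex; here the caller supplies it as an index.

```python
def cleave_and_polyadenylate(pre_mrna, cleavage_index, tail_length=200):
    """Toy model of 3' end processing of a pre-mRNA.

    pre_mrna       -- RNA sequence as a string, written 5' to 3'
    cleavage_index -- number of nucleotides kept upstream of the cut
                      (hypothetical; supplied by the caller)
    tail_length    -- number of A residues to add (assumed value)
    """
    if not 0 < cleavage_index <= len(pre_mrna):
        raise ValueError("cleavage_index must fall within the transcript")
    if tail_length < 0:
        raise ValueError("tail_length cannot be negative")

    # Endonuclease cleavage: the upstream fragment ends in a free 3' hydroxyl;
    # the downstream fragment is discarded.
    upstream = pre_mrna[:cleavage_index]

    # Poly(A) polymerase adds A residues one at a time, with no template.
    tail = []
    for _ in range(tail_length):
        tail.append("A")

    return upstream + "".join(tail)


# Hypothetical 10-nucleotide transcript, cut after position 6, 3-residue tail.
print(cleave_and_polyadenylate("ACGUACGUAC", cleavage_index=6, tail_length=3))
```

The loop that appends one "A" per iteration mirrors the one-at-a-time addition described in the text; how the cell measures the correct tail length is not modeled.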
Another step in the processing of many different eukaryotic mRNA molecules is RNA splicing: the internal cleavage of a transcript to excise the introns and stitch together the coding exons. Figure 5-15 summarizes the basic steps in eukaryotic mRNA processing using the β-globin gene as an example. We examine the cellular machinery for carrying out processing of mRNA, as well as tRNA and rRNA, in Chapter 10.
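The logic of splicing, removing internal segments and joining what remains, can be illustrated with a short Python sketch. It assumes the intron boundaries are already known and passed in as coordinates; the function name `splice`, the coordinate convention, and the toy sequences are all hypothetical. The sketch does not model how the cellular machinery recognizes splice sites.

```python
def splice(pre_mrna, introns):
    """Remove intron intervals from a transcript and join the exons.

    pre_mrna -- RNA sequence as a string, written 5' to 3'
    introns  -- list of (start, end) pairs, 0-based, end exclusive
                (an assumed convention for this sketch)
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        if start < position or start >= end or end > len(pre_mrna):
            raise ValueError("introns must not overlap and must lie within the transcript")
        exons.append(pre_mrna[position:start])  # exon upstream of this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # final exon
    return "".join(exons)


# Hypothetical transcript: exons in upper case, introns in lower case.
exon1, intron1 = "AUGGCC", "guaaguuuuag"
exon2, intron2 = "UUCGAA", "guccccag"
exon3 = "GGGUAA"
transcript = exon1 + intron1 + exon2 + intron2 + exon3

# Intron coordinates computed from the segment lengths.
first_start = len(exon1)
first_end = first_start + len(intron1)
second_start = first_end + len(exon2)
second_end = second_start + len(intron2)

print(splice(transcript, [(first_start, first_end), (second_start, second_end)]))
```

Because the exons are joined in their original order, the spliced product here is simply the three upper-case segments concatenated.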
The functional eukaryotic mRNAs produced by RNA processing retain noncoding regions, referred to as untranslated regions (UTRs), at each end. In mammalian mRNAs, the 5′ UTR may be a hundred or more nucleotides long, and the 3′ UTR may be several kilobases in length. Bacterial mRNAs also usually have 5′ and 3′ UTRs, but these regions are much shorter than those in eukaryotic mRNAs, generally containing fewer than 10 nucleotides. As discussed in Chapter 10, the 5′ UTR and 3′ UTR sequences participate in regulation of mRNA translation and stability, and 3′ UTRs also function in the localization of many mRNAs to specific regions of the cytoplasm.
FIGURE 5-15 Overview of RNA processing. RNA processing produces functional mRNA in eukaryotes. The β-globin gene contains three protein-coding exons (constituting the coding region) and two intervening noncoding introns. The introns interrupt the protein-coding sequence between the codons for amino acids 31 and 32 and between the codons for amino acids 105 and 106. Transcription of eukaryotic protein-coding genes starts before the sequence that encodes the first amino acid and extends beyond the sequence that encodes the last amino acid, resulting in noncoding regions at the ends of the primary transcript. These untranslated regions (UTRs) are retained during processing. The 5′ cap (m⁷Gppp) is added during formation of the primary RNA transcript, which extends beyond the poly(A) site. After cleavage at the poly(A) site and addition of multiple A residues to the 3′ end, splicing removes the introns and joins the exons. The small numbers refer to positions in the 147–amino acid sequence of β-globin.
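The intron positions given in the caption determine how the 147 codons of β-globin are divided among the three exons. The Python sketch below works through that arithmetic. It assumes, as the caption states, that each intron falls exactly between two codons, and it counts only protein-coding nucleotides: the UTRs, whose lengths are not given here, and the stop codon are left out. The function and variable names are illustrative.

```python
TOTAL_CODONS = 147                      # length of the beta-globin sequence, from the caption
LAST_CODON_BEFORE_INTRON = [31, 105]    # introns lie after codons 31 and 105, from the caption


def codons_per_exon(total_codons, last_codons_before_introns):
    """Return the number of codons contributed by each exon, 5' to 3'."""
    boundaries = [0] + list(last_codons_before_introns) + [total_codons]
    return [boundaries[i + 1] - boundaries[i] for i in range(len(boundaries) - 1)]


codon_counts = codons_per_exon(TOTAL_CODONS, LAST_CODON_BEFORE_INTRON)
nucleotide_counts = [3 * n for n in codon_counts]  # three nucleotides per codon

for number, (codons, nucleotides) in enumerate(zip(codon_counts, nucleotide_counts), start=1):
    print(f"exon {number}: {codons} codons, {nucleotides} coding nucleotides")
print(f"total: {sum(codon_counts)} codons, {sum(nucleotide_counts)} coding nucleotides")
```

Under these assumptions the three exons carry 31, 74, and 42 codons (93, 222, and 126 coding nucleotides), which together account for all 147 codons, or 441 coding nucleotides.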