Messenger RNA Carries Information from DNA in a Three-Letter Genetic Code

As noted above, the genetic code used by cells is a triplet code, in which every three-nucleotide sequence, or codon, is “read” from a specified starting point in the mRNA. Of the 64 possible codons in the genetic code (one of four nucleotides at each of the three positions of a codon yields 4 × 4 × 4 = 64 possible codons), 61 specify individual amino acids, and three are stop codons. Table 5-1 shows that most amino acids are encoded by more than one codon. Only two—methionine and tryptophan—have a single codon; at the other extreme, leucine, serine, and arginine are each specified by six different codons. The different codons for a given amino acid are said to be synonymous. The code itself is termed degenerate, meaning that a particular amino acid can be specified by several codons.
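The codon arithmetic in this paragraph can be checked with a short Python sketch. The names `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are illustrative choices, not part of the text; the table is the standard code written with one-letter amino acid symbols, with `*` marking a stop codon.

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# Amino acids (one-letter symbols) of the standard code, listed in the order
# produced by product(BASES, repeat=3): UUU, UUC, UUA, UUG, UCU, ...
# '*' marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
# Number of synonymous codons per amino acid (stop codons excluded).
degeneracy = Counter(aa for aa in CODON_TABLE.values() if aa != "*")

print(len(CODON_TABLE))           # 64
print(stop_codons)                # ['UAA', 'UAG', 'UGA']
print(sum(degeneracy.values()))   # 61
print(sorted(aa for aa, n in degeneracy.items() if n == 1))  # ['M', 'W']
print(sorted(aa for aa, n in degeneracy.items() if n == 6))  # ['L', 'R', 'S']
```

The last two lines recover the extremes described above: methionine (M) and tryptophan (W) have one codon each, whereas leucine (L), arginine (R), and serine (S) have six.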

image
*AUG is the most common initiation codon; GUG usually codes for valine and CUG for leucine, but rarely, these codons can also code for methionine to initiate a protein chain.

Synthesis of all polypeptide chains in prokaryotic and eukaryotic cells begins with the amino acid methionine. In bacteria, a specialized form of methionine with a formyl group linked to its amino group is used. In most mRNAs, the start (initiation) codon specifying this amino-terminal methionine is AUG. In a few bacterial mRNAs, GUG is used as the initiation codon, and CUG is occasionally used as an initiation codon for methionine in eukaryotes. The three codons UAA, UGA, and UAG do not specify amino acids, but rather constitute stop (termination) codons that mark the carboxyl terminus of polypeptide chains in almost all cells. The sequence of codons that runs from a specific start codon to a stop codon is called a reading frame. This precise linear array of ribonucleotides in groups of three in mRNA specifies the precise linear sequence of amino acids in a polypeptide chain and also signals where synthesis of the chain starts and stops.
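How a start codon and a stop codon together delimit a reading frame can be sketched in Python. This is a deliberately simplified model: it assumes that translation begins at the first AUG in the sequence, which ignores the additional signals cells use to select a start site, and it does not model the rare GUG or CUG initiation codons. The function name `translate` and the example sequence are hypothetical.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, '*' = stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon.

    Simplifying assumption: the first AUG in the sequence is the start codon.
    Returns an empty string if no AUG is present.
    """
    start = mrna.find("AUG")
    if start == -1:
        return ""
    peptide = []
    # Step through the sequence three nucleotides at a time from the start codon.
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG, or UGA ends the chain
            break
        peptide.append(amino_acid)
    return "".join(peptide)


# Made-up example: GG | AUG GCU UGG UAA | CC
print(translate("GGAUGGCUUGGUAACC"))  # MAW
```

The two nucleotides before the AUG and the two after the UAA lie outside the reading frame and contribute nothing to the polypeptide.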

FIGURE 5-18 Multiple reading frames in an mRNA sequence. If translation of the mRNA sequence shown begins at three different upstream start sites (not shown), then three overlapping reading frames are possible. In this example, the codons are shifted one base to the right in the middle frame and two bases to the right in the third frame, which ends in a stop codon. As a result, the same mRNA nucleotide sequence can specify different amino acids. Although regions of sequence that are translated in more than one of the three possible reading frames are rare, there are examples in both prokaryotes and eukaryotes, and especially in their viruses, in which the same sequence is used in two alternative mRNAs expressed from the same region of DNA, and the sequence is read in one reading frame in one mRNA and in an alternative reading frame in the other mRNA. There are even a few instances in which the same short sequence is read in all three possible reading frames.

Because the genetic code is a non-overlapping triplet code without divisions between codons, a particular mRNA theoretically could be translated in three different reading frames. Indeed, some mRNAs have been shown to contain overlapping information that can be translated in different reading frames, yielding different polypeptides (Figure 5-18). The vast majority of mRNAs, however, can be read in only one frame because stop codons encountered in the other two possible reading frames terminate translation before a functional protein is produced. Very rarely, another unusual coding arrangement occurs because of frame shifting. In this case, the protein-synthesizing machinery may read four nucleotides as one amino acid and then continue reading triplets, or it may back up one base and read all succeeding triplets in the new reading frame until termination of the chain occurs. Only a few dozen such instances are known.
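The three possible reading frames of one sequence can be illustrated with a brief Python sketch. The helper names `codons_in_frame` and `translate_frame` and the example sequence are assumptions made for illustration. Unlike a ribosome, this sketch does not halt at a stop codon, so that every codon in each frame remains visible.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, '*' = stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def codons_in_frame(mrna, frame):
    """Split mrna into complete triplets, starting at offset frame (0, 1, or 2)."""
    return [mrna[i:i + 3] for i in range(frame, len(mrna) - 2, 3)]


def translate_frame(mrna, frame):
    """Translate every complete codon in the given frame; stops appear as '*'."""
    return "".join(CODON_TABLE[codon] for codon in codons_in_frame(mrna, frame))


mrna = "GCUUGUUUACGAAUUAG"  # made-up example sequence
for frame in range(3):
    print(frame, " ".join(codons_in_frame(mrna, frame)), translate_frame(mrna, frame))
# 0 GCU UGU UUA CGA AUU ACLRI
# 1 CUU GUU UAC GAA UUA LVYEL
# 2 UUG UUU ACG AAU UAG LFTN*
```

Shifting the starting point by one or two bases yields entirely different codons and therefore different amino acids. In this example the third frame runs into the stop codon UAG, which shows how stop codons in the alternative frames can cut translation short.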

The meaning of each codon is the same in most known organisms, which is strong evidence that life on Earth evolved only once. In fact, the genetic code shown in Table 5-1 is known as the universal code. However, the genetic code has been found to differ for a few codons in many mitochondria, in ciliated protozoans, and in Acetabularia, a single-celled green alga. As shown in Table 5-2, most of these differences involve the reading of normal stop codons as amino acids, not an exchange of one amino acid for another. These exceptions to the universal code probably were later evolutionary developments; that is, at no single time was the code immutably fixed, although massive changes were not tolerated once a general code began to function early in evolution.
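A variant code can be modeled as the standard table with a few entries overridden, as in the following Python sketch. The two reassignments used here (UGA read as tryptophan in many mitochondria, and UAA and UAG read as glutamine in the ciliate Tetrahymena) are well-documented examples chosen for illustration; they are not a complete list of known deviations. The names `variant_code`, `MITOCHONDRIAL`, and `CILIATE` are hypothetical, and for simplicity `translate` reads from the first nucleotide of the sequence.

```python
from itertools import product

# Standard genetic code, one-letter amino acid symbols, '*' = stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def variant_code(overrides):
    """Return a copy of the standard code with selected codons reassigned."""
    code = dict(STANDARD)
    code.update(overrides)
    return code


# Illustrative reassignments only (assumed examples, not an exhaustive table).
MITOCHONDRIAL = variant_code({"UGA": "W"})         # stop -> tryptophan
CILIATE = variant_code({"UAA": "Q", "UAG": "Q"})   # stop -> glutamine


def translate(mrna, code):
    """Translate in frame from the first nucleotide until a stop codon."""
    peptide = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = code[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        peptide.append(amino_acid)
    return "".join(peptide)


print(translate("AUGUGGUGAGCUUAA", STANDARD))       # MW
print(translate("AUGUGGUGAGCUUAA", MITOCHONDRIAL))  # MWWA
print(translate("AUGCAAUAAUAGUGA", STANDARD))       # MQ
print(translate("AUGCAAUAAUAGUGA", CILIATE))        # MQQQ
```

In both variants a codon that terminates the chain under the standard code is instead read as an amino acid, so the same mRNA yields a longer polypeptide.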

image
*Found in nuclear genes of the listed organisms and in mitochondrial genes as indicated.
SOURCE: Data from S. Osawa et al., 1992, Microbiol. Rev. 56:229.