Characteristics of the Genetic Code

The genetic code is so important to modern biology that Francis Crick compared its place to that of the periodic table of the elements in chemistry. We will now examine a number of features of the genetic code.

THE DEGENERACY OF THE CODE One amino acid is encoded by three consecutive nucleotides in mRNA, and each nucleotide can have one of four possible bases (A, G, C, and U), so there are 4³ = 64 possible codons (Figure 11.5). Three of these codons are stop codons, which specify the end of translation, as we'll see shortly. Thus, 61 codons, called sense codons, encode amino acids. Because there are 61 sense codons and only 20 different amino acids commonly found in proteins, the code contains more information than is needed to specify the amino acids and is said to be degenerate. This expression does not mean that the genetic code is depraved; degenerate is a term that Francis Crick borrowed from quantum physics, where it describes multiple physical states that have equivalent meaning. The degeneracy of the genetic code means that amino acids may be specified by more than one codon. Only tryptophan and methionine are encoded by a single codon (see Figure 11.5). Other amino acids are specified by two or more codons, and some, such as leucine, are specified by six different codons. Codons that specify the same amino acid are said to be synonymous codons, just as synonymous words are different words that have the same meaning.
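The counts in this paragraph can be checked with a short sketch. It builds the standard codon table from a 64-letter string of one-letter amino acid codes (the figure uses three-letter abbreviations; the one-letter codes, the "*" symbol for a stop codon, and all variable names are choices made here for brevity).

```python
from collections import Counter
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for all 64 codons, listed in UCAG order
# (UUU, UUC, UUA, UUG, UCU, ...). "*" marks a stop codon.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"

# Map each codon (written 5'->3') to the amino acid it specifies.
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

stop_codons = sorted(c for c, aa in CODON_TABLE.items() if aa == "*")
sense_codons = [c for c, aa in CODON_TABLE.items() if aa != "*"]

# Number of synonymous codons for each amino acid.
synonym_counts = Counter(aa for aa in CODON_TABLE.values() if aa != "*")
single_codon = sorted(aa for aa, n in synonym_counts.items() if n == 1)
six_codons = sorted(aa for aa, n in synonym_counts.items() if n == 6)

print(len(CODON_TABLE), len(sense_codons), stop_codons)  # 64 61 ['UAA', 'UAG', 'UGA']
print(single_codon)  # ['M', 'W'] -> methionine and tryptophan
print(six_codons)    # ['L', 'R', 'S'] -> leucine, arginine, serine
```

The 61 sense codons are distributed unevenly over the 20 amino acids, which is what the term degenerate describes.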

Figure 11.5: The genetic code consists of 64 codons. The amino acids specified by each codon are given in their three-letter abbreviations. The codons are written 5′→3′, as they appear in the mRNA. AUG is an initiation (start) codon as well as the codon for methionine; UAA, UAG, and UGA are termination (stop) codons.

As we learned in Chapter 10, tRNAs serve as adapter molecules that bind particular amino acids and deliver them to a ribosome, where the amino acids are then assembled into polypeptide chains. Each type of tRNA attaches to a single type of amino acid. The cells of most organisms possess from about 30 to 50 different tRNAs, and yet there are only 20 different amino acids commonly found in proteins. Thus, some amino acids are carried by more than one tRNA. Different tRNAs that accept the same amino acid but have different anticodons are called isoaccepting tRNAs.

Even though some amino acids can pair with multiple (isoaccepting) tRNAs, there are still more codons than anticodons. One anticodon can sometimes pair with different codons through flexibility in base pairing at the third position of the codon. Examination of Figure 11.5 reveals that many synonymous codons differ only in the third position. For example, serine is encoded by the codons UCU, UCC, UCA, and UCG, all of which begin with UC. When the codon of the mRNA and the anticodon of the tRNA join (Figure 11.6), the first (5′) base of the codon pairs with the third (3′) base of the anticodon, strictly according to the Watson-and-Crick rules: A with U; C with G. Next, the middle bases of codon and anticodon pair, also strictly following the Watson-and-Crick rules. After these pairs have bonded, the third bases pair weakly, and there may be flexibility, or wobble, in their pairing; for example, a G in the anticodon may pair with either a C or a U in the third position of the codon. In 1966, Francis Crick developed the wobble hypothesis, which proposed that there could be some nonstandard pairings of bases at the third position of a codon. The important thing to remember about wobble is that it allows some tRNAs to pair with more than one codon on an mRNA.

Figure 11.6: Wobble may exist in the pairing of a codon and anticodon. The mRNA and tRNA pair in an antiparallel fashion. Pairing at the first and second codon positions is in accord with the Watson-and-Crick pairing rules (A with U, G with C); however, pairing rules are relaxed at the third position of the codon, and G on the anticodon can pair with either U or C on the codon in this example.
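The pairing logic can be sketched in a few lines. The text gives only one wobble rule (G in the anticodon pairs with C or U in the codon); the remaining entries in the `WOBBLE` table below, including inosine (I), are taken from Crick's wobble rules and are an addition to what this section states. The function name and the example anticodons are hypothetical.

```python
# Strict Watson-and-Crick partners for RNA bases.
WATSON_CRICK = {"A": "U", "U": "A", "G": "C", "C": "G"}

# Codon third-position bases that each 5' anticodon base may pair with.
# Only the "G" entry is stated in the text; the rest follow Crick's
# wobble rules and are included here as an assumption.
WOBBLE = {"C": "G", "A": "U", "U": "AG", "G": "CU", "I": "ACU"}


def codons_read_by(anticodon):
    """Return the codons (5'->3') that an anticodon (5'->3') can pair with."""
    wobble_base, middle, last = anticodon
    # Pairing is antiparallel: codon position 1 pairs with the 3' base
    # of the anticodon, and codon position 3 with its 5' base.
    first = WATSON_CRICK[last]
    second = WATSON_CRICK[middle]
    return sorted(first + second + third for third in WOBBLE[wobble_base])


print(codons_read_by("GGA"))  # ['UCC', 'UCU'] -> two serine codons
print(codons_read_by("CAU"))  # ['AUG'] -> no wobble with C at the 5' position
```

A single anticodon with G at its 5′ end thus reads two synonymous codons, which is why fewer tRNAs than sense codons are needed.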

CONCEPTS

The genetic code consists of 61 sense codons that specify the 20 common amino acids. The code is degenerate, meaning that some amino acids are encoded by more than one codon. Isoaccepting tRNAs are tRNAs with different anticodons that accept the same amino acid. Wobble at the third position of the codon allows a single tRNA to pair with more than one codon for the same amino acid.

CONCEPT CHECK 3

Through wobble, a single __________ can pair with more than one _____________.

  1. codon, anticodon

  2. group of three nucleotides in DNA, codon in mRNA

  3. tRNA, amino acid

  4. anticodon, codon

Answer: d

THE READING FRAME AND INITIATION CODONS Findings from early studies indicated that the genetic code is generally nonoverlapping. An overlapping code would be one in which a single nucleotide might be included in more than one codon, as follows:

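[Figure: a nucleotide sequence read as nonoverlapping codons, compared with the same sequence read as overlapping codons]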

Usually, however, each nucleotide is part of a single codon. A few overlapping genes are found in viruses, but codons within the same gene do not overlap, and the genetic code is generally considered to be nonoverlapping.
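The difference between the two kinds of code can be illustrated with a small sketch. The sequence is made up, and the overlapping reading shown here, in which each codon starts one nucleotide after the previous one, is just one hypothetical form an overlapping code could take.

```python
def nonoverlapping_codons(seq):
    """Read seq in successive groups of three; each base is used once."""
    return [seq[i:i + 3] for i in range(0, len(seq) - 2, 3)]


def overlapping_codons(seq):
    """Read seq with a window that advances one base at a time,
    so that a base can belong to as many as three codons."""
    return [seq[i:i + 3] for i in range(len(seq) - 2)]


seq = "AUACGAGUC"  # arbitrary example sequence
print(nonoverlapping_codons(seq))  # ['AUA', 'CGA', 'GUC']
print(overlapping_codons(seq))
# ['AUA', 'UAC', 'ACG', 'CGA', 'GAG', 'AGU', 'GUC']
```

Nine nucleotides give three codons when read without overlap but seven in this fully overlapping scheme.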

For any sequence of nucleotides, there are three potential sets of codons—three ways in which the sequence can be read in groups of three. Each different way of reading the sequence is called a reading frame, and any sequence of nucleotides has three potential reading frames. The three reading frames have completely different sets of codons and will therefore specify proteins with entirely different amino acid sequences. Thus, it is essential for the translational machinery to use the correct reading frame. How is the correct reading frame established? The reading frame is set by the initiation codon (or start codon), which is the first codon of the mRNA to specify an amino acid. After the initiation codon, the other codons are read as successive groups of three nucleotides. No bases are skipped between the codons, so there are no punctuation marks to separate the codons.
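To see how different the three reading frames are, the sketch below splits a made-up mRNA sequence into codons in each frame and translates them with the standard code. The one-letter amino acid codes, the "*" stop symbol, and the function name are choices made for this illustration.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def reading_frames(seq):
    """Return the codons of seq in each of its three reading frames.
    Leftover bases at the end that do not fill a codon are dropped."""
    return [
        [seq[i:i + 3] for i in range(offset, len(seq) - 2, 3)]
        for offset in range(3)
    ]


mrna = "AUGGCAUUCGAA"  # made-up sequence for illustration
for number, codons in enumerate(reading_frames(mrna), start=1):
    peptide = "".join(CODON_TABLE[c] for c in codons)
    print(number, codons, peptide)
# 1 ['AUG', 'GCA', 'UUC', 'GAA'] MAFE
# 2 ['UGG', 'CAU', 'UCG'] WHS
# 3 ['GGC', 'AUU', 'CGA'] GIR
```

The same twelve nucleotides specify three unrelated amino acid sequences, depending only on where reading begins.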

The initiation codon is usually AUG, although GUG and UUG are used on rare occasions. The initiation codon is not just a sequence that marks the beginning of translation; it also specifies an amino acid. In bacterial cells, the first AUG encodes a modified type of methionine, N-formylmethionine; all proteins in bacteria initially begin with this amino acid, but its formyl group (or, in some cases, the entire amino acid) may be removed after the protein has been synthesized. When the codon AUG is at an internal position in a gene, it encodes unformylated methionine. In archaeal and eukaryotic cells, AUG specifies unformylated methionine both at the initiation position and at internal positions. In both bacteria and eukaryotes, there are different tRNAs for the initiator methionine and internal methionine.

TERMINATION CODONS Three codons—UAA, UAG, and UGA—do not encode amino acids. These codons, which signal the end of the protein in both bacterial and eukaryotic cells, are called stop codons, termination codons, or nonsense codons. No tRNAs have anticodons that pair with termination codons.
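Initiation and termination codons together delimit the part of an mRNA that is translated. The following sketch is a deliberate simplification: it assumes that the first AUG in the sequence is the initiation codon, ignores the rare GUG and UUG starts, and does not distinguish N-formylmethionine from methionine. The function name and the example sequence are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(mrna):
    """Translate from the first AUG to the first in-frame stop codon."""
    start = mrna.find("AUG")  # simplification: first AUG sets the frame
    if start == -1:
        return ""  # no initiation codon found
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # UAA, UAG, or UGA: no amino acid is added
            break
        protein.append(amino_acid)
    return "".join(protein)


# GG | AUG GCA AUG UUC UGA | AUGCC
print(translate("GGAUGGCAAUGUUCUGAAUGCC"))  # MAMF
```

The internal AUG is read as an ordinary methionine codon, and nothing after the UGA stop codon is translated.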

THE UNIVERSALITY OF THE CODE For many years, the genetic code was assumed to be universal, meaning that each codon specifies the same amino acid in all organisms. We now know that the genetic code is almost, but not completely, universal; a few exceptions have been found. Most of these exceptions are termination codons, but there are a few cases in which one sense codon substitutes for another. Most exceptions are found in mitochondrial genes; a few nonuniversal codons have also been detected in the nuclear genes of protozoans and in bacterial DNA. TRY PROBLEM 15
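One well-known set of exceptions, not listed in this section and added here only as an illustration, is the vertebrate mitochondrial code, in which UGA encodes tryptophan, AUA encodes methionine, and AGA and AGG act as termination codons. The sketch below derives that variant from the standard table and lists the codons whose meaning changes; the names are hypothetical.

```python
from itertools import product

BASES = "UCAG"
# One-letter amino acid codes for the 64 codons in UCAG order; "*" = stop.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
STANDARD_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Vertebrate mitochondrial code: start from the standard code and
# override the four codons that are read differently.
VERTEBRATE_MITO_TABLE = dict(STANDARD_TABLE)
VERTEBRATE_MITO_TABLE.update({"UGA": "W", "AUA": "M", "AGA": "*", "AGG": "*"})


def differing_codons(table_a, table_b):
    """Map each codon read differently to its (table_a, table_b) meanings."""
    return {
        codon: (table_a[codon], table_b[codon])
        for codon in sorted(table_a)
        if table_a[codon] != table_b[codon]
    }


print(differing_codons(STANDARD_TABLE, VERTEBRATE_MITO_TABLE))
# {'AGA': ('R', '*'), 'AGG': ('R', '*'), 'AUA': ('I', 'M'), 'UGA': ('*', 'W')}
```

Three of the four differences involve a termination codon, and the remaining 60 codons keep their standard meaning.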

CONCEPTS

Each sequence of nucleotides possesses three potential reading frames. The correct reading frame is set by the initiation codon. The end of a protein-coding sequence is marked by a termination codon. With a few exceptions, all organisms use the same genetic code.

CONNECTING CONCEPTS

Characteristics of the Genetic Code

We have now considered a number of characteristics of the genetic code. Let’s take a moment to review these characteristics.

  1. The genetic code consists of a sequence of nucleotides in DNA or RNA. There are four letters in the code, corresponding to the four bases—A, G, C, and U (T in DNA).

  2. The genetic code is a triplet code. Each amino acid is encoded by a sequence of three consecutive nucleotides, called a codon.

  3. The genetic code is degenerate; that is, of 64 codons, 61 codons encode only 20 amino acids in proteins (3 codons are termination codons). Some codons are synonymous, specifying the same amino acid.

  4. Isoaccepting tRNAs are tRNAs with different anticodons that accept the same amino acid. Wobble allows the anticodon on one type of tRNA to pair with more than one codon on mRNA.

  5. The genetic code is generally nonoverlapping; each nucleotide in an mRNA sequence belongs to a single reading frame.

  6. The reading frame is set by an initiation codon, which is usually AUG.

  7. When a reading frame has been set, codons are read as successive groups of three nucleotides.

  8. Any one of three termination codons (UAA, UAG, or UGA) can signal the end of a protein; no amino acids are encoded by the termination codons.

  9. The genetic code is almost universal.