33.1 A Nucleic Acid Consists of Bases Linked to a Sugar–Phosphate Backbone

✓ 1 List the components of nucleic acids.

The nucleic acids DNA and RNA are well suited to functioning as the carriers of genetic information by virtue of their structures. These macromolecules are linear polymers built up from similar units connected end to end (Figure 33.2). Each monomer unit within the polymer consists of three components: a sugar, a phosphate, and a base. The sequence of bases uniquely characterizes a nucleic acid and is a form of linear information.

Figure 33.2: The polymeric structure of nucleic acids.

DNA and RNA Differ in the Sugar Component and One of the Bases

Figure 33.3: Ribose and deoxyribose. Atoms in sugar units are numbered with primes to distinguish them from atoms in bases.

The sugar in deoxyribonucleic acid (DNA) is deoxyribose. The “deoxy” prefix refers to the fact that the 2′-carbon atom of the sugar lacks the oxygen atom that is linked to the 2′-carbon atom of ribose (the sugar in ribonucleic acid, or RNA), as shown in Figure 33.3. The sugars in nucleic acids are linked to one another by phosphodiester bridges. Specifically, the 3′-hydroxyl (3′-OH) group of the sugar component of one nucleotide is bonded to a phosphoryl group, and the phosphoryl group is, in turn, joined to the 5′-hydroxyl group of the adjacent sugar, forming a phosphodiester linkage. The strand of sugars linked by phosphodiester bridges is referred to as the backbone of the nucleic acid (Figure 33.4). The backbone is constant in DNA and RNA, but the bases vary from one monomer to the next. Two of the bases are derivatives of purine—adenine and guanine—and two are derivatives of pyrimidine—cytosine and thymine (DNA only) or uracil (RNA only), as shown in Figure 33.5.

Figure 33.4: Backbones of DNA and RNA. The backbones of the nucleic acids are formed by 3′-to-5′ phosphodiester linkages. The difference in the sugar component of RNA and DNA is highlighted in red.

Figure 33.5: Purines and pyrimidines. Atoms within bases are numbered without primes. Uracil is present in RNA instead of thymine.

Ribonucleic acid (RNA), like DNA, is a long unbranched polymer consisting of nucleotides joined by 3′ → 5′ phosphodiester linkages (Figure 33.4). The structure of RNA differs from that of DNA in two respects: (1) the sugar units in RNA are riboses rather than deoxyriboses and (2) one of the four major bases in RNA is uracil instead of thymine (Figure 33.5).
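
The two differences just listed can be written down as a small data model. The Python sketch below is only an illustration: the names `NUCLEIC_ACIDS`, `differing_bases`, and `to_rna_alphabet` are hypothetical, and the last function merely rewrites a base sequence in the RNA alphabet (T becomes U) without modeling any cellular process.

```python
# Illustrative data model; all names here are hypothetical, not from the text.
NUCLEIC_ACIDS = {
    "DNA": {"sugar": "deoxyribose", "bases": {"A", "G", "C", "T"}},
    "RNA": {"sugar": "ribose", "bases": {"A", "G", "C", "U"}},
}


def differing_bases():
    """Return the bases that occur in only one of the two nucleic acids."""
    return NUCLEIC_ACIDS["DNA"]["bases"] ^ NUCLEIC_ACIDS["RNA"]["bases"]


def to_rna_alphabet(dna_sequence):
    """Rewrite a DNA base sequence using the RNA alphabet (T becomes U)."""
    unknown = set(dna_sequence) - NUCLEIC_ACIDS["DNA"]["bases"]
    if unknown:
        raise ValueError(f"not DNA bases: {sorted(unknown)}")
    return dna_sequence.replace("T", "U")
```

Calling `differing_bases()` yields thymine and uracil, the one base in which the two polymers differ; the sugar entries record the other difference.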

Note that each phosphodiester linkage in the backbone of both DNA and RNA has a negative charge. This negative charge repels nucleophilic species such as hydroxide ion, which are capable of hydrolytically cleaving the phosphodiester linkages of the nucleic acid backbone. This resistance is crucial for maintaining the integrity of information stored in nucleic acids. The absence of the 2′-hydroxyl group in DNA further increases its resistance to hydrolysis. In the presence of nucleophilic species, a 2′-hydroxyl group would hydrolyze the phosphodiester linkage and cause a break in the nucleic acid backbone. The greater stability of DNA is one of the reasons for its use rather than RNA as the hereditary material in all modern cells and in many viruses.

Nucleotides Are the Monomeric Units of Nucleic Acids

Figure 33.6: β-Glycosidic linkage in a purine nucleoside.

We considered the synthesis of nucleotides, the building blocks of nucleic acids, in Chapter 32. Let’s review the nomenclature of these crucial biomolecules. A unit consisting of a base bonded to a sugar is referred to as a nucleoside. The four nucleoside units in DNA are called deoxyadenosine, deoxyguanosine, deoxycytidine, and thymidine. Note that thymidine contains deoxyribose; however, by convention, the prefix “deoxy” is not added, because thymine-containing nucleotides are found only rarely in RNA. The nucleoside units in RNA are called adenosine, guanosine, cytidine, and uridine. In each case, N-9 of a purine or N-1 of a pyrimidine is attached to the 1′-carbon atom (C-1′) of the sugar (Figure 33.6). The base lies above the plane of the sugar when the structure is written in the standard orientation; that is, the configuration of the N-glycosidic linkage is β.
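
The nucleoside names above amount to a lookup table, with thymidine as the one irregular entry. The following Python sketch encodes that table and the attachment rule (N-9 for purines, N-1 for pyrimidines); the function and variable names are hypothetical and chosen only for illustration.

```python
# Nucleoside names as listed in the text, keyed by nucleic acid and base.
NUCLEOSIDES = {
    "DNA": {
        "adenine": "deoxyadenosine",
        "guanine": "deoxyguanosine",
        "cytosine": "deoxycytidine",
        "thymine": "thymidine",  # by convention, no "deoxy" prefix
    },
    "RNA": {
        "adenine": "adenosine",
        "guanine": "guanosine",
        "cytosine": "cytidine",
        "uracil": "uridine",
    },
}

PURINES = {"adenine", "guanine"}
PYRIMIDINES = {"cytosine", "thymine", "uracil"}


def nucleoside_name(base, nucleic_acid):
    """Look up the nucleoside formed by a base in DNA or RNA."""
    return NUCLEOSIDES[nucleic_acid][base]


def glycosidic_nitrogen(base):
    """Return the base atom that is bonded to C-1' of the sugar."""
    if base in PURINES:
        return "N-9"
    if base in PYRIMIDINES:
        return "N-1"
    raise ValueError(f"unknown base: {base}")
```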

A nucleotide is a nucleoside joined to one or more phosphoryl groups by an ester linkage and is most commonly referred to as a nucleoside with the number of attached phosphoryl groups noted. For instance, a nucleoside monophosphate is a nucleotide. Nucleoside triphosphates are the monomers—the building blocks—that are linked to form RNA and DNA. The four nucleotide units that link to form DNA are nucleoside monophosphates called deoxyadenylate, deoxyguanylate, deoxycytidylate, and thymidylate.

The base name with the suffix “ate” alone is a nucleotide, but with an unknown number of phosphoryl groups attached at an undesignated carbon atom of the sugar. A more precise nomenclature also is commonly used. A compound formed by the attachment of a phosphoryl group to C-5′ of a nucleoside sugar (the most common site of phosphate esterification) is called a nucleoside 5′-phosphate or a 5′-nucleotide. In this naming system for nucleotides, the number of phosphoryl groups and the attachment site are designated. For example, ATP is adenosine 5′-triphosphate. Another example of a nucleotide is deoxyguanosine 3′-monophosphate (3′-dGMP; Figure 33.7). This nucleotide differs from ATP in several ways: it contains guanine rather than adenine and deoxyribose rather than ribose (indicated by the prefix “d”), it has one phosphoryl group rather than three, and it has the phosphoryl group esterified to the hydroxyl group in the 3′ rather than the 5′ position.

Figure 33.7: Nucleotides adenosine 5′-triphosphate (5′-ATP) and deoxyguanosine 3′-monophosphate (3′-dGMP).
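
The precise naming system described above combines three pieces of information: the nucleoside, the attachment site, and the number of phosphoryl groups. A minimal Python sketch of that rule follows. The function name `nucleotide_name` is hypothetical, and the sketch accepts only the 3′ and 5′ positions, which are the two sites discussed in this section.

```python
PRIME = "\u2032"  # the prime symbol used in positions such as 5′
COUNT_PREFIX = {1: "mono", 2: "di", 3: "tri"}
POSITIONS_IN_TEXT = (3, 5)  # simplification: only the sites named in this section


def nucleotide_name(nucleoside, position, phosphoryl_groups):
    """Build a name such as 'adenosine 5′-triphosphate'."""
    if position not in POSITIONS_IN_TEXT:
        raise ValueError(f"unsupported attachment site: {position}")
    if phosphoryl_groups not in COUNT_PREFIX:
        raise ValueError(f"unsupported phosphoryl count: {phosphoryl_groups}")
    prefix = COUNT_PREFIX[phosphoryl_groups]
    return f"{nucleoside} {position}{PRIME}-{prefix}phosphate"
```

With these inputs, `nucleotide_name("adenosine", 5, 3)` reproduces the full name of ATP, and `nucleotide_name("deoxyguanosine", 3, 1)` reproduces the full name of 3′-dGMP.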

DNA Molecules Are Very Long and Have Directionality

Figure 33.8: The structure of a DNA strand. The strand has a 5′ end, which is usually attached to a phosphoryl group, and a 3′ end, which is usually a free hydroxyl group.

Scientific communication frequently requires the sequence of a nucleic acid (in some cases, thousands of nucleotides in length) to be written. Rather than writing the cumbersome chemical structures, scientists have adopted the use of abbreviations. The abbreviated notations pApCpG, pACG, or, most commonly, ACG denote a trinucleotide. In regard to DNA, the trinucleotide consists of deoxyadenylate monophosphate, deoxycytidylate monophosphate, and deoxyguanylate monophosphate joined together by phosphodiester linkages; in the notation, “p” denotes a phosphoryl group. The phosphoryl groups and sugar components of the trinucleotide are abbreviated as well, as shown in Figure 33.8. A DNA or RNA strand has directionality because its two ends are chemically distinct. One end of the strand has a free 5′-OH group (or, often, a 5′-OH group attached to a phosphoryl group), whereas the other end has a free 3′-OH group, and neither end is linked to another nucleotide. By convention, the base sequence is written in the 5′-to-3′ direction. Thus, the symbol ACG indicates that the phosphorylated or unlinked 5′-OH group is on deoxyadenylate (or adenylate), whereas the unlinked 3′-OH group is on deoxyguanylate (or guanylate). Because of this directionality, ACG and GCA correspond to different compounds.
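
The notation and the writing convention can be illustrated with a few lines of Python. This is a sketch under the assumption that a strand is stored as a plain string written 5′ to 3′; the helper names `expand_notation` and `strand_ends` are hypothetical.

```python
def expand_notation(sequence):
    """Expand shorthand such as 'ACG' into the explicit form 'pApCpG'."""
    return "".join("p" + base for base in sequence)


def strand_ends(sequence):
    """Return the bases at the 5' and 3' ends of a strand written 5' to 3'."""
    if not sequence:
        raise ValueError("empty sequence")
    return {"five_prime": sequence[0], "three_prime": sequence[-1]}


# The same letters read in the opposite order name a different compound.
acg = "ACG"
gca = acg[::-1]
```

Here `strand_ends("ACG")` places A at the 5′ end and G at the 3′ end, whereas the reversed string GCA places G at the 5′ end, which is why the two strings are not interchangeable.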

Figure 33.9: An electron micrograph of part of the E. coli genome. The E. coli was lysed, extruding the DNA.

A striking characteristic of naturally occurring DNA molecules is their length. A DNA molecule must comprise many nucleotides to carry the genetic information necessary for even the simplest organisms. For example, the DNA of a virus such as polyoma, which can cause cancer in certain organisms, is 5100 nucleotides in length. The E. coli genome is a single DNA molecule consisting of two strands of 4.6 million nucleotides (Figure 33.9).

DNA molecules from higher organisms can be much larger. The human genome comprises approximately 3 billion base pairs distributed among 24 distinct DNA molecules—22 autosomes plus two sex chromosomes (X and Y)—of different sizes. One of the largest known DNA molecules is found in the Indian muntjac, an Asiatic deer; its genome is nearly as large as the human genome but is distributed among only 3 chromosomes (Figure 33.10). The largest of these chromosomes has strands of more than 1 billion base pairs. If such a DNA molecule could be fully extended, it would stretch more than 1 foot in length. Some plants contain even larger DNA molecules.
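
The length estimate for the largest muntjac chromosome can be checked with simple arithmetic. The Python sketch below assumes a rise of about 0.34 nm per base pair, a commonly quoted value for the usual double-helical form of DNA that is not stated in this section; the constant and function names are hypothetical.

```python
RISE_PER_BASE_PAIR_NM = 0.34  # assumption: typical double-helical DNA, not from this section
NM_PER_M = 1e9
M_PER_FOOT = 0.3048


def extended_length_m(base_pairs):
    """Approximate length in meters of fully extended double-stranded DNA."""
    return base_pairs * RISE_PER_BASE_PAIR_NM / NM_PER_M


def meters_to_feet(meters):
    """Convert meters to feet."""
    return meters / M_PER_FOOT


muntjac_chromosome_m = extended_length_m(1_000_000_000)
muntjac_chromosome_ft = meters_to_feet(muntjac_chromosome_m)
```

Under this assumption, 1 billion base pairs come to roughly a third of a meter, which is consistent with the figure of more than 1 foot given above.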

DID YOU KNOW?

Most human cells have 6 billion base pairs of information. All 6 billion base pairs would be about 2 m in length (at roughly 0.34 nm per base pair) if all of the molecules were laid end to end. Human beings are composed of approximately 10 trillion cells. If all of this DNA were strung end to end, it would reach to the sun and back about 65 times.

Figure 33.10: The Indian muntjac and its chromosomes. Cells from a female Indian muntjac contain three pairs of very large chromosomes (stained orange in the micrograph). The cell shown is a hybrid containing a pair of human chromosomes (stained green) for comparison.