11.2 THE CHEMISTRY OF DNA POLYMERASES

The first DNA polymerase was identified in the 1950s by Arthur Kornberg and his postdoctoral fellow, Robert Lehman (see the How We Know section at the end of this chapter). In so doing, they initiated what would become an entire field of study on DNA replication enzymology. Initially, E. coli DNA polymerase I (Pol I) was simply called “DNA polymerase,” as it was presumed to be the only DNA polymerase in the cell. We now know, after decades of study, that E. coli contains five different DNA polymerases, involved in a variety of cellular processes. In fact, Pol I mainly functions in the repair of damaged DNA, although it also carries out an important function in connecting Okazaki fragments during replication, as we will see later in this chapter. We focus here on Pol I because the study of this enzyme revealed features of DNA synthesis that are common to all DNA polymerases, and it remains the most intensively studied and well-characterized DNA polymerase.

DNA Polymerases Elongate DNA in the 5′→3′ Direction

Early work on Pol I led to the definition of two central requirements for DNA polymerization. First, all DNA polymerases require a template strand that guides the polymerization reaction according to the Watson-Crick base-pairing rules: where dC is present in the template, dG is added to the new strand, and so on (Figure 11-5a). Second, DNA polymerases require a primer strand that is complementary to the template and contains a free 3′-OH group to which a nucleotide can be added. In other words, DNA polymerases can only add nucleotides to a preexisting strand; they cannot synthesize DNA starting from only a template strand. Most primers are oligonucleotides of RNA rather than DNA. The free 3′ end of the primer, where nucleotides will be added, is called the primer terminus. The double-stranded RNA-DNA formed by the primer and template is the primed template. Specialized enzymes (primases) synthesize RNA primers when and where they are required (as discussed in Section 11.3).
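
The two requirements above, a template and a primer with a free 3′ end, can be illustrated with a toy string model. This is only a sketch under stated assumptions: the function name extend_primer, the choice to write the template 3′→5′ so that it aligns index by index with the primer, and the use of DNA letters for the primer (most real primers are RNA) are all illustrative conventions, not part of the biochemistry.

```python
# Watson-Crick partners for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def extend_primer(template_3to5, primer_5to3, steps=None):
    """Toy model of template-directed synthesis in the 5'->3' direction.

    template_3to5 is written 3'->5' so that position i of the template
    pairs with position i of the primer (written 5'->3').
    steps limits how many nucleotides are added (None = copy to the end).
    """
    # Requirement 2: synthesis cannot start from a template alone.
    if not primer_5to3:
        raise ValueError("a primer with a free 3'-OH is required")
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    # The primer must already be base-paired to the template.
    for i, base in enumerate(primer_5to3):
        if PAIR[template_3to5[i]] != base:
            raise ValueError(f"primer is not complementary at position {i}")

    remaining = len(template_3to5) - len(primer_5to3)
    if steps is None or steps > remaining:
        steps = remaining

    product = primer_5to3
    for _ in range(steps):
        # Requirement 1: the template base dictates which nucleotide is added.
        template_base = template_3to5[len(product)]
        product += PAIR[template_base]  # new nucleotide joins the 3' end
    return product


if __name__ == "__main__":
    print(extend_primer("TACGGTCA", "ATG"))
```

In this model the product only ever grows at its right-hand (3′) end, which is the string-level counterpart of 5′→3′ synthesis; calling the function with an empty primer raises an error rather than starting a new strand.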

Figure 11-5: The DNA polymerase reaction. (a) DNA polymerases require a primer strand and a template strand (i.e., a primed template). As dNTPs are added to the 3′-OH group of the primer strand, this strand grows in the 5′→3′ direction. Incoming dNTPs are complementary to the template strand. (b) The insertion and postinsertion sites in DNA polymerase properly align the primed template for sequential addition of incoming nucleotides.

Studies of Pol I confirmed that the nucleotide precursors to DNA are the four deoxyribonucleoside 5′-triphosphates (dNTPs). The studies also showed that the different dNTPs bind the same active site on Pol I. Pol I differentiates among dNTPs only after it undergoes a conformational change that checks for the proper geometry of the base pair formed between the bound dNTP and the matching base on the template strand. Only the correct geometry of an A=T or G≡C base pair fits into the active site. An incorrect fit results in dissociation of the dNTP and binding of a new one. Normally, the polymerase is able to distinguish the correct nucleotide with this method, but one in every 10⁴ to 10⁵ nucleotides is added incorrectly. In the event of a misincorporated nucleotide, the polymerase has a way to remove it, which we discuss shortly.

The 3′-hydroxyl group of the primer strand is activated to attack the α phosphorus of the incoming dNTP, resulting in attachment of a dNMP to the primer 3′ terminus and release of pyrophosphate (PPᵢ) (see Figure 11-5a). The overall reaction, where the primer strand is denoted as (dNMP)ₙ, is:

(dNMP)ₙ + dNTP → (dNMP)ₙ₊₁ + PPᵢ

Following incorporation of a dNMP, Pol I must slide forward on the new 3′ terminus to incorporate another dNTP. Therefore, the DNA polymerase active site must be divided into at least two distinct sites (Figure 11-5b). The template nucleotide that will pair with the incoming dNTP is positioned in the insertion site. The primer strand 3′-terminal base pair is positioned in the postinsertion site. The 3′ OH on the terminal sugar of the primer strand then attacks the α phosphate of the incoming dNTP, breaking the phosphoanhydride bond that connects the α and β phosphates. This results in addition of one dNMP to the primer strand and the release of pyrophosphate. After addition of a dNMP to the primer terminus, the new terminal base pair occupies the insertion site and must be translocated to the postinsertion site, allowing the next template nucleotide and a new dNTP to occupy the insertion site. Translocation of DNA can occur by sliding of the enzyme or dissociation of the enzyme from the DNA, followed by rebinding with the terminal base pair in the postinsertion site.

DNA synthesis proceeds with only a minimal change in free energy, given that one phosphodiester bond is formed (with the addition of one dNMP to the 3′ primer terminus) at the expense of another, similar bond (between the α and β phosphates of the dNTP). However, noncovalent base-pairing and base-stacking interactions of the newly added nucleotide residue provide additional stabilization that favors incorporation of the correct dNMP into the growing DNA molecule. Pol I can also catalyze the reverse reaction, called pyrophosphorolysis, in which PPi and primed DNA drive Pol I to remove dNMPs from the primer strand and release them as dNTPs. In the cell, pyrophosphorolysis is largely prevented by the action of the enzyme pyrophosphatase, which splits pyrophosphate into two molecules of Pi, thereby removing PPi so that the reverse reaction cannot occur. Pyrophosphate hydrolysis is energetically favorable and goes to near completion.
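
The coupling described in this paragraph can be written as two reactions and their sum. This is a schematic summary only; no free-energy values are assigned, and the notation (dNMP)_n for the primer strand follows the convention used earlier in this section.

```latex
\begin{align*}
(\mathrm{dNMP})_n + \mathrm{dNTP}
  &\rightleftharpoons (\mathrm{dNMP})_{n+1} + \mathrm{PP_i}
  && \text{(polymerization; the reverse is pyrophosphorolysis)} \\
\mathrm{PP_i} + \mathrm{H_2O}
  &\longrightarrow 2\,\mathrm{P_i}
  && \text{(pyrophosphatase)} \\
(\mathrm{dNMP})_n + \mathrm{dNTP} + \mathrm{H_2O}
  &\longrightarrow (\mathrm{dNMP})_{n+1} + 2\,\mathrm{P_i}
  && \text{(net reaction)}
\end{align*}
```

Because the second step removes the product that would drive the first step backward, the net reaction proceeds essentially in one direction under cellular conditions.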

At first glance, the use of a dNTP to form one dNMP link to DNA might seem to be a waste of energy. Why not use dNDP as the nucleotide precursor, which would produce the same DNA product and one inorganic phosphate (Pi) instead of PPi, which is then split into two Pi? The drawback would be that the reverse reaction could easily be initiated at any time, because Pi, the molecule that would initiate the reverse reaction, is abundant in the cell. Using triphosphate precursors ensures that the reverse reaction will not occur, because, as described above, PPi is eliminated by pyrophosphatase—making the DNA polymerase reaction irreversible under cellular conditions. This is probably why all known DNA polymerases use triphosphate precursors. The same strategy also applies to RNA polymerases, which use NTPs and release PPi during RNA synthesis.

The initial studies of Pol I, performed decades ago, demonstrated that it requires a primed template and dNTP precursors as substrates. Many different DNA polymerases have been studied over the years, from sources as diverse as phages and other viruses, various types of bacteria, archaea, and eukaryotes, and mitochondria from many different species. All known DNA polymerases use the mechanism shown in Figure 11-5. Indeed, RNA polymerases also use this same basic mechanism (described in Chapter 15).

Most DNA Polymerases Have DNA Exonuclease Activity

DNA nucleases are a class of enzymes that degrade DNA. Nucleases that shorten DNA from the ends are called exonucleases; endonucleases cut DNA at internal positions. All cells have several nucleases of both types that are used in various tasks. A biochemist who wants to study any other type of enzyme that uses DNA as a substrate must carefully purify the enzyme and remove all nucleases; otherwise, the DNA will be destroyed by the nuclease activity, and the researcher will be unable to observe other enzymatic reactions that require DNA. This careful purification is what Kornberg thought he was doing as he purified Pol I from a cell extract, but he was troubled by his inability to separate what he thought was a contaminating exonuclease activity from Pol I. He finally had no choice but to come to the paradoxical conclusion that the same enzyme that makes DNA also degrades it. In fact, Kornberg found that Pol I has two different exonuclease activities: one starts at the 3′ end and degrades DNA in the 3′→5′ direction (opposite to the direction of DNA synthesis), and the other starts at the 5′ end and degrades DNA in the 5′→3′ direction. The active sites of the two DNA exonucleases of Pol I are distinct from each other and from the DNA polymerase active site.

KEY CONVENTION

Exonucleases that digest a DNA strand from the 3′ terminus are called 3′→5′ exonucleases, because the strand shortens at the 3′ end while the 5′ end remains intact. In contrast, 5′→3′ exonucleases digest DNA from the 5′ terminus, while the 3′ end remains intact.
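
The naming convention can be made concrete with a strand written as a string in the 5′→3′ direction. The two helper functions below are hypothetical names chosen for illustration; they model only which end is shortened, not the chemistry of hydrolysis.

```python
def exo_3to5(strand_5to3, n=1):
    """Digest n nucleotides from the 3' end; the 5' end stays intact."""
    n = max(0, min(n, len(strand_5to3)))
    return strand_5to3[: len(strand_5to3) - n]


def exo_5to3(strand_5to3, n=1):
    """Digest n nucleotides from the 5' end; the 3' end stays intact."""
    n = max(0, min(n, len(strand_5to3)))
    return strand_5to3[n:]


if __name__ == "__main__":
    strand = "ATGCCAGT"  # written 5'->3'
    print(exo_3to5(strand, 2))  # shortened at the right-hand (3') end
    print(exo_5to3(strand, 2))  # shortened at the left-hand (5') end
```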

The 3′→5′ Exonuclease As noted previously, DNA polymerases are typically very accurate: incorrect base selection produces only about one error in every 10⁴ to 10⁵ nucleotides incorporated. This error rate is improved 10²- to 10³-fold by a polymerase’s 3′→5′ exonuclease activity. When an incorrect dNMP is incorporated, the 3′→5′ exonuclease removes the mismatched nucleotide, giving the polymerase a second chance at incorporating the correct one (Figure 11-6). This activity, known as proofreading, is not the same thing as pyrophosphorolysis, the reverse of the polymerization reaction: pyrophosphate is not involved, and the mismatched nucleotide is released as a dNMP, not a dNTP. Also, proofreading by the 3′→5′ exonuclease occurs at a separate active site from polymerization. The mismatched primer strand terminus repositions from the polymerase site to the exonuclease site, where water is activated to hydrolyze the 3′ nucleotide from the primer strand. Most DNA polymerases have this proofreading exonuclease activity, although some do not and thus become more capable of extending a DNA mismatch (leaving one mispaired base in the DNA product) or bypassing a damaged nucleotide in DNA, as discussed in Chapter 12.

Figure 11-6: The 3′→5′ proofreading exonuclease of DNA polymerases. The 3′→5′ exonuclease active site is distinct from the polymerization site and it proofreads the DNA polymerase product.

One way in which DNA polymerase makes errors is by incorporating dNTP tautomers (see Chapter 6) at the polymerase active site. Purine and pyrimidine tautomers can form non–Watson-Crick base pairs with nucleotides in the template strand. Sometimes these incorrect base pairs are indistinguishable in shape and size from correct A=T or G≡C base pairs, thus “fooling” the enzyme into incorporating the incorrect nucleotide. Tautomers exist only transiently, and they rapidly revert to their usual bonding structure. When a tautomer of an incorrect nucleotide is incorporated into the growing DNA chain and then rapidly reverts to its normal structure, the primer strand terminus becomes unpaired and no longer fits into the polymerase active site for addition of the next dNTP. This significantly slows further chain extension and gives time for the mispaired 3′ terminus to relocate from the polymerase site to the 3′→5′ exonuclease site, where the mispaired nucleotide is quickly removed. The DNA can then move back to the polymerase site, allowing the polymerase to have another try.
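
The proofreading cycle described above (a mispaired 3′ terminus is excised, and the polymerase then tries again) can be sketched with the same kind of toy string model. The function name proofread and the convention of writing the template 3′→5′ so that it aligns with the primer are assumptions made for illustration; the model checks only the terminal base pair and ignores kinetics entirely.

```python
# Watson-Crick partners for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def proofread(template_3to5, primer_5to3):
    """Excise the primer's 3'-terminal nucleotide if it is mispaired.

    Returns (primer, released), where released is the excised nucleotide
    (leaving as a dNMP in the real enzyme) or None if the terminus was
    correctly paired and nothing was removed.
    """
    if not primer_5to3:
        return primer_5to3, None
    if len(primer_5to3) > len(template_3to5):
        raise ValueError("primer is longer than the template")
    last = len(primer_5to3) - 1
    terminal = primer_5to3[last]
    if PAIR[template_3to5[last]] == terminal:
        return primer_5to3, None        # correct pair: leave it alone
    return primer_5to3[:-1], terminal   # mismatch: remove from the 3' end


if __name__ == "__main__":
    template = "TACGG"                   # written 3'->5'
    print(proofread(template, "ATGA"))   # terminal A opposite G is excised
    print(proofread(template, "ATGC"))   # terminal C opposite G is kept
```

After an excision the primer again ends in a correctly paired 3′ terminus, so the polymerase can attempt the same position a second time.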

When base selection and proofreading are combined, Pol I leaves behind one net error for every 10⁶ to 10⁸ nucleotide additions. The DNA polymerase involved in chromosome replication has a similar error rate. How accurate must a DNA polymerase be to replicate the E. coli genome without making a mistake? Replication of the 4.6 × 10⁶ bp (4.6 Mbp) chromosome requires polymerization of 9.2 × 10⁶ dNTPs. An error rate of about 1 in 10⁷ would result in only one incorrect nucleotide insertion per cell division. In fact, the observed accuracy of the overall replication process in E. coli is one error in 10⁹ to 10¹⁰ polymerization events. The additional accuracy derives from a repair system that recognizes and removes mismatches that escape both the polymerase and the proofreading exonuclease activities (see Chapter 12). At this level of accuracy, only a single error is acquired in every 100 to 1,000 new cells.
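
The arithmetic in this paragraph can be checked with a few lines of code. The genome size and error rates are those quoted in the text; the function names are hypothetical, and the calculation assumes that errors are independent and equally likely at every position.

```python
GENOME_BP = 4_600_000                  # E. coli chromosome, base pairs
NUCLEOTIDES_COPIED = 2 * GENOME_BP     # both strands are synthesized


def combined_error_rate(base_selection_rate, proofreading_fold):
    """Error rate after proofreading improves base selection by some fold."""
    return base_selection_rate / proofreading_fold


def expected_errors(error_rate):
    """Expected misincorporations per round of chromosome replication."""
    return NUCLEOTIDES_COPIED * error_rate


def cells_per_error(error_rate):
    """Average number of new cells produced per single error."""
    return 1.0 / expected_errors(error_rate)


if __name__ == "__main__":
    # Base selection (1e-4 to 1e-5) improved 1e2- to 1e3-fold by proofreading.
    print(combined_error_rate(1e-4, 1e2), combined_error_rate(1e-5, 1e3))
    # Polymerase alone versus the full replication process.
    for rate in (1e-7, 1e-9, 1e-10):
        print(rate, expected_errors(rate), cells_per_error(rate))
```

At 1 in 10⁷ the model gives a little under one error per replication, and at 1 in 10⁹ to 10¹⁰ it gives one error per roughly 110 to 1,100 cells, in line with the range of 100 to 1,000 quoted above.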

The 5′→3′ Exonuclease Pol I also has a second exonuclease that degrades DNA in the 5′→3′ direction, the same direction as DNA synthesis. The 5′→3′ exonuclease is unique to Pol I and reflects the enzyme’s role in DNA repair. Pol I performs a host of clean-up functions during replication, recombination, and repair, which require the trimming of single-stranded DNA ends and the removal of RNA primers or DNA lesions. Both exonucleases of Pol I are applied to these tasks. However, only the 5′→3′ exonuclease is capable of functioning at the same time as the polymerase, because the two activities act in the same direction. As the 5′→3′ exonuclease degrades a DNA or RNA strand in the duplex, the polymerase simultaneously adds dNTPs behind it. This concerted action of 5′→3′ excision and DNA polymerization is called nick translation (shown in Figure 11-7 for removal of an RNA primer). “Nick translation” should not be confused with the term “translation,” the process of protein synthesis. “Nick translation” is simply a descriptive term that refers to the fact that a nick in the DNA gets translated (moved) along the length of the strand by repeated cycles of excision and polymerization. The nick in DNA during nick translation is a discontinuity in the phosphodiester backbone between the 3′ hydroxyl of one nucleotide and the 5′ phosphate of the adjacent nucleotide. After Pol I dissociates from DNA, the nick is sealed by DNA ligase (see Section 11.3).
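
Nick translation can be pictured as a nick sliding along a strand while RNA is replaced by DNA. The sketch below is a toy model under several assumptions: RNA nucleotides are shown in lowercase and DNA in uppercase, the template is written 3′→5′ so that it aligns with the strand being repaired, the function name nick_translate is hypothetical, and the model stops as soon as the RNA is gone (the real enzyme is not constrained this precisely).

```python
# Watson-Crick partners for the four DNA bases.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def nick_translate(template_3to5, upstream, downstream):
    """Toy model of Pol I removing an RNA primer by nick translation.

    upstream:   DNA ending at the nick with a free 3'-OH (written 5'->3')
    downstream: fragment beginning with an RNA primer in lowercase
    Returns the new upstream strand, the trimmed downstream strand, and
    the list of nick positions visited (index of the gap on the template).
    """
    nick_positions = []
    while downstream and downstream[0].islower():
        downstream = downstream[1:]                 # 5'->3' exonuclease: one rNMP released
        template_base = template_3to5[len(upstream)]
        upstream += PAIR[template_base]             # polymerase: one dNMP added
        nick_positions.append(len(upstream))        # the nick has moved one step 5'->3'
    return upstream, downstream, nick_positions


if __name__ == "__main__":
    template = "TACGGTCAAG"      # written 3'->5'
    upstream = "ATGC"            # DNA, ends at the nick
    downstream = "cagTTC"        # RNA primer "cag" followed by DNA
    print(nick_translate(template, upstream, downstream))
```

In the example the nick starts after position 4 and ends after position 7, leaving an all-DNA strand with a single nick for DNA ligase to seal.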

Figure 11-7: Nick translation by Pol I. DNA polymerase I (Pol I) is organized into three major domains: DNA polymerase, 3′→5′ proofreading exonuclease, and 5′→3′ exonuclease. At a nick—here, the gap between lagging-strand fragments—Pol I degrades the RNA primer in the 5′→3′ direction, releasing rNMPs, and simultaneously extends the 3′ terminus with dNTPs in the same direction. The net result is movement of the nick in the 5′→3′ direction along the DNA until all RNA is removed. DNA ligase can then seal the fragments (not shown here).

Five E. coli DNA Polymerases Function in DNA Replication and Repair

Escherichia coli contains five different DNA polymerases (Table 11-1). The large excess of intracellular Pol I delayed the discovery of the other DNA polymerases. Then, in the 1970s, DNA polymerase II (Pol II) and DNA polymerase III (Pol III) were discovered in studies using a mutant strain of E. coli called polA, which is depleted of most Pol I. These studies showed that Pol III is the DNA polymerase that replicates the chromosome; it is sometimes referred to as a replicase, or chromosomal replicase. Pol II seems to be involved in DNA repair.

Pol IV and Pol V were not discovered until 1999. They are different from the other DNA polymerases in that they lack a 3′→5′ proofreading exonuclease and thus often incorporate the wrong nucleotide. These low-fidelity polymerases are produced in cells when the DNA sustains damage that stalls the replication fork (see Chapter 12). The low accuracy of Pol IV and Pol V enables them to insert an incorrect nucleotide opposite a damaged template base. Although this results in an error, it gets the replication fork moving again. The ability of the replication fork to move past a damaged site is a matter of life and death, and all cells—bacterial, archaeal, and eukaryotic alike—contain these error-prone “translesion” DNA polymerases.

DNA Polymerase Structure Reveals the Basis for Its Accuracy

The crystal structure of E. coli Pol I resembles a right hand, with domains referred to as the palm, thumb, and fingers. All DNA polymerases have these same structural features. The bound DNA lies on the palm domain, which contains the polymerase active site and is the most conserved feature among all DNA polymerases. The fingers domain contains the dNTP-binding site, and the thumb domain partially curves around the duplex portion of the primed template, tightening the grip on DNA (Figure 11-8).

Figure 11-8: The structure of Pol I. (a) Crystal structure of Thermus aquaticus Pol I bound to DNA. All DNA polymerases are shaped like a right hand and have domains referred to as fingers, thumb, and palm. The 3′→5′ exonuclease is in a separate domain from the polymerase active site. E. coli Pol I (not shown) also has a 5′→3′ exonuclease. (b) The 3′ terminus of the DNA binds to the palm, but neither the duplex DNA nor the template 5′ single-stranded DNA enters the cleft between the fingers and thumb. The dNTP-binding site is located in the fingers. (c) A mispaired 3′ terminus frays by about four nucleotides to insert into the 3′→5′ exonuclease site.

A dramatic conformational change occurs on the binding of a correct dNTP to the fingers domain. The domain rotates inward about 40°, carrying the dNTP down to the DNA template. In so doing, the enzyme forms an active-site enclosure with a shape that corresponds to a correct base pair. This conformation of Pol I is often referred to as the closed form, to distinguish it from the open form prior to dNTP binding (Figure 11-9). The A=T and G≡C base pairs have similar shapes, and either pair fits into the active-site cavity of the closed form. However, incorrect base pairs cannot be accommodated, and this prevents Pol I from completely closing around the DNA (Figure 11-10). Only when Pol I is fully closed do the catalytic moieties at the active site properly align for rapid catalysis. Thus, the accuracy of the polymerase functions at the level of shape recognition and provides a classic example of an induced-fit catalytic mechanism (see Chapter 5).

Figure 11-9: Open and closed forms of Pol I. (a) In the open form, a dNTP (red) binds to the fingers domain. (b) In the closed form, the fingers domain undergoes a 40° rotation that moves the dNTP into base-pairing position with the template and forms an active-site cavity that fits the shape of a correct Watson-Crick base pair.

Figure 11-10: Base pairing in the Pol I active-site cavity. The shape of the active site in the closed conformation. (a) Correct G≡C and A=T base pairs fit into the active site. (b) Incorrect base pairs do not fit.

Incorporation of an incorrect base pair is 10- to 1,000-fold slower than incorporation of a correct base pair, which causes the polymerase to pause prior to incorporation of an incorrect nucleotide. This kinetic pause gives time for the incorrect dNTP to dissociate from Pol I and for Pol I to bind another dNTP. Even when a mismatched dNTP is incorporated, the proofreading 3′→5′ exonuclease usually excises it. The 3′→5′ exonuclease is a separate domain located 20 to 30 Å from the polymerase active site. Stalling of catalysis by an incorrectly incorporated dNTP buys time for the mismatched primer strand terminus to relocate to the 3′→5′ exonuclease domain for proofreading (see Figure 11-8). The slow incorporation yet rapid removal of a mispaired dNTP underlies the inherent accuracy of DNA polymerases. Accuracy is further enhanced by a vastly diminished rate of dNTP incorporation at a mismatched 3′ terminus (Figure 11-11).

Figure 11-11: Favored incorporation of a correct dNTP over an incorrect dNTP. When an incorrect dNTP enters the insertion site on Pol I, binding is readily reversed (back arrow). In the rare instance of incorporation of the incorrect dNTP, the process is slow due to the imperfect active-site fit in the closed form. In the favored (rapid) route, the mispaired 3′ terminus is shifted to the 3′→5′ exonuclease, and the mismatched nucleotide is excised to re-form the original primed site. This allows Pol I to insert the correct nucleotide on the second try. Binding and incorporation of a correct dNTP is rapid and paves the way for the next round of incorporation. If the incorrect nucleotide remains, the mismatched DNA is slow to act as substrate for the next round of dNTP incorporation.

The polymerase active site contains two magnesium ions that are held in place by conserved Asp residues (Figure 11-12a). One Mg²⁺ deprotonates the primer 3′-OH group to form the 3′-O⁻ nucleophile. The other binds the incoming dNTP and facilitates departure of the pyrophosphate leaving group. This two-metal-ion-catalyzed reaction is remarkable in that no amino acid side chain plays a direct role in catalysis; the two metal ions do it all. The 3′→5′ exonuclease uses a similar two-metal-ion mechanism, in which one Mg²⁺ deprotonates H₂O to form the nucleophile (HO⁻) for hydrolysis of the 3′ dNMP, and the other promotes departure of the leaving group by stabilizing the charge of the dNMP product (Figure 11-12b). The exclusive use of metal ions to catalyze DNA synthesis suggests that the first polymerase may have been an RNA molecule, as RNA is highly effective in the coordination of metal ions (as discussed in Chapter 16).

Figure 11-12: The role of metal ions in DNA polymerase catalysis. (a) The DNA polymerase active site contains two divalent metal ions (Mg²⁺) held in place by residues in the palm, including conserved Asp residues. The two ions play an essential role in catalysis, as described in the text. (b) The 3′→5′ exonuclease also uses a two-metal-ion-catalyzed reaction.

Surprisingly, certain mutations result in DNA polymerases that are even more accurate than the wild-type polymerases. Many of these “antimutator” DNA polymerases have a hyperactive 3′→5′ exonuclease; they even excise perfectly good bases. Cells with antimutator DNA polymerases display lower levels of spontaneous mutation. Why haven’t cells with the more accurate DNA polymerases been selected by evolution? The energy cost of using only a highly accurate polymerase must outweigh the benefit; spontaneous mutation also provides variation within a population, which is a necessary component of evolution.

Processivity Increases the Efficiency of DNA Polymerase Activity

During DNA synthesis, the new terminal base pair must be moved to the postinsertion site so that the next template nucleotide lies in the insertion site for addition of the next dNTP. To accomplish this repositioning, a polymerase can take either of two paths. It can fully dissociate from the DNA and rebind the primer strand terminus in the postinsertion site; this dissociation followed by rebinding at each nucleotide addition is referred to as distributive synthesis. Alternatively, the polymerase may simply slide forward one base pair along the DNA to reposition the 3′ terminus, without dissociating from the DNA. When a polymerase remains attached to DNA during multiple catalytic cycles, the process is known as processive synthesis.

The “processivity number” is the average number of nucleotides incorporated before the enzyme dissociates from the DNA. Processivity can result in exceedingly efficient polymerization, because much time is wasted by a dissociated polymerase in locating and rebinding a 3′ primer strand terminus. Pol I has a processivity number of 10 to 100 nucleotides, depending on conditions. In contrast, Pol III, like most DNA polymerases that replicate chromosomes (i.e., replicases), has a processivity number in the thousands, which, as we will see, results from a protein ring that encircles the DNA.
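
One simple way to think about the processivity number is as the mean of a geometric distribution. The sketch below assumes, purely for illustration, that the enzyme adds at least one nucleotide per binding event and then dissociates with a fixed probability after each addition; neither this model nor the function names come from the text, and real polymerases need not behave this uniformly.

```python
import random


def mean_processivity(p_dissociate):
    """Average nucleotides added per binding event in the geometric model."""
    if not 0 < p_dissociate <= 1:
        raise ValueError("p_dissociate must be in (0, 1]")
    return 1.0 / p_dissociate


def simulate_binding_event(p_dissociate, rng):
    """Count nucleotides added before the enzyme falls off the DNA."""
    added = 0
    while True:
        added += 1                        # one nucleotide incorporated
        if rng.random() < p_dissociate:   # enzyme dissociates after this addition
            return added


def simulated_mean(p_dissociate, trials=20000, seed=0):
    """Monte Carlo estimate of the processivity number (seeded for repeatability)."""
    rng = random.Random(seed)
    total = sum(simulate_binding_event(p_dissociate, rng) for _ in range(trials))
    return total / trials


if __name__ == "__main__":
    # p = 1.0 corresponds to fully distributive synthesis (one nucleotide per binding).
    for p in (1.0, 0.1, 0.01, 0.001):
        print(p, mean_processivity(p), simulated_mean(p, trials=2000))
```

In this model, the range of 10 to 100 quoted for Pol I corresponds to a dissociation probability of about 0.1 to 0.01 per addition, and a processivity in the thousands requires a probability below 0.001.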

SECTION 11.2 SUMMARY

  • DNA polymerases require a primed template and extend the 3′ terminus of the primer strand by reaction with dNTPs.

  • A dNTP that correctly base-pairs to the template strand is incorporated into the primer strand with the release of pyrophosphate. Pyrophosphate is hydrolyzed by pyrophosphatase, which reduces the concentration of pyrophosphate in the cell and makes the reverse reaction extremely unlikely.

  • DNA polymerases are inherently very accurate and are made even more accurate by a proofreading 3′→5′ exonuclease.

  • Pol I also has a 5′→3′ exonuclease that degrades DNA while the polymerase synthesizes DNA, in the process of nick translation.

  • E. coli has five DNA polymerases. Pol III is responsible for chromosome replication, and Pol I is used to remove RNA primers and fill in the resulting gaps with DNA. The other three (II, IV, and V) are involved in DNA repair and in moving replication forks past sites of DNA damage.

  • Binding of a dNTP to a DNA polymerase results in a large conformational change, yielding an active site in which only correct base pairs fit.

  • DNA polymerases often have high processivity, in which many nucleotides are added to a DNA chain in one polymerase-binding event.
