28.3 DNA Replication Is Highly Coordinated

DNA replication must be very rapid, given the sizes of the genomes and the rates of cell division. The E. coli genome contains 4.6 million base pairs and is copied in less than 40 minutes. Thus, about 2000 bases are incorporated per second. Enzyme activities must be highly coordinated to replicate entire genomes precisely and rapidly.
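The rate quoted above can be checked with a short calculation. The following Python sketch uses only the genome size and copying time given in the text; the division between two forks is an added assumption (replication proceeding in both directions from the single origin), included because it brings the overall figure close to the per-polymerase rate discussed later in this section.

```python
# Back-of-the-envelope check of the replication rate quoted in the text.
GENOME_BP = 4.6e6      # E. coli genome size in base pairs (from the text)
COPY_TIME_MIN = 40     # upper bound on the copying time in minutes (from the text)
FORKS = 2              # assumption: two replication forks moving in opposite directions


def incorporation_rate(genome_bp, minutes):
    """Average number of base pairs copied per second over the whole genome."""
    return genome_bp / (minutes * 60)


overall = incorporation_rate(GENOME_BP, COPY_TIME_MIN)
per_fork = overall / FORKS

print(f"overall: {overall:.0f} per second, per fork: {per_fork:.0f} per second")
```

Because 40 minutes is an upper bound on the copying time, the computed values are lower bounds on the true rates.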

We begin our consideration of the coordination of DNA replication by looking at E. coli, which has been extensively studied. For this organism with a relatively small genome, replication begins at a single site and continues around the circular chromosome. The coordination of eukaryotic DNA replication is much more complex because there are many initiation sites throughout the genome and an additional enzyme is needed to replicate the ends of linear chromosomes.

DNA replication requires highly processive polymerases

Processive enzyme

From the Latin procedere, “to go forward.”

An enzyme that catalyzes multiple rounds of the elongation or digestion of a polymer while the polymer stays bound. A distributive enzyme, in contrast, releases its polymeric substrate between successive catalytic steps.

Replicative polymerases are characterized by their very high catalytic potency, fidelity, and processivity. Processivity refers to the ability of an enzyme to catalyze many consecutive reactions without releasing its substrate. These polymerases are assemblies of many subunits that have evolved to grasp their templates and not let go until many nucleotides have been added. The source of the processivity was revealed by the determination of the three-dimensional structure of the β2 subunit of the E. coli replicative polymerase called DNA polymerase III (Figure 28.21). This unit keeps the polymerase associated with the DNA double helix. It has the form of a star-shaped ring. A 35-Å-diameter hole in its center can readily accommodate a duplex DNA molecule and leaves enough space between the DNA and the protein to allow rapid sliding during replication. Achieving a catalytic rate of 1000 nucleotides polymerized per second requires that 100 turns of duplex DNA (a length of 3400 Å, or 0.34 μm) slide through the central hole of β2 per second. Thus, β2 plays a key role in replication by serving as a sliding DNA clamp.
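The sliding distance can be reproduced from the polymerization rate. The sketch below assumes about 10 base pairs per helical turn and a rise of 3.4 Å per base pair; neither value is stated explicitly in the text, but both are implied by its figures (1000 nucleotides corresponding to 100 turns and 3400 Å). The conversion uses 1 Å = 10⁻⁴ μm.

```python
# How much duplex DNA must slide through the clamp each second?
BP_PER_TURN = 10             # assumption: ~10 bp per helical turn
RISE_PER_BP_ANGSTROM = 3.4   # assumption: 3.4 angstroms of helix length per bp
ANGSTROM_TO_MICROMETER = 1e-4


def dna_through_clamp(nt_per_second):
    """Return (helical turns, length in angstroms) of duplex passing per second."""
    turns = nt_per_second / BP_PER_TURN
    length_angstrom = nt_per_second * RISE_PER_BP_ANGSTROM
    return turns, length_angstrom


turns, length_angstrom = dna_through_clamp(1000)
length_um = length_angstrom * ANGSTROM_TO_MICROMETER

print(f"{turns:.0f} turns, {length_angstrom:.0f} angstroms, {length_um:.2f} micrometers per second")
```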

Figure 28.21: Structure of a sliding DNA clamp. The dimeric β subunit of DNA polymerase III forms a ring that surrounds the DNA duplex. Notice the central cavity through which the DNA template slides. Clasping the DNA molecule in the ring, the polymerase enzyme is able to move without falling off the DNA substrate.
[Drawn from 2POL.pdb.]

How does DNA become entrapped inside the sliding clamp? Replicative polymerases also include assemblies of subunits that function as clamp loaders. These enzymes grasp the sliding clamp and, utilizing the energy of ATP binding, pull apart one of the interfaces between the two subunits of the sliding clamp. DNA can move through the gap, inserting itself through the central hole. ATP hydrolysis then releases the clamp, which closes around the DNA.

The leading and lagging strands are synthesized in a coordinated fashion

Replicative polymerases such as DNA polymerase III synthesize the leading and lagging strands simultaneously at the replication fork (Figure 28.22). DNA polymerase III begins the synthesis of the leading strand starting from the RNA primer formed by primase. The duplex DNA ahead of the polymerase is unwound by a hexameric helicase called DnaB. Copies of single-stranded-binding protein (SSB) bind to the unwound strands, keeping the strands separated so that both strands can serve as templates. The leading strand is synthesized continuously by polymerase III. Topoisomerase II concurrently introduces right-handed (negative) supercoils to avert a topological crisis.

Figure 28.22: Replication fork. A schematic view of the arrangement of DNA polymerase III and associated enzymes and proteins present in the replication of DNA. The helicase separates the two strands of the parent double helix, allowing DNA polymerases to use each strand as a template for DNA synthesis. Abbreviation: SSB, single-stranded-binding protein.

The mode of synthesis of the lagging strand is necessarily more complex. As mentioned earlier, the lagging strand is synthesized in fragments so that 5′ → 3′ polymerization leads to overall growth in the 3′ → 5′ direction. Yet the synthesis of the lagging strand is coordinated with the synthesis of the leading strand. How is this coordination accomplished? Examination of the subunit composition of the DNA polymerase III holoenzyme reveals an elegant solution (Figure 28.23). The holoenzyme includes two copies of the polymerase core enzyme, which consists of the DNA polymerase itself (the α subunit); the ε subunit, a 3′-to-5′ proofreading exonuclease; another subunit called θ; and two copies of the dimeric β-subunit sliding clamp. The core enzymes are linked to a central structure having the subunit composition γτ2δδ′χψ. The γτ2δδ′ complex is the clamp loader, and the χ and ψ subunits interact with the single-stranded-DNA–binding protein. The entire apparatus interacts with the hexameric helicase DnaB. Eukaryotic replicative polymerases have similar, albeit slightly more complicated, subunit compositions and structures.

Figure 28.23: DNA polymerase holoenzyme. Each holoenzyme consists of two copies of the polymerase core enzyme, which comprises the α, ε, and θ subunits and two copies of the β subunit, linked to a central structure. The central structure includes the clamp-loader complex and the hexameric helicase DnaB.

The lagging-strand template is looped out so that it passes through the polymerase site in one subunit of a dimeric polymerase III in the same direction as that of the leading-strand template in the other subunit, allowing both new strands to be synthesized in the 5′ → 3′ direction. DNA polymerase III lets go of the lagging-strand template after adding about 1000 nucleotides by releasing the sliding clamp. A new loop is then formed, a sliding clamp is added, and primase again synthesizes a short stretch of RNA primer to initiate the formation of another Okazaki fragment. This mode of replication has been termed the trombone model because the size of the loop lengthens and shortens like the slide on a trombone (Figure 28.24).
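A rough sense of how often this cycle repeats follows from the numbers already given. The sketch below combines the fragment length of about 1000 nucleotides with the genome size and polymerization rate quoted earlier in the section. It assumes, as a simplification, that one genome's worth of new DNA is made discontinuously per round of replication; the results are order-of-magnitude estimates only.

```python
# Order-of-magnitude estimate of lagging-strand cycling in E. coli.
GENOME_BP = 4.6e6            # genome size (from the text)
FRAGMENT_NT = 1000           # approximate Okazaki fragment length (from the text)
POLYMERASE_NT_PER_S = 1000   # polymerization rate quoted earlier in the section

# Assumption: a genome's worth of nucleotides is synthesized as lagging strand
# during one complete round of replication.
fragments_per_round = GENOME_BP / FRAGMENT_NT
seconds_per_fragment = FRAGMENT_NT / POLYMERASE_NT_PER_S

print(f"about {fragments_per_round:.0f} Okazaki fragments per round of replication")
print(f"about {seconds_per_fragment:.0f} s of synthesis per fragment")
```

On these assumptions, the clamp must be released and reloaded roughly once per second at each fork.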

Figure 28.24: Trombone model. The replication of the leading and lagging strands is coordinated by the looping out of the lagging strand to form a structure that acts somewhat as a trombone slide, growing as the replication fork moves forward. When the polymerase on the lagging strand reaches a region that has been replicated, the sliding clamp is released and a new loop is formed.

The gaps between fragments of the nascent lagging strand are filled by DNA polymerase I. This essential enzyme also uses its 5′ → 3′ exonuclease activity to remove the RNA primer lying ahead of the polymerase site. The primer cannot be erased by DNA polymerase III, because the enzyme lacks 5′ → 3′ editing capability. Finally, DNA ligase connects the fragments.

DNA replication in Escherichia coli begins at a unique site

Figure 28.26: Assembly of DnaA. Monomers of DnaA bind to their binding sites (shown in yellow) in oriC and come together to form a complex structure, possibly the cyclic hexamer shown here. This structure marks the origin of replication and favors DNA strand separation in the AT-rich sites (green).

Figure 28.27: Prepriming complex. The AT-rich regions are unwound and trapped by the single-stranded-binding protein (SSB). The hexameric DNA helicase DnaB is loaded on each strand. At this stage, the complex is ready for the synthesis of the RNA primers and assembly of the DNA polymerase III holoenzyme.

In E. coli, DNA replication starts at a unique site within the entire 4.6 × 10⁶ bp genome. This origin of replication, called the oriC locus, is a 245-bp region that has several unusual features (Figure 28.25). The oriC locus contains five copies of a sequence that is the preferred binding site for the origin-recognition protein DnaA. In addition, the locus contains a tandem array of 13-bp sequences that are rich in AT base pairs. Several steps are required to prepare for the start of replication:

1. The binding of DnaA proteins to DNA is the first step in the preparation for replication. DnaA is a member of the P-loop NTPase family related to the hexameric helicases. Each DnaA monomer comprises an ATPase domain linked to a DNA-binding domain at its C-terminus. DnaA molecules are able to bind to each other through their ATPase domains; a group of bound DnaA molecules will break apart on the binding and hydrolysis of ATP. The binding of DnaA molecules to one another signals the start of the preparatory phase, and their breaking apart signals the end of that phase. The DnaA proteins bind to the five high-affinity sites in oriC and then come together with DnaA molecules bound to lower-affinity sites to form an oligomer, possibly a cyclic hexamer. The DNA is wrapped around the outside of the DnaA hexamer (Figure 28.26).

2. Single DNA strands are exposed in the prepriming complex. With DNA wrapped around a DnaA hexamer, additional proteins are brought into play. The hexameric helicase DnaB is loaded around the DNA with the help of the helicase loader protein DnaC. Local regions of oriC, including the AT regions, are unwound and trapped by the single-stranded-DNA–binding protein. The result of this process is the generation of a structure called the prepriming complex, which makes single-stranded DNA accessible to other proteins (Figure 28.27). Significantly, the primase, DnaG, is now able to insert the RNA primer.

3. The polymerase holoenzyme assembles. The DNA polymerase III holoenzyme assembles on the prepriming complex, initiated by interactions between DnaB and the sliding-clamp subunit of DNA polymerase III. These interactions also trigger ATP hydrolysis within the DnaA subunits, signaling the initiation of DNA replication. The breakup of the DnaA assembly prevents additional rounds of replication from beginning at the replication origin.
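The AT-rich 13-bp sequences in oriC are the regions most easily unwound, and such regions can be located computationally by scanning a sequence with a sliding window. The following Python sketch illustrates the idea. The test sequence is a made-up toy example, not the real oriC sequence, and the 80% threshold is an arbitrary illustrative choice.

```python
# Sliding-window scan for AT-rich 13-bp stretches.
def at_fraction(window):
    """Fraction of bases in the window that are A or T."""
    return sum(base in "AT" for base in window) / len(window)


def at_rich_windows(sequence, width=13, threshold=0.8):
    """Return (start, window, AT fraction) for every window at or above threshold."""
    sequence = sequence.upper()
    hits = []
    for start in range(len(sequence) - width + 1):
        window = sequence[start:start + width]
        fraction = at_fraction(window)
        if fraction >= threshold:
            hits.append((start, window, fraction))
    return hits


# Hypothetical toy sequence: a 13-bp AT-only stretch flanked by GC-rich DNA.
toy_sequence = "GCGCGGCCGC" + "ATTTATTAATATA" + "GGCCGCGGCC"

hits = at_rich_windows(toy_sequence)
best = max(hits, key=lambda hit: hit[2])
print(f"most AT-rich window starts at position {best[0]}: {best[1]}")
```

Overlapping windows on either side of the AT-only stretch also pass the threshold, so a practical scan would merge adjacent hits into a single region.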

Figure 28.25: Origin of replication in E. coli. The oriC locus has a length of 245 bp. It contains a tandem array of three nearly identical 13-nucleotide sequences (green) and five binding sites (yellow) for the DnaA protein.

DNA synthesis in eukaryotes is initiated at multiple sites

Replication in eukaryotes is mechanistically similar to replication in prokaryotes but is more challenging for a number of reasons. The first is sheer size: E. coli must replicate 4.6 million base pairs, whereas a human diploid cell must replicate more than 6 billion base pairs. Second, the genetic information for E. coli is contained on one chromosome, whereas, in human beings, 23 pairs of chromosomes must be replicated. Finally, whereas the E. coli chromosome is circular, human chromosomes are linear. Unless countermeasures are taken, linear chromosomes are subject to shortening with each round of replication.

The first two challenges are met by the use of multiple origins of replication. In human beings, replication requires about 30,000 origins of replication, with each chromosome containing several hundred. Each origin of replication is the starting site for a replication unit, or replicon. DNA replication can be monitored by single-molecule methods, revealing bidirectional synthesis from different sites (Figure 28.28). In contrast with E. coli, the origins of replication in human beings do not contain regions of sharply defined sequence. Instead, more broadly defined AT-rich sequences are the sites around which the origin recognition complexes (ORCs) are assembled.
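The figures quoted for human origins can be checked for consistency with a short calculation. The sketch below uses the approximate values from the text (about 6 billion base pairs, about 30,000 origins, 23 pairs of chromosomes) and assumes, for simplicity, that origins are spread evenly; real replicons vary considerably in size.

```python
# Consistency check on the number and spacing of human replication origins.
DIPLOID_GENOME_BP = 6e9   # approximate diploid genome size (from the text)
ORIGINS = 30_000          # approximate number of origins (from the text)
CHROMOSOMES = 23 * 2      # 23 pairs of chromosomes (from the text)

# Assumption: origins are evenly distributed across the genome.
bp_per_replicon = DIPLOID_GENOME_BP / ORIGINS
origins_per_chromosome = ORIGINS / CHROMOSOMES

print(f"average replicon size: about {bp_per_replicon / 1000:.0f} kb")
print(f"average origins per chromosome: about {origins_per_chromosome:.0f}")
```

The average of roughly 650 origins per chromosome agrees with the statement that each chromosome contains several hundred.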

Figure 28.28: Eukaryotic origins of replication. The image shows a single molecule of DNA containing two origins of replication. The origins were identified by labeling newly replicated DNA in human cells first with one thymidine analog (iodo-deoxyuridine, I-dU) and then another (chloro-deoxyuridine, Cl-dU). DNA molecules from these cells were then extended on a microscope slide and labeled with antibodies to I-dU (green) and Cl-dU (red) to visualize the DNA. This method allows the detection of replication origins as well as determination of the rate of DNA synthesis.

1. The assembly of the ORC is the first step in the preparation for replication. In human beings, the ORC is composed of six different proteins, each homologous to DnaA. These proteins come together to form a hexameric structure analogous to the assembly formed by DnaA.

2. Licensing factors recruit a helicase that exposes single strands of DNA. After the ORC has been assembled, additional proteins are recruited, including Cdc6, a homolog of the ORC subunits, and Cdt1. These proteins, in turn, recruit a hexameric helicase with six distinct subunits called Mcm2-7. These proteins, including the helicase, are sometimes called licensing factors because they permit the formation of the initiation complex. After the initiation complex has formed, Mcm2-7 separates the parental DNA strands, and the single strands are stabilized by the binding of replication protein A, a single-stranded-DNA–binding protein.

3. Two distinct polymerases are needed to copy a eukaryotic replicon. An initiator polymerase called polymerase α begins replication but is soon replaced by a more processive enzyme. This process is called polymerase switching because one polymerase has replaced another. This second enzyme, called DNA polymerase δ, is the principal replicative polymerase in eukaryotes (Table 28.1).

Name                                            Function

Prokaryotic Polymerases
   DNA polymerase I                             Erases primer and fills in gaps on lagging strand
   DNA polymerase II (error-prone polymerase)   DNA repair
   DNA polymerase III                           Primary enzyme of DNA synthesis

Eukaryotic Polymerases
   DNA polymerase α                             Initiator polymerase
      Primase subunit                           Synthesizes the RNA primer
      DNA polymerase unit                       Adds stretch of about 20 nucleotides to the primer
   DNA polymerase β (error-prone polymerase)    DNA repair
   DNA polymerase δ                             Primary enzyme of DNA synthesis

Table 28.1: Some types of DNA polymerases

Figure 28.29: Eukaryotic cell cycle. DNA replication and cell division must take place in a highly coordinated fashion in eukaryotes. Mitosis (M) takes place only after DNA synthesis (S). Two gaps (G1 and G2) in time separate the two processes.

Replication begins with the binding of DNA polymerase α. This enzyme includes a primase subunit, used to synthesize the RNA primer, as well as an active DNA polymerase. After this polymerase has added a stretch of about 20 deoxynucleotides to the primer, another replication protein, called replication factor C (RFC), displaces DNA polymerase α. Replication factor C attracts a sliding clamp called proliferating cell nuclear antigen (PCNA), which is homologous to the β2 subunit of E. coli polymerase III. The binding of PCNA to DNA polymerase δ renders the enzyme highly processive and suitable for long stretches of replication. Replication continues in both directions from the origin of replication until adjacent replicons meet and fuse. RNA primers are removed and the DNA fragments are ligated by DNA ligase.

The use of multiple origins of replication requires mechanisms for ensuring that each sequence is replicated once and only once. The events of eukaryotic DNA replication are linked to the eukaryotic cell cycle (Figure 28.29). The processes of DNA synthesis and cell division are coordinated in the cell cycle so that the replication of all DNA sequences is complete before the cell progresses into the next phase of the cycle. This coordination requires several checkpoints that control the progression along the cycle. Members of a family of small proteins termed cyclins are synthesized and then degraded by proteasomal digestion in the course of the cell cycle. Cyclins act by binding to specific cyclin-dependent protein kinases and activating them. One such kinase, cyclin-dependent kinase 2 (cdk2), binds to assemblies at origins of replication and regulates replication through a number of interlocking mechanisms.

Telomeres are unique structures at the ends of linear chromosomes

Whereas the genomes of essentially all prokaryotes are circular, the chromosomes of human beings and other eukaryotes are linear. The free ends of linear DNA molecules introduce several complications that must be resolved by special enzymes. In particular, complete replication of DNA ends is difficult because polymerases act only in the 5′ → 3′ direction. The lagging strand would have an incomplete 5′ end after the removal of the RNA primer. Each round of replication would further shorten the chromosome.

The first clue to how this problem is resolved came from sequence analyses of the ends of chromosomes, which are called telomeres (from the Greek telos, “an end”). Telomeric DNA contains hundreds of tandem repeats of a six-nucleotide sequence. One of the strands is G rich at the 3′ end, and it is slightly longer than the other strand. In human beings, the repeating G-rich sequence is AGGGTT.

The structure adopted by telomeres has been extensively investigated. Evidence suggests that they may form large duplex loops (Figure 28.30). The single-stranded region at the very end of the structure has been proposed to loop back to form a DNA duplex with another part of the repeated sequence, displacing a part of the original telomeric duplex. This loop-like structure is formed and stabilized by specific telomere-binding proteins. Such structures would nicely mask and protect the end of the chromosome.

Figure 28.30: Proposed model for telomeres. A single-stranded segment of the G-rich strand extends from the end of the telomere. In one model for telomeres, this single-stranded region invades the duplex to form a large duplex loop.

Telomeres are replicated by telomerase, a specialized polymerase that carries its own RNA template

How are the repeated sequences generated? An enzyme, termed telomerase, that executes this function has been purified and characterized. When a primer ending in GGTT is added to human telomerase in the presence of deoxynucleoside triphosphates, the sequences GGTTAGGGTT and GGTTAGGGTTAGGGTT, as well as longer products, are generated. Elizabeth Blackburn and Carol Greider discovered that the enzyme adding the repeats contains an RNA molecule that serves as the template for the elongation of the G-rich strand (Figure 28.31). Thus, telomerase carries the information necessary to generate the telomere sequences. The exact number of repeated sequences is not crucial.
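The products described above can be reproduced with a small model of repeat addition. The Python sketch below treats telomerase simply as a device that extends the 3′ end of a primer in register with the human repeat AGGGTT. It models only the DNA product, not the complementary RNA template or the enzyme's translocation steps, and the function names are illustrative.

```python
# Toy model of telomerase extending the G-rich strand in register with AGGGTT.
HUMAN_REPEAT = "AGGGTT"  # human G-rich telomeric repeat (from the text)


def next_repeat_position(primer, repeat=HUMAN_REPEAT):
    """Find where in the repeat the next added nucleotide falls.

    The longest suffix of the primer that matches the tandem repeat fixes
    the register; if nothing matches, extension starts at position 0.
    """
    doubled = repeat * 2
    for size in range(min(len(primer), len(repeat)), 0, -1):
        index = doubled.find(primer[-size:])
        if index != -1:
            return (index + size) % len(repeat)
    return 0


def extend(primer, nucleotides, repeat=HUMAN_REPEAT):
    """Add the given number of nucleotides to the 3' end of the primer."""
    offset = next_repeat_position(primer, repeat)
    added = "".join(repeat[(offset + i) % len(repeat)] for i in range(nucleotides))
    return primer + added


print(extend("GGTT", 6))    # one repeat added
print(extend("GGTT", 12))   # two repeats added
```

Starting from a primer ending in GGTT, one and two rounds of repeat addition give GGTTAGGGTT and GGTTAGGGTTAGGGTT, the products named in the text.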

Figure 28.31: Telomere formation. Mechanism of synthesis of the G-rich strand of telomeric DNA. The RNA template of telomerase is shown in blue and the nucleotides added to the G-rich strand of the primer are shown in red.
[Information from E. H. Blackburn, Nature 350:569–573, 1991.]

Subsequently, a protein component of telomerases also was identified. This component is related to reverse transcriptases, enzymes first discovered in retroviruses that copy RNA into DNA (Section 5.2). Thus, telomerase is a specialized reverse transcriptase that carries its own template. Telomerase is generally expressed at high levels only in rapidly growing cells. Thus, telomeres and telomerase can play important roles in cancer-cell biology and in cell aging.

Because cancer cells express high levels of telomerase, whereas most normal cells do not, telomerase is a potential target for anticancer therapy. A variety of approaches for blocking telomerase expression or blocking its activity are under investigation for cancer treatment and prevention.