11.4 INITIATION OF DNA REPLICATION

The site (or multiple sites) on a chromosome where replication is initiated is called the origin. In bacteria, it is the primary point at which regulatory mechanisms control DNA replication. Control of initiation is more complex in eukaryotic cells than in bacteria, because eukaryotes have numerous origins on each chromosome. The total length of DNA replicated from one origin is called a replicon. Many bacteria have only one origin, and the replicon is the entire chromosome. In eukaryotic chromosomes, each replicon is the section of DNA replicated from one of its many origins.

Early genetic studies by François Jacob and his coworkers showed that replication starts at a particular place on the DNA, which they termed a replicator (now known as an origin). Genetic studies have since revealed numerous genes encoding the proteins needed for replication. These proteins fall into two classes: those that affect initiation and those that affect progression of the replication fork.

The two classes of proteins were identified by the speed at which their depletion, in E. coli mutants, affected DNA synthesis (Figure 11-28). Temperature-sensitive mutants for these proteins incorporated [3H]thymidine into DNA at a permissive temperature, but eventually ceased DNA synthesis at a nonpermissive temperature. Temperature-sensitive mutations in genes encoding proteins that play a direct role in the replisome caused an abrupt end, or “fast stop,” of DNA synthesis when the mutant cells were transferred to nonpermissive growth conditions. For other genes, however, a “slow stop” of replication was observed, which suggested that these genes encode factors needed for initiation, not for fork progression. These “slow-stop” mutants allowed already-initiated DNA synthesis to continue at the nonpermissive temperature until replication of the chromosome was finished.

Figure 11-28: Two types of replication genes revealed through genetic studies. Temperature-sensitive mutants of E. coli were analyzed for the time needed for replication to stop after shifting cells to a nonpermissive temperature. DNA replication was observed by the uptake of [3H]thymidine into cellular DNA (measured as counts per minute, cpm) at a permissive temperature (30° C, open circles) and nonpermissive temperature (40° C, solid circles). (a) A gene giving a fast-stop phenotype encodes a protein involved in progression of the replication fork. (b) A gene giving a slow-stop phenotype encodes a protein involved in the initiation of replication.
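
The contrast between the two phenotypes can be illustrated with a toy calculation. The sketch below is not derived from the experimental data. It assumes an asynchronous culture in which every cell has one round of replication in progress at the moment of the temperature shift, that progress through the round is spread uniformly across cells, and that a round takes a hypothetical 40 minutes. The function name and parameters are illustrative only.

```python
def fraction_still_synthesizing(minutes_after_shift, phenotype, round_minutes=40.0):
    """Fraction of cells still making DNA after the shift to nonpermissive temperature.

    Illustrative assumptions: every cell is mid-round at the shift, and
    progress through the round is uniformly distributed across the culture.
    """
    if minutes_after_shift < 0:
        raise ValueError("time must be non-negative")
    if phenotype == "fast-stop":
        # Fork progression fails at once, so synthesis halts in every cell.
        return 1.0 if minutes_after_shift == 0 else 0.0
    if phenotype == "slow-stop":
        # Existing forks finish their round, but no new rounds are initiated.
        return max(0.0, 1.0 - minutes_after_shift / round_minutes)
    raise ValueError("phenotype must be 'fast-stop' or 'slow-stop'")
```

Under these assumptions, a slow-stop culture winds down gradually over one round of replication, whereas a fast-stop culture ceases synthesis as soon as the shift occurs.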

The initiator protein, which binds specific sites at the origin, is an example of a protein encoded by a slow-stop gene. Binding of the initiator protein to an origin provides a foothold for other proteins to bind, and often results in strand separation in a small region of DNA at the origin. Helicases are assembled at the unwound region, paving the way for more extensive DNA unwinding and the assembly of bidirectional replication forks.

Replication from an origin is a carefully orchestrated and controlled process that involves many different proteins. We first examine initiation at the E. coli origin, which is understood in great detail and serves to outline the basic events involved in initiation of replication in all cells. Then we describe our current understanding of how replication from the multiple origins of eukaryotic chromosomes is initiated and controlled.

Assembly of the Replication Fork Follows an Ordered Sequence of Events

The single E. coli origin, called oriC, was identified by a genetic technique using a recombinant DNA plasmid (see the How We Know section at the end of this chapter). The minimal E. coli origin is 245 bp long and contains four copies of a nine-nucleotide (9-mer) consensus sequence (called R sites, for “repeat”) to which the bacterial initiator protein, called DnaA, binds (Figure 11-29). To one side of the DnaA 9-mer sites are three A=T-rich direct repeats (sequences repeated with the same directionality) of 13 bp each. These A=T-rich repeats, referred to as the DNA unwinding element, are the first area in oriC to unwind after the initiator binds. Many origins of replication contain A=T-rich repeats that probably function in a similar way.

Figure 11-29: Structural elements of the E. coli origin. The E. coli origin, oriC, contains four 9-mer DNA sites (blue arrows) that bind the DnaA initiator protein. A possible fifth site deviates from the consensus sequence (not shown). The arrowheads indicate the relative direction of the 9-mers. The three 13-mer direct repeats are A=T-rich and are the locus of initial DNA strand separation (orange arrows). The oriC sequence also contains 11 GATC sites that are methylated by the Dam methylase (methylation sites indicated by arrowheads).

The E. coli initiator protein, DnaA, is a member of the AAA+ family. Like most AAA+ proteins, it binds and hydrolyzes ATP, although turnover is very slow. After one DnaA binds the origin, it oligomerizes and wraps the origin DNA around the oligomer (Figure 11-30, step 1). In the presence of ATP, DnaA destabilizes the A=T-rich 13-mer repeats, forming a single-stranded DNA bubble. Formation of this bubble is stimulated by HU, a small, basic, histonelike protein. Because the DnaA-ATP-oriC-HU complex forms a bubble at the origin, it is referred to as the open complex.

Figure 11-30: Activation of oriC and assembly of bacterial replication forks.

The DNA bubble in the open complex is the site where two hexamers of DnaB helicase are assembled (see Figure 11-30, step 2). Interaction of DnaB helicase with the DnaA initiator protein helps target DnaB to the origin, but it also requires a helicase-loading protein to assemble onto single-stranded DNA in the open complex. The helicase-loading protein is DnaC, another AAA+ protein. The helicase-loading activity of DnaC is thought to function by prying open the hexameric DnaB ring and slipping it onto the single-stranded DNA at the bubble. The ATP-bound form of DnaC binds DnaB helicase tightly and represses its helicase activity. Hydrolysis of ATP ejects DnaC from DnaB helicase, releasing DnaB for its DNA unwinding activity. Two hexamers of DnaB helicase are assembled on the origin, one on each strand of the single-stranded bubble. This assembled group of proteins on the DNA at oriC is referred to as the prepriming complex.

With the addition of ATP, the DnaB helicases translocate and unwind the DNA, dislodging the DnaA protein (see Figure 11-30, step 3). Unwinding generates positive supercoil stress in the DNA ahead of the replication fork, and this stress must be removed by topoisomerase action (e.g., gyrase). The newly unwound DNA is coated with SSB. Before DNA synthesis can begin, RNA primers must be synthesized by primase. The first RNA primer can be formed when the bubble grows to 100 to 200 bp. An RNA primer on each strand directs β clamp loading and the assembly of a Pol III core–β clamp complex within the holoenzyme (step 4). Each Pol III holoenzyme extends its RNA primer until each of them connects with the DnaB helicases traveling in the same direction. The coupled helicase-polymerase now moves rapidly, producing single-stranded DNA for primase to act upon, followed by clamp loading and engagement of another Pol III core–β clamp complex within each holoenzyme (step 5). This completes the assembly of two bidirectional replication forks at oriC.
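
The ordered nature of fork assembly at oriC can be summarized as a simple sequence in which each stage depends on the one before it. The sketch below restates the steps described above as a Python list and a small helper. The stage labels and the `advance` function are illustrative names chosen here, not standard terminology beyond "open complex" and "prepriming complex."

```python
# Stages of fork assembly at oriC, in the order described in the text.
ASSEMBLY_ORDER = [
    ("open complex", "DnaA-ATP, aided by HU, melts the A=T-rich 13-mers"),
    ("prepriming complex", "DnaC loads two DnaB hexamers onto the bubble"),
    ("unwinding", "DnaB unwinds DNA; gyrase relieves supercoils; SSB coats single strands"),
    ("priming", "primase synthesizes an RNA primer on each strand"),
    ("replication forks", "Pol III core-beta clamp complexes assemble and couple to DnaB"),
]


def advance(completed):
    """Return the next stage, given the stages already completed in order.

    Raises ValueError if the completed stages skip or reorder a step.
    Returns None once every stage is done.
    """
    names = [name for name, _ in ASSEMBLY_ORDER]
    if completed != names[:len(completed)]:
        raise ValueError("stages must be completed in the documented order")
    if len(completed) == len(names):
        return None
    return names[len(completed)]
```

For example, `advance(["open complex"])` returns `"prepriming complex"`, whereas `advance(["priming"])` raises an error because the helicase has not yet been loaded.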

Replication Initiation in E. coli Is Controlled at Multiple Steps

Cell division requires sufficient nutrients and cell mass to support two new cells, so replication must be coordinated with the cell’s nutritional status and growth. Regulation occurs at the initiation step, because once replication has begun, the cell is committed to division. It is also of paramount importance that the origin, once replicated, can be inactivated to prevent a second round of replication during the first round, which would commit the cell to splitting twice (resulting in four cells).

Binding of the DnaA initiator protein at oriC is a central point at which initiation is controlled. One mechanism for controlling initiation at oriC is through DNA methylation. Both strands of the palindromic sequence GATC are recognized by the enzyme Dam methylase (DNA adenine methyltransferase), which methylates the N6 position of A residues on both strands of the GATC site. If the four bases occurred at random, a GATC sequence would appear on average once every 256 bp (4 × 4 × 4 × 4), yet the 245 bp oriC contains 11 GATC sites (see Figure 11-29). Immediately after a GATC site is replicated, the new strand is not yet methylated and the GATC site is thus hemimethylated. This hemimethylated state of newly replicated DNA is only temporary, lasting until Dam methylase acts, but the high density of GATC sites in oriC delays complete methylation. The SeqA protein (Seq for sequestration) binds specifically to hemimethylated GATC sites and thereby sequesters the newly replicated oriC, preventing DnaA from rebinding the replicated origin and initiating another replication event (Figure 11-31a). Dam methylase, working between SeqA dissociation-reassociation cycles, eventually methylates the GATC sites in oriC, and this blocks SeqA binding and opens up the origin to DnaA binding once more.
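
The enrichment of GATC sites in oriC can be checked with a short calculation. The sketch below assumes that the four bases are equally frequent and independent, which is a simplification for any real genome. The function name `expected_sites` is illustrative.

```python
def expected_sites(motif, region_bp):
    """Expected occurrences of `motif` in a random sequence of length region_bp.

    Assumes the four bases are equally frequent and independent.
    """
    windows = region_bp - len(motif) + 1   # possible start positions
    return windows / 4 ** len(motif)


spacing = 4 ** len("GATC")               # average spacing of a 4-bp site: 256 bp
expected = expected_sites("GATC", 245)   # expected sites in the 245 bp oriC
enrichment = 11 / expected               # 11 observed sites versus the expectation
```

Under this assumption, slightly fewer than one GATC site is expected by chance in a 245 bp segment, so the 11 sites in oriC represent a more than tenfold enrichment.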

Figure 11-31: Regulation of the E. coli origin. Initiation at the E. coli origin, oriC, is regulated in several ways. (a) SeqA protein binds hemimethylated DNA and sequesters the newly replicated origin, preventing DnaA binding. (b) DnaA-ADP, formed when DnaA hydrolyzes its ATP, cannot destabilize the A=T-rich region to maintain the open complex containing a single-stranded DNA bubble, thus forming a closed complex in which the bubble has collapsed. (c) The Hda protein binds the β clamp on the DNA, causing DnaA to hydrolyze its ATP and become inactive (DnaA-ADP). (d) RNA polymerase produces superhelical tension that promotes DnaA-induced melting of the A=T-rich region.

Initiation depends on the nucleotide-bound state of DnaA, which uses the energy of ATP binding to form the open complex at oriC (Figure 11-31b). When replication forks dislodge DnaA from the origin, it can rebind. But DnaA hydrolyzes the bound ATP after initiation, and even though ADP-DnaA can rebind the origin, it is unable to form the open complex, thus preventing reinitiation. Important to this regulatory step is that the exchange of free ATP for bound ADP on DnaA is slow, requiring up to half an hour—time for the cell division cycle to finish. ATP hydrolysis by DnaA is ensured by the Hda protein (Figure 11-31c). After replication forks start moving, Hda binds the β sliding clamp and stimulates ATP hydrolysis by DnaA, thus inactivating DnaA.

The number of DnaA-binding sites in the cell may also play a role in the control of reinitiation. The chromosome contains numerous DnaA-binding sites in various promoters, because DnaA is also a transcription regulator. In aggregate, these other DnaA sites far outnumber the few at the origin. Therefore, as chromosome duplication proceeds, the total number of DnaA-binding sites doubles, and these sites may act as a sink to lower the free DnaA available for binding to oriC.

In laboratory experiments, the RNA polymerase inhibitor rifampicin blocks replication in cells, suggesting that RNA polymerase, too, plays a role in chromosome replication. As it unwinds DNA ahead of the sequence it is transcribing, RNA polymerase creates supercoil strain in the DNA template. When RNA polymerase is near the origin, it stimulates the initiation of replication through this supercoil strain, probably by helping DnaA destabilize the A=T-rich 13-mer repeats involved in forming the open complex bubble (Figure 11-31d).

Despite these several layers of control, there can be instances in which a bacterial cell is able to initiate replication again before the first cycle is complete. This occurs in times of abundant nutrients. However, as we will see shortly, eukaryotes have elaborate control schemes to prevent re-replication at origins until the first cycle of genome duplication is complete.

Eukaryotic Origins “Fire” Only Once per Cell Cycle

The much greater DNA content of eukaryotes, coupled with their slower replication forks, necessitates multiple origins on each chromosome to allow complete replication in the 24-hour division time of, for example, a human cell. Origins are spaced 10 to 40 kbp apart along each chromosome, and multiple replication forks eventually meet to yield the two daughter chromosomes. “Firing” (activation) of an origin is under tight control, as is reinitiation at an origin that has already been duplicated.
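
A rough calculation shows why a single origin would not suffice. The sketch below uses assumed values that do not come from this section: a fork rate of 50 bp per second and a chromosome of 100 million bp are placeholders chosen only to illustrate the scale. The 40 kbp spacing is the upper figure quoted above.

```python
def replication_time_hours(span_bp, fork_rate_bp_per_s, forks=2):
    """Hours needed for `forks` replication forks to copy span_bp of DNA."""
    return span_bp / (forks * fork_rate_bp_per_s) / 3600


FORK_RATE = 50             # bp per second; assumed for illustration
CHROMOSOME = 100_000_000   # bp; assumed chromosome length

# One origin: two forks must traverse the whole chromosome.
single_origin = replication_time_hours(CHROMOSOME, FORK_RATE)

# Many origins: two converging forks copy each 40 kbp interval.
between_origins = replication_time_hours(40_000, FORK_RATE)
```

With these assumed values, a single origin would require more than 11 days to copy the chromosome, far longer than a 24-hour division time, whereas the interval between two neighboring origins is copied in minutes.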

Eukaryotes have defined cell cycle phases, with chromosome replication occurring in S phase and separation of the duplicated chromosomes in M phase. (For an overview of the cell cycle, see Chapter 2.) A protein complex essential to replication is assembled on chromosome origins even before S phase. The assembly process occurs late in G1 phase and marks origins that will be used for replication during S phase. This separation of events in the cell cycle is critical to the exquisite coordination that eukaryotes require to duplicate their long, linear chromosomes.

The simple eukaryote S. cerevisiae has well-defined replication origins referred to as ARS (autonomously replicating sequences), which are 100 to 200 bp long and contain four common components: a highly conserved A sequence and the B1, B2, and B3 elements (Figure 11-32, top). Two-dimensional gel electrophoresis can be used to identify DNA segments that contain an origin (Highlight 11-1). The identification of discrete origins in yeast has made it a convenient model organism for studying origin function in eukaryotes. Furthermore, homologs of the yeast replication proteins exist in all eukaryotes, indicating that the lessons we learn from yeast will probably generalize to more complex eukaryotic organisms. However, unlike yeast, other eukaryotes have origins that lack easily identifiable sequence motifs.

Figure 11-32: Assembly of eukaryotic replication forks. The generalized structure of an origin in S. cerevisiae is shown at the top. The prereplication complex (preRC) assembles in G1 phase (middle). The initiator, ORC, binds to the conserved A element and the B1 element. MCM helicases are loaded onto the DNA by Cdc6 and Cdt1. After the cell progresses to S phase (bottom), the origin forms replication forks, as cyclin-dependent protein kinases (CDK and DDK) facilitate assembly of other proteins to form the replication complex (RC) from which replication fork movement commences.

The eukaryotic initiator is a heterohexamer called the origin recognition complex (ORC). Several subunits of ORC are, like the E. coli DnaA initiator protein, AAA+ proteins. ATP is required for ORC binding to the origin (see Figure 11-32, middle). After ORC binds to the DNA, the Cdc6 protein (also an AAA+ protein) binds to ORC. The ORC-Cdc6 complex then loads the Mcm2–7 complex onto DNA. The Mcm2–7 complex is a circular heterohexamer that binds one molecule of Cdt1, which is required before the ORC-Cdc6 complex can load the Mcm2–7 complex onto the DNA. Two Mcm2–7 complexes are loaded such that they encircle duplex DNA adjacent to the ORC-Cdc6 complex. These events occur only in G1 phase, and the resulting complex of ORC, Cdc6, Cdt1, and Mcm2–7 is referred to as the prereplication complex (preRC). The Cdc, Cdt, and Mcm names derive from genetic experiments that defined DNA metabolic pathways; the functions of these proteins were determined later, and in many cases the exact functions are still being worked out.

Cyclin-dependent protein kinases (also called cyclin kinases or cell cycle kinases) that phosphorylate certain target proteins are central to the separation of cell cycle phases. In G1 phase there is very low kinase activity, and proteins are generally not phosphorylated. On entering S phase, S. cerevisiae S-phase cyclin kinases phosphorylate some of the preRC proteins. Mcm4 and Mcm6 are targets of the cyclin kinase DDK. Phosphorylation by DDK enables Mcm2–7 complexes to dissociate from one another and move in opposite directions. The cyclin kinase CDK inactivates ORC by phosphorylating one of its subunits. In addition to inactivating ORC, these events lead to degradation of Cdc6 and Cdt1, and all these events prevent further preRC assembly until the cell has divided and reentered G1.

Several other replication factors associate with Mcm2–7 at origins early in S phase. Among these are Sld3 and Cdc45, which bind to the double hexameric Mcm2–7 complexes. Also, a preloading complex (pre-LC) forms, consisting of Pol ε, Sld2, Dpb11, and possibly other proteins. CDK phosphorylates Sld2 and Sld3, and the complex of Mcm2–7 and Cdc45 combines with GINS and Pol ε to form the CMG helicase at a replication fork, ejecting Dpb11, Sld2, and Sld3; full assembly of the replication forks can then proceed with association of Mcm10, Ctf4, Pol α, and Pol δ (see Figure 11-32, bottom). Phosphorylation by S-phase cyclin kinases is necessary for replication fork assembly and confines the initiation of replication to S phase. Only after the S, G2, and M phases are complete, and cell division is accomplished, is S-phase cyclin kinase activity reduced and Cdc6 and Cdt1 made available to direct the assembly of the preRC on the chromosomes of new cells in G1 phase.
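
The logic that restricts each origin to one firing per cell cycle can be captured in a small state model. The sketch below is a deliberate simplification: it reduces the preRC to a single "licensed" flag and kinase activity to the name of the cell cycle phase. The class and function names are illustrative, not established terms.

```python
class Origin:
    """Toy model of one eukaryotic origin (names are illustrative)."""

    def __init__(self):
        self.licensed = False   # True when a preRC is assembled on the origin
        self.fired = 0          # number of times this origin has initiated

    def step(self, phase):
        if phase == "G1":
            # Low kinase activity: ORC, Cdc6, Cdt1, and Mcm2-7 can assemble.
            self.licensed = True
        elif phase == "S" and self.licensed:
            # S-phase kinases activate the helicases and block preRC reassembly.
            self.fired += 1
            self.licensed = False
        # In S, G2, or M an unlicensed origin can neither reload nor fire.


def run_cycles(n):
    """Run n cell cycles and return how many times the origin fired."""
    origin = Origin()
    for _ in range(n):
        # S is visited twice per cycle to test for re-firing.
        for phase in ("G1", "S", "S", "G2", "M"):
            origin.step(phase)
    return origin.fired
```

Even though the model visits S phase twice in each cycle, the origin fires once per cycle, because licensing is possible only in G1 and is consumed by firing.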

SECTION 11.4 SUMMARY

  • The assembly of bacterial replication forks at the origin occurs in steps, starting with the binding of DnaA initiator protein, which melts an A=T-rich region. A DnaB helicase is then loaded onto each of the single strands of DNA by the DnaC helicase loader. As DNA is unwound by DnaB, DnaG primase synthesizes RNA primers; this is followed by entry of two Pol III holoenzymes to form a bidirectional replication fork.

  • Origin activation in bacteria is regulated at the initiation step by various means, including DNA methylation that results in SeqA sequestering the origin, ATP turnover by DnaA, and the activity of Hda protein, which signals DnaA to hydrolyze ATP after forks are formed.

  • Eukaryotes have many replication origins, and tight initiation control is achieved by dividing the activation of origins into different cell cycle phases. Some proteins can bind the origin only in G1 phase, when cyclin kinase activity is low (preRC). Further assembly of proteins to form replication forks occurs only in S phase and is associated with phosphorylation by S-phase cyclin kinases.

HIGHLIGHT 11-1 TECHNOLOGY: Two-Dimensional Gel Analysis of Replication Origins

Origins in the process of replication generate DNA molecules that contain bubbles and replication forks. These unusually shaped DNAs produce characteristic patterns in two-dimensional agarose gels (see Chapter 8). In this technique, the section of DNA to be examined for a replication origin is cut on either side of the origin with a restriction enzyme. In the first dimension of the gel, molecules are sorted mainly by size by low-voltage electrophoresis through a low-percentage agarose gel. An unreplicated 1 kbp fragment of DNA will travel farther through the gel than a replicated (2 kbp) fragment (Figure 1a). The second dimension is run at higher voltage to sort molecules mainly by shape. DNA fragments containing replication forks are less streamlined and will travel more slowly through the gel than unreplicated or completely replicated fragments. This two-dimensional assortment by size and shape of the same piece of DNA undergoing replication generates arc patterns. The DNA is analyzed by Southern blotting, in which DNA in the gel is transferred to nitrocellulose and probed with a radioactive DNA fragment that hybridizes to the region of interest (see Chapter 6).

FIGURE 1 Replication origins can be identified based on their mobility in a two-dimensional agarose gel. (a) The steps involved in two-dimensional gel electrophoresis, showing how an arc is generated. (b) The expected patterns from analysis of linear chromosomal DNA with more than one origin—the typical case.

Figure 1b shows DNA structures that result from replication initiating at two different origins on one section of DNA. Vertical dashed lines represent restriction sites. After digestion by restriction enzymes, three different restriction fragments are produced, RF1, RF2, and RF3. The three panels below the DNA structures represent the results of two-dimensional gel analysis of the cut DNA using a radioactive DNA probe—probe 1, 2, or 3. These probes hybridize specifically to RF1, RF2, or RF3. The lower left panel shows the Y arc pattern, using probe 1 and RF1. This pattern is produced by DNA that contains no origin of its own. Replication forks that enter the DNA produce Y-shaped DNAs that form differently sized and shaped fragments, depending on how far the fork travels into the fragment (fragments d, e, f, and g are produced in succession as the fork proceeds into the fragment). The top of the arc results from DNA containing three arms of equal length (fragment f). The middle panel shows the results when probe 2 is used, which hybridizes to RF2; this produces the bubble-to-Y arc pattern. This pattern is most indicative of an origin within the restriction fragment and occurs when the origin is located to one side of the center point of the fragment. Restriction fragments that contain bubbles (fragments a, b, and c) produce an arc that “breaks” when the bubble reaches the end of the fragment, to produce a Y-form DNA (fragments d and e are produced at this point). The right panel shows the bubble arc pattern generated with probe 3 and RF3. The origin in the center produces bubbles of increasing size. Due to the central location of the origin in this restriction fragment, the bubble does not produce a Y form on either end; therefore, the arc has no discontinuity and is only slightly different from the Y arc pattern.
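
The relationship between origin position and arc pattern can be expressed as a simple decision rule. The sketch below is an idealization: it assumes that both forks move at the same rate and treats only an exactly centered origin as producing a pure bubble arc. The function names are illustrative.

```python
def arc_pattern(fragment_bp, origin_bp=None):
    """Predict the two-dimensional gel arc for a restriction fragment.

    origin_bp is the origin's distance from the left end of the fragment,
    or None if the fragment contains no origin. Assumes equal fork rates.
    """
    if origin_bp is None:
        # Forks enter from outside, giving Y-shaped intermediates.
        return "Y arc"
    if not 0 < origin_bp < fragment_bp:
        raise ValueError("origin must lie inside the fragment")
    if 2 * origin_bp == fragment_bp:
        # Both forks reach the ends together; no Y form is produced.
        return "bubble arc"
    # One fork reaches an end first, converting the bubble to a Y.
    return "bubble-to-Y arc"


def relative_mass(fraction_replicated):
    """Mass of a replicating fragment relative to the unreplicated fragment."""
    if not 0.0 <= fraction_replicated <= 1.0:
        raise ValueError("fraction must be between 0 and 1")
    return 1.0 + fraction_replicated
```

For a 1 kbp fragment, `arc_pattern(1000)` gives the Y arc, `arc_pattern(1000, 300)` gives the bubble-to-Y arc, and `arc_pattern(1000, 500)` gives the bubble arc. `relative_mass` reflects the first-dimension separation, in which a fully replicated fragment has twice the mass of the unreplicated one.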
