13.3 Bacterial Transcription Consists of Initiation, Elongation, and Termination

Now that we’ve considered some of the major components of transcription, we’re ready to take a detailed look at the process. Transcription can be conveniently divided into three stages: initiation, elongation, and termination.

We will first examine each of these steps in bacterial cells, where the process is best understood; then we will consider eukaryotic and archaeal transcription.

Initiation

Initiation comprises all the steps necessary to begin RNA synthesis, including (1) promoter recognition, (2) formation of the transcription bubble, (3) creation of the first bonds between rNTPs, and (4) escape of the transcription apparatus from the promoter.

Transcription initiation requires that the transcription apparatus recognize and bind to the promoter. At this step, the selectivity of transcription is enforced; the binding of RNA polymerase to the promoter determines which parts of the DNA template are to be transcribed and how often. Different genes are transcribed with different frequencies, and promoter binding is primarily responsible for determining the frequency of transcription for a particular gene. Promoters also have different affinities for RNA polymerase. Even within a single promoter, the affinity can vary with the passage of time, depending on the promoter’s interaction with RNA polymerase and a number of other factors.

Bacterial Promoters

Essential information for the transcription unit (where it will start transcribing, which strand is to be read, and in what direction the RNA polymerase will move) is embedded in the nucleotide sequence of the promoter. Promoters are DNA sequences that are recognized by the transcription apparatus and are required for transcription to take place. In bacterial cells, promoters are usually adjacent to an RNA-coding sequence.

An examination of many promoters in E. coli and other bacteria reveals a general feature: although most of the nucleotides within the promoters vary in sequence, short stretches of nucleotides are common to many. Furthermore, the spacing and location of these nucleotides relative to the transcription start site are similar in most promoters. These short stretches of common nucleotides are called consensus sequences; “consensus sequence” refers to sequences that possess considerable similarity, or consensus (Figure 13.10). The presence of consensus in a set of nucleotides usually implies that the sequence is associated with an important function. TRY PROBLEM 21

Figure 13.10: A consensus sequence consists of the most commonly encountered bases at each position in a group of related sequences.
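The idea in Figure 13.10 can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool: the function name `consensus` and the six example sequences are invented for this sketch, and the sequences are assumed to be already aligned and of equal length.

```python
from collections import Counter


def consensus(aligned_sequences):
    """Return the most commonly encountered base at each aligned position."""
    if not aligned_sequences:
        raise ValueError("need at least one sequence")
    length = len(aligned_sequences[0])
    if any(len(seq) != length for seq in aligned_sequences):
        raise ValueError("sequences must be aligned to the same length")
    result = []
    # zip(*...) walks through the alignment one column (position) at a time.
    for column in zip(*aligned_sequences):
        # Ties go to the base seen first in the column, an arbitrary choice.
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Hypothetical -10 regions, invented for illustration only.
example_sites = ["TATAAT", "TATGAT", "TACAAT", "GATAAT", "TATAAA", "TATACT"]
consensus_seq = consensus(example_sites)
print(consensus_seq)
```

In this made-up set, only one of the six sequences matches the consensus at every position, which illustrates why a consensus sequence is a summary of a group rather than the sequence of any particular promoter.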

The most commonly encountered consensus sequence, found in almost all bacterial promoters, is centered about 10 bp upstream of the start site. Called the −10 consensus sequence or, sometimes, the Pribnow box, its consensus sequence is

5′–TATAAT–3′
3′–ATATTA–5′

and is often written simply as TATAAT (Figure 13.11). Remember that TATAAT is just the consensus sequence, representing the most commonly encountered nucleotides at each of these positions. In most prokaryotic promoters, the actual sequence is not TATAAT.

Figure 13.11: In bacterial promoters, consensus sequences are found upstream of the start site, approximately at positions −10 and −35.

Another consensus sequence common to most bacterial promoters is TTGACA, which lies approximately 35 nucleotides upstream of the start site and is termed the −35 consensus sequence (see Figure 13.11). The nucleotides on either side of the −10 and −35 consensus sequences and those between them vary greatly from promoter to promoter, suggesting that these nucleotides are not very important in promoter recognition.

The function of these consensus sequences in bacterial promoters has been studied by inducing mutations at various positions within the consensus sequences and observing the effect of the changes on transcription. The results of these studies reveal that most base substitutions within the −10 and −35 consensus sequences reduce the rate of transcription; these substitutions are termed down mutations because they slow down the rate of transcription. Occasionally, a particular change in a consensus sequence increases the rate of transcription; such a change is called an up mutation.
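One rough way to think about such substitutions is to count how many positions of a promoter element differ from the consensus before and after the change. The sketch below does only that. The function names and the example sites are hypothetical, and the mismatch count is an assumption-laden proxy: it says nothing certain about the measured rate of transcription, which must be determined experimentally.

```python
MINUS_35_CONSENSUS = "TTGACA"
MINUS_10_CONSENSUS = "TATAAT"


def mismatches(site, consensus):
    """Count the positions at which a site differs from the consensus."""
    if len(site) != len(consensus):
        raise ValueError("site and consensus must have the same length")
    return sum(1 for a, b in zip(site.upper(), consensus) if a != b)


def classify_substitution(before, after, consensus):
    """Say whether a change moves a site toward or away from the consensus."""
    delta = mismatches(after, consensus) - mismatches(before, consensus)
    if delta > 0:
        return "away from consensus"
    if delta < 0:
        return "toward consensus"
    return "no change in match"


# Hypothetical sites, invented for illustration only.
print(classify_substitution("TATAAT", "TCTAAT", MINUS_10_CONSENSUS))
print(classify_substitution("TTGATA", "TTGACA", MINUS_35_CONSENSUS))
```

Most random substitutions in a well-matched site move it away from the consensus, which is consistent with the observation that down mutations are far more common than up mutations.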

As mentioned earlier, the sigma factor associates with the core enzyme (Figure 13.12a) to form a holoenzyme, which binds to the −35 and −10 consensus sequences in the DNA promoter (Figure 13.12b). Although it binds only the nucleotides of consensus sequences, the enzyme extends from −50 to +20 when bound to the promoter. The holoenzyme initially binds weakly to the promoter but then undergoes a change in structure that allows it to bind more tightly and unwind the double-stranded DNA (Figure 13.12c). Unwinding begins within the −10 consensus sequence and extends downstream for about 14 nucleotides, including the start site (from nucleotides −12 to +2).

Figure 13.12: Transcription in bacteria is carried out by RNA polymerase, which must bind to the sigma factor to initiate transcription.

Some bacterial promoters contain a third consensus sequence that also takes part in the initiation of transcription. Called the upstream element, this sequence contains a number of A–T pairs and is found at about −40 to −60. A number of proteins may bind to sequences in and near the promoter; some stimulate the rate of transcription and others repress it. We will consider these proteins, which regulate gene expression, in Chapter 16. TRY PROBLEM 24

CONCEPTS

A promoter is a DNA sequence that is adjacent to a gene and required for transcription. Promoters contain short consensus sequences that are important in the initiation of transcription.

CONCEPT CHECK 5

What binds to the −10 consensus sequence found in most bacterial promoters?

  1. The holoenzyme (core enzyme + sigma)
  2. The sigma factor alone
  3. The core enzyme alone
  4. mRNA

Initial RNA Synthesis

After the holoenzyme has attached to the promoter, RNA polymerase is positioned over the start site for transcription (at position +1) and has unwound the DNA to produce a single-stranded template. The orientation and spacing of consensus sequences on a DNA strand determine which strand will be the template for transcription and thereby determine the direction of transcription.

The position of the start site is determined not by the sequences located there but by the location of the consensus sequences, which positions RNA polymerase so that the enzyme’s active site is aligned for the initiation of transcription at +1. If the consensus sequences are artificially moved upstream or downstream, the location of the starting point of transcription correspondingly changes.
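The dependence of the start site on the position of the consensus sequences can be illustrated with a toy calculation. The sketch below makes two loud assumptions that are not stated in the text: that the −10 element is an exact TATAAT, and that it occupies positions −12 to −7, so that +1 lies 12 nucleotides downstream of the first base of the element. The function name and the example DNA are hypothetical.

```python
MINUS_10_CONSENSUS = "TATAAT"
# Assumption (not from the text): the hexamer occupies positions -12 to -7,
# so +1 lies 12 nucleotides downstream of the hexamer's first base.
OFFSET_TO_START = 12


def predicted_start_index(dna, offset=OFFSET_TO_START):
    """Return the 0-based index of the predicted +1 nucleotide, or None."""
    hexamer_index = dna.upper().find(MINUS_10_CONSENSUS)
    if hexamer_index == -1:
        return None  # no exact match to the consensus in this sequence
    start_index = hexamer_index + offset
    return start_index if start_index < len(dna) else None


# Hypothetical promoter region: the single A marks the expected +1 position.
promoter = "GG" + "TATAAT" + "CCCCCC" + "A" + "GGG"
# Moving the consensus sequence 3 bp downstream moves the predicted start too.
shifted = "CCC" + promoter

print(predicted_start_index(promoter), predicted_start_index(shifted))
```

The predicted start moves by exactly the distance the consensus sequence was moved, while the nucleotides at the start site itself play no part in the calculation.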

To begin the synthesis of an RNA molecule, RNA polymerase pairs the base on a ribonucleoside triphosphate with its complementary base at the start site on the DNA template strand (Figure 13.12d). No primer is required to initiate the synthesis of the 5′ end of the RNA molecule. Two of the three phosphate groups are cleaved from the ribonucleoside triphosphate as the nucleotide is added to the 3′ end of the growing RNA molecule. However, because the 5′ end of the first ribonucleoside triphosphate does not take part in the formation of a phosphodiester bond, all three of its phosphate groups remain. An RNA molecule therefore possesses, at least initially, three phosphate groups at its 5′ end (Figure 13.12e).
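The base-pairing rule used in RNA synthesis can be written out as a short sketch. The function names and the example template are hypothetical. The template strand is assumed to be written 5′→3′, so it is reversed before pairing because RNA polymerase reads the template 3′→5′ while building the RNA 5′→3′.

```python
# Each DNA template base pairs with one ribonucleotide; A pairs with U in RNA.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_5_to_3):
    """Return the RNA (written 5'->3') made from a DNA template strand."""
    template = template_5_to_3.upper()
    unknown = set(template) - set(DNA_TO_RNA)
    if unknown:
        raise ValueError(f"unexpected bases: {sorted(unknown)}")
    # The template is read 3'->5', so walk through it in reverse.
    return "".join(DNA_TO_RNA[base] for base in reversed(template))


def with_end_labels(rna):
    """Show the 5' triphosphate kept by the first nucleotide and the 3' OH."""
    return f"5'-ppp-{rna}-OH-3'"


rna = transcribe("TTTACG")  # hypothetical template strand, written 5'->3'
print(with_end_labels(rna))
```

The label on the 5′ end reflects the point made above: the first ribonucleoside triphosphate keeps all three phosphate groups, whereas every later nucleotide loses two as it is joined to the 3′ end.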

Often, in the course of initiation, RNA polymerase repeatedly generates and releases short transcripts, from 2 to 6 nucleotides in length, while still bound to the promoter. This process, termed abortive initiation, occurs in both prokaryotes and eukaryotes. After several abortive attempts, the polymerase synthesizes an RNA molecule from 9 to 12 nucleotides in length, which allows the RNA polymerase to transition to the elongation stage.

Elongation

At the end of initiation, RNA polymerase undergoes a change in conformation (shape) and thereafter is no longer able to bind to the consensus sequences in the promoter. This change allows the polymerase to escape from the promoter and begin transcribing downstream. The sigma subunit is usually released after initiation, although some populations of RNA polymerase may retain sigma throughout elongation.

As it moves downstream along the template, RNA polymerase progressively unwinds the DNA at the leading (downstream) edge of the transcription bubble, joining nucleotides to the RNA molecule according to the sequence on the template, and rewinds the DNA at the trailing (upstream) edge of the bubble. In bacterial cells at 37°C, about 40 nucleotides are added per second. This rate of RNA synthesis is much lower than that of DNA synthesis, which is 1000 to 2000 nucleotides per second in bacterial cells.
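A quick calculation makes the difference between these rates concrete. The rates come from the paragraph above; the gene length of 1200 nucleotides is a hypothetical value chosen only to keep the arithmetic simple, and the calculation assumes a constant rate with no pauses.

```python
TRANSCRIPTION_RATE_NT_PER_S = 40              # bacterial cells at 37 degrees C
REPLICATION_RATE_NT_PER_S = (1000, 2000)      # low and high ends of the range


def seconds_to_synthesize(length_nt, rate_nt_per_s):
    """Time needed to synthesize a strand at a constant rate, with no pauses."""
    return length_nt / rate_nt_per_s


gene_length_nt = 1200  # hypothetical gene length, for illustration only

transcription_time = seconds_to_synthesize(gene_length_nt, TRANSCRIPTION_RATE_NT_PER_S)
replication_times = tuple(
    seconds_to_synthesize(gene_length_nt, rate) for rate in REPLICATION_RATE_NT_PER_S
)
# How many times faster DNA synthesis is than RNA synthesis.
fold_faster = tuple(rate / TRANSCRIPTION_RATE_NT_PER_S for rate in REPLICATION_RATE_NT_PER_S)

print(transcription_time, replication_times, fold_faster)
```

Under these assumptions, transcribing the hypothetical gene takes 30 seconds, whereas copying the same stretch of DNA takes about a second, a 25-fold to 50-fold difference.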

The Transcription Bubble

Transcription takes place within a short stretch of about 18 nucleotides of unwound DNA, the transcription bubble. Within this region, RNA is continuously synthesized, with single-stranded DNA used as a template. About 8 nucleotides of newly synthesized RNA are paired with the DNA-template nucleotides at any one time. As the transcription apparatus moves down the DNA template, it generates positive supercoiling ahead of the transcription bubble and negative supercoiling behind it. Topoisomerase enzymes probably relieve the stress associated with the unwinding and rewinding of DNA in transcription, as they do in DNA replication.

Transcriptional Pausing

A number of features of RNA or DNA, such as secondary structures, specific sequences, or the presence of nucleosomes, cause RNA polymerase to pause during the elongation stage of transcription. Pauses often are caused by backtracking, in which the RNA polymerase slides backward along the DNA template strand. Backtracking disengages the 3′-OH group of the RNA molecule from the active site of RNA polymerase and temporarily halts further RNA synthesis. Cells use several mechanisms to minimize backtracking, including proteins that cleave the backtracked RNA in the active site, generating a new 3′-OH to which new nucleotides can then be added. In bacterial cells, translation of mRNA by ribosomes closely follows transcription (see Chapter 15), and the presence of ribosomes moving along the mRNA in a 5′→3′ direction also prevents backtracking of the RNA polymerase at the 3′ end of the mRNA.

Transitory pauses in transcription are important in the coordination of transcription and translation in bacteria, as well as in the coordination of RNA processing in eukaryotes. Pausing also affects the rates of RNA synthesis. Sometimes a pause may be stabilized by sequences in the DNA that ultimately lead to the termination of transcription (see the next section on termination).

Accuracy of Transcription

Although RNA polymerase is quite accurate in incorporating nucleotides into the growing RNA chain, errors do occasionally arise. Research has demonstrated that RNA polymerase is capable of a type of proofreading in the course of transcription. When RNA polymerase incorporates a nucleotide that does not match the DNA template, it backs up and cleaves the last two nucleotides (including the misincorporated nucleotide) from the growing RNA chain. RNA polymerase then proceeds forward, transcribing the DNA template again.

CONCEPTS

Transcription is initiated at the start site, which, in bacterial cells, is set by the binding of RNA polymerase to the consensus sequences of the promoter. No primer is required. Transcription takes place within the transcription bubble. DNA is unwound ahead of the bubble and rewound behind it. There are frequent pauses in the process of transcription.

Termination

RNA polymerase adds nucleotides to the 3′ end of the growing RNA molecule until it transcribes a terminator. Most terminators are found upstream of the site at which termination actually takes place. Transcription therefore does not suddenly stop when polymerase reaches a terminator, as does a car stopping at a stop sign. Rather, transcription stops after the terminator has been transcribed, like a car that stops only after running over a speed bump. At the terminator, several overlapping events are needed to bring an end to transcription: RNA polymerase must stop synthesizing RNA, the RNA molecule must be released from RNA polymerase, the newly made RNA molecule must dissociate fully from the DNA, and RNA polymerase must detach from the DNA template.

Bacterial cells possess two major types of terminators. Rho-dependent terminators are able to cause the termination of transcription only in the presence of an ancillary protein called the rho factor. Rho-independent terminators (also known as intrinsic terminators) are able to cause the end of transcription in the absence of rho.

Rho-Dependent Terminators

Rho-dependent terminators have two features. The first is the terminator itself, which consists of DNA sequences that cause the RNA polymerase to pause. The second feature is a DNA sequence that encodes a stretch of RNA upstream of the terminator that is usually rich in cytosine nucleotides and devoid of any secondary structures. This sequence is called the rho utilization (rut) site; it serves as a binding site for the rho protein. Once rho binds to the RNA, it moves toward its 3′ end, following the RNA polymerase (Figure 13.13). When RNA polymerase encounters the terminator, it pauses, allowing rho to catch up. The rho protein has helicase activity, which it uses to unwind the RNA–DNA hybrid in the transcription bubble, bringing transcription to an end.

Figure 13.13: The termination of transcription in some bacterial genes requires the presence of the rho protein.

Rho-Independent Terminators

Rho-independent terminators, which make up about 50% of all terminators in prokaryotes, have two common features. First, they contain inverted repeats, which are sequences of nucleotides on one strand that are inverted and complementary. When inverted repeats have been transcribed into RNA, a hairpin secondary structure forms (Figure 13.14). Second, in rho-independent terminators, a string of seven to nine adenine nucleotides follows the second inverted repeat in the template DNA. Their transcription produces a string of uracil nucleotides after the hairpin in the transcribed RNA.

Figure 13.14: Rho-independent termination in bacteria is a multistep process.

The string of uracils in the RNA molecule causes the RNA polymerase to pause, allowing time for the hairpin structure to form. Evidence suggests that the formation of the hairpin destabilizes the DNA–RNA pairing, causing the RNA molecule to separate from its DNA template. Separation may be facilitated by the adenine–uracil base pairings, which are relatively weak compared with other types of base pairings. When the RNA transcript has separated from the template, RNA synthesis can no longer continue (see Figure 13.14). TRY PROBLEM 29
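The two features of a rho-independent terminator, an inverted repeat followed by a string of uracils, can be searched for in an RNA sequence with a simple sketch. Everything named here is hypothetical, and several simplifying assumptions are made that are not from the text: the stem is taken to be exactly 6 nucleotides with perfect A–U and G–C pairing, the loop is allowed to be 3 to 8 nucleotides, and at least 7 uracils must follow the stem immediately. Real hairpins vary in length and tolerate imperfect pairing, so this is an illustration of the logic rather than a terminator predictor.

```python
RNA_PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}


def reverse_complement_rna(seq):
    """Return the RNA sequence that would pair with seq in a hairpin stem."""
    return "".join(RNA_PAIR[base] for base in reversed(seq))


def find_hairpin_then_u_run(rna, stem_len=6, min_loop=3, max_loop=8, min_u=7):
    """Return (stem_start, u_run_start) for the first match, or None.

    All default sizes are illustrative assumptions, not measured values.
    """
    rna = rna.upper()
    shortest_motif = 2 * stem_len + min_loop + min_u
    for start in range(len(rna) - shortest_motif + 1):
        left = rna[start:start + stem_len]
        if any(base not in RNA_PAIR for base in left):
            continue
        right_needed = reverse_complement_rna(left)
        for loop in range(min_loop, max_loop + 1):
            right_start = start + stem_len + loop
            right_end = right_start + stem_len
            if right_end + min_u > len(rna):
                break  # no room left for the stem plus the string of uracils
            if rna[right_start:right_end] != right_needed:
                continue
            if rna[right_end:right_end + min_u] == "U" * min_u:
                return (start, right_end)
    return None


# Hypothetical transcript: leader, stem, loop, complementary stem, 8 uracils.
terminator_rna = "AUGACU" + "GCCGCC" + "UUAA" + "GGCGGC" + "UUUUUUUU"
no_u_run_rna = "AUGACU" + "GCCGCC" + "UUAA" + "GGCGGC" + "ACGUACGU"

print(find_hairpin_then_u_run(terminator_rna))
print(find_hairpin_then_u_run(no_u_run_rna))
```

The second transcript contains the same inverted repeat but lacks the string of uracils, so the search reports nothing, mirroring the requirement that both features be present.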

Polycistronic mRNA

In bacteria, a group of genes is often transcribed into a single RNA molecule, which is termed a polycistronic mRNA. Thus, polycistronic mRNA is produced when a single terminator is present at the end of a group of several genes that are transcribed together, instead of each gene having its own terminator. Polycistronic mRNA does occur in some eukaryotes such as Caenorhabditis elegans, but it is uncommon. You can view the process of transcription, including initiation, elongation, and termination, in Animation 13.1. The animation shows how the different parts of the transcriptional unit interact to bring about the complete synthesis of an RNA molecule.

CONCEPTS

Transcription ends after RNA polymerase transcribes a terminator. Bacterial cells possess two types of terminator: a rho-independent terminator, which RNA polymerase can recognize by itself; and a rho-dependent terminator, which RNA polymerase can recognize only with the help of the rho protein.

CONCEPT CHECK 6

What characteristics are most commonly found in rho-independent terminators?

CONNECTING CONCEPTS: The Basic Rules of Transcription

Before we examine the process of eukaryotic transcription, let’s summarize some of the general principles of bacterial transcription.

  1. Transcription is a selective process; only certain parts of the DNA are transcribed at any one time.
  2. RNA is transcribed from single-stranded DNA. Within a gene, only one of the two DNA strands, the template strand, is usually copied into RNA.
  3. Ribonucleoside triphosphates are used as the substrates in RNA synthesis. Two phosphate groups are cleaved from a ribonucleoside triphosphate, and the resulting nucleotide is joined to the 3′-OH group of the growing RNA strand.
  4. RNA molecules are antiparallel and complementary to the DNA template strand. Transcription is always in the 5′→3′ direction, meaning that the RNA molecule grows at the 3′ end.
  5. Transcription depends on RNA polymerase, a complex, multimeric enzyme. RNA polymerase consists of a core enzyme, which is capable of synthesizing RNA, and other subunits that may join transiently to perform additional functions.
  6. A sigma factor enables the core enzyme of RNA polymerase to bind to a promoter and initiate transcription.
  7. Promoters contain short sequences crucial in the binding of RNA polymerase to DNA; these consensus sequences are interspersed with nucleotides that play no known role in transcription.
  8. RNA polymerase binds to DNA at a promoter, begins transcribing at the start site of the gene, and ends transcription after a terminator has been transcribed.
  9. Topoisomerase enzymes remove supercoiling that develops ahead of and behind the transcription bubble as the DNA is unwound and rewound during transcription.