15.3 TRANSCRIPTION IN EUKARYOTES

In eukaryotic cells, three distinct RNA polymerases (Pol I, II, and III) carry out DNA-dependent synthesis of RNA. Although the properties of these polymerases resemble those of bacterial RNA polymerase in many ways, the eukaryotic polymerases require many additional proteins, called transcription factors, to begin efficient transcription at promoter sequences. These factors help assemble transcription complexes on chromatin, the compacted form of DNA that makes up eukaryotic genomes. Like bacterial sigma factors, each eukaryotic transcription factor binds to a specific promoter sequence and to a particular RNA polymerase, bridging the two to initiate transcription. Using a variety of general transcription factors, eukaryotic cells promote the transcription of many sets of genes under varying conditions (Figure 15-18). Specific transcription factors bind DNA at a long distance upstream from the promoter, at sequences known as enhancers, and can stimulate or repress transcription in various ways. Transcription factors, both general and specific, have important roles in gene regulation and cell development. Indeed, recent studies show that differentiated cells, once believed to be committed to a particular cell type, can be converted to another cell type simply by manipulating the expression of transcription factors (Highlight 15-2). Transcription regulation proteins, including transcription factors, and the DNA sites to which they bind, including enhancers, are discussed in detail in Chapters 19–22.

Figure 15-18: Multiple levels of control of gene expression. Cells respond to external signals (cell-extrinsic), then to intracellular signaling pathways (cell-intrinsic), leading to effects on the transcription of specific genes (allele-intrinsic).

HIGHLIGHT 15-2 MEDICINE: Using Transcription Factors to Reprogram Cells

Patterns of gene transcription largely control how cells develop into specific cell types. This process is of great importance in medicine, because the possibility of reprogramming cells to carry out specific functions could revolutionize the treatment of patients with degenerative diseases. Experimental attempts to reprogram cells began several decades ago with the discovery that engineering of an oocyte (egg cell) to contain the nucleus of an adult cell can cause the nucleus to revert to an undifferentiated state. This process, called somatic cell nuclear transfer (SCNT), can produce an embryo and embryonic stem cells with the genetic makeup of an adult cell. Presumably, these results come about through the reprogramming of transcription in the composite cells.

This idea was tested and validated in 2006, when researchers found that fibroblasts can be induced to undergo a dramatic cell-fate reversal to an undifferentiated state, becoming what are known as induced pluripotent stem cells (iPS cells), by transiently expressing four master-regulatory transcription factors in the fibroblasts. The next step was to see whether fibroblasts might be more generally susceptible to reprogramming into different kinds of cells—if the right set of transcription factors could be identified.

To test this possibility, Marius Wernig and his colleagues at Stanford University set out to convert mouse fibroblasts into neurons. Reasoning that multiple transcription factors were probably necessary to reprogram fibroblasts to a neuronal fate, the researchers cloned 19 genes encoding transcription factors that are expressed specifically in neural tissues or that function during neural development. The genes were cloned into lentiviruses, viral vectors that could be used to introduce the genes into mouse fibroblasts by infection. To detect changes in cell fate, the researchers used fibroblasts derived from mouse embryos and tail tips of newborn or adult mice that had been genetically altered to express a green fluorescent protein marker when the gene for the protein Tau was turned on. Because the Tau gene is specifically expressed in neurons, cells that had acquired at least this property of neurons (transcription of the Tau gene) could be easily identified.

When all 19 of the candidate transcription factors were introduced into the fibroblasts, some of the cells turned green. By a process of elimination, the researchers eventually found that a combination of only three transcription factors was sufficient to convert fibroblasts into neurons (Figure 1). These factors—Ascl1, Brn2, and Myt1l (or Zic)—caused cells to express a variety of neuronal markers and become capable of firing action potentials, a basic function of neurons. Furthermore, when cultured together with bona fide mouse neuronal cells, the reprogrammed cells received both excitatory and inhibitory synaptic connections from the mouse neurons, and could form synapses with each other.

FIGURE 1 An embryonic stem (ES) cell has the potential to develop into various cell types (solid black arrows). Activation of specific transcription factors can convert one differentiated cell type to another (red arrows) or even to an undifferentiated state (dashed arrow). Mesoderm, endoderm, and ectoderm are the three embryonic layers from which all cells and tissues develop; astroglia are non-neuronal cells of nerve tissue.

Beyond its implications for understanding transcriptional activation and regulation, this discovery offers the intriguing possibility of creating cell types at will. If such transcription-based reprogramming proves feasible in human cells, it could be used to generate neurons that mimic particular disease states for use in drug development. Researchers are now eager to find out how many different cell types can be produced by activating distinct combinations of lineage-specific transcription factors.

Because transcription is a fundamental process in all cells, it is not surprising that some eukaryotic RNA polymerase subunits are homologous to those of bacterial polymerase. Furthermore, some subunits are common to all three of the eukaryotic polymerases (see Table 15-1). Relative to bacteria, eukaryotes require additional factors to help RNA polymerases find and access promoters in the cell nucleus. This is because eukaryotic DNA is packaged into chromatin through the formation of nucleosomes (see Chapter 10). In addition, the sheer size of eukaryotic genomes and the large number of promoters to be sorted through probably require additional transcription machinery.

We begin with a brief discussion of all three eukaryotic polymerases and their promoters, and then focus on Pol II transcription. As the polymerase responsible for transcribing the genes that encode proteins, Pol II is the most extensively studied of the three eukaryotic polymerases.

Eukaryotic Polymerases Recognize Characteristic Promoters

Each of the three types of RNA polymerase that make up the eukaryotic transcription machinery transcribes only certain classes of genes, and thus each type binds to specific and distinct promoter sequences. Pol I binds to a single type of promoter that controls the expression of the pre-ribosomal RNA (pre-rRNA) transcript, from which rRNAs are derived. Pol II, which synthesizes mRNAs, microRNAs, and some other noncoding RNAs, can recognize thousands of promoters that vary greatly in sequence. Pol III recognizes well-characterized promoter sequences for tRNAs, the 5S rRNA, and some other small regulatory RNAs, sequences that in many cases are located within the transcribed region itself rather than in more conventional locations upstream from the RNA start site.

Although each polymerase works with its own unique set of transcription factors, all three types use a factor called the TATA-binding protein (TBP). This protein, so-named because of its binding to a 5′-TATAAA sequence (known as the TATA box) near position −30, plays a major role in transcription initiation. Genomic sequencing studies have shown that only about a quarter of human genes include a TATA box in the core promoter, the region responsible for recruiting the essential transcription machinery. Nonetheless, TBP is used for transcription initiation of all genes, and in most of those that lack a TATA box, TBP is recruited to the gene through proteins called TBP-associated factors (TAFs) that recognize other promoter sequences. A summary of the eukaryotic polymerases and the types of RNA produced by each, along with their promoter elements, is given in Table 15-3.

Table 15-3: Eukaryotic RNA Polymerases and Promoter Elements

RNA Polymerase I Promoters. Synthesis of pre-rRNA accounts for 80% of all the transcription in eukaryotic cells (as measured for the yeast S. cerevisiae). The precursor rRNA transcript produced by mammalian Pol I is processed into the mature 5.8S, 18S, and 28S rRNAs, which, together with the 5S rRNA transcribed by Pol III, are the major catalytic and architectural components of the ribosome (discussed in Chapter 18). Transcription of rRNA genes, which occurs in the nucleolus, begins with the recruitment and assembly of Pol I and transcription factors into a multiprotein complex at the rRNA gene promoter. The promoter includes a core sequence, essential for accurate transcription initiation, and an upstream control element (UCE), located 100 to 150 bp upstream from the transcription start site (Figure 15-19).

Figure 15-19: The Pol I promoter. The upstream binding factor (UBF) binds to the core sequence and the upstream control element (UCE). SL1, a protein complex that includes the TATA-binding protein, binds to UBF and Pol I, promoting transcription initiation.

The number of rRNA genes varies among organisms. Pol I promoter sequences also vary, but within a species, all Pol I promoters are the same. Low levels of transcription can be observed in the presence of a preinitiation complex comprising Pol I and selectivity factor 1 (SL1), which is a complex of TBP and three TAFs. Higher levels of transcription require, in addition to Pol I and SL1, an upstream binding factor (UBF). UBF binds to both the UCE and the core promoter and to SL1, stabilizing the complex with Pol I and helping recruit the polymerase to the promoter.

RNA Polymerase II Promoters. Many Pol II promoters share certain sequence features, including a TATA box near −30 and an initiator sequence (Inr) near the RNA start site at +1 (Figure 15-20). Pol II promoters also sometimes include a sequence upstream from the TATA box, called a TFIIB recognition element (BRE), and a sequence downstream from the initiator, the downstream promoter element (DPE). Together, these sequences make up the core promoter. Other sequences, such as upstream promoter elements and enhancers, are also needed for efficient Pol II recognition and transcription in the cell. These regulatory sequences, which can be located many thousands of base pairs away from the promoter they influence, bind a variety of specific transcription factors that either activate or repress transcription, depending on various stimuli. The proteins that bind to the elements in the promoter are discussed shortly.

Figure 15-20: The Pol II core promoter. The TATA box, initiator sequence (Inr), or other sequence elements recognized by proteins that bind to the polymerase are required for transcription by Pol II. The TFIIB recognition element (BRE) and downstream promoter element (DPE) may also be involved in initiation.
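
The positional logic of the core promoter (a TATA box near −30, counted from the start site at +1) can be illustrated with a short script. The following Python sketch is only an illustration and not a promoter-prediction method: it reports whether the consensus TATAAA begins within a few base pairs of −30. The function name, the ±5 bp tolerance, and the toy sequence are assumptions made for this example and do not come from the text.

```python
def find_tata_box(dna, tss_index, center=-30, tolerance=5, motif="TATAAA"):
    """Return the position (relative to +1) where `motif` begins, or None.

    dna       -- coding-strand sequence written 5'->3'
    tss_index -- 0-based index in `dna` of the +1 nucleotide
    center, tolerance, motif -- illustrative defaults (assumptions)
    """
    dna = dna.upper()
    for rel in range(center - tolerance, center + tolerance + 1):
        start = tss_index + rel          # rel is negative: upstream of +1
        if start < 0:
            continue                     # window runs off the 5' end
        if dna[start:start + len(motif)] == motif:
            return rel
    return None


# Toy promoter (hypothetical): 20 nt filler, TATAAA, 24 nt filler, then +1.
toy_promoter = "GC" * 10 + "TATAAA" + "GC" * 12 + "ATGGCC"
toy_tss = 50                              # index of the +1 nucleotide

print(find_tata_box(toy_promoter, toy_tss))   # -30
print(find_tata_box("GC" * 40, toy_tss))      # None
```

Because only about a quarter of human genes carry a TATA box, a result of None from a scan like this would be the common case rather than a sign that a gene lacks a promoter.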

KEY CONVENTION

The nomenclature for transcription factors indicates which RNA polymerase is involved. TFII is a transcription factor for RNA polymerase II, and TFIII is a transcription factor for RNA polymerase III. Individual factors are distinguished by an appended A, B, C, and so on (e.g., TFIIA, TFIIIB).
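
The naming convention is regular enough to be decoded mechanically. The Python sketch below is a minimal illustration of that convention, with a hypothetical helper name (`parse_general_tf`) and a deliberately narrow pattern that covers only names of the form TFII or TFIII followed by one letter; it is not meant to handle every factor name found in the literature.

```python
import re

# "TF" + Roman numeral II or III (the polymerase) + one capital letter.
_TF_NAME = re.compile(r"^TF(I{2,3})([A-Z])$")


def parse_general_tf(name):
    """Split a name such as 'TFIIIB' into ('Pol III', 'B')."""
    match = _TF_NAME.match(name)
    if match is None:
        raise ValueError(f"not a TFII/TFIII-style name: {name!r}")
    numeral, letter = match.groups()
    return f"Pol {numeral}", letter


print(parse_general_tf("TFIIA"))    # ('Pol II', 'A')
print(parse_general_tf("TFIIIB"))   # ('Pol III', 'B')
```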

RNA Polymerase III Promoters. Pol III is the largest RNA polymerase, with the greatest number of subunits. All of its transcription products are short, untranslated RNAs, most less than 300 nucleotides long. In addition to 5S rRNA, they include tRNAs; 7SL RNA, which is required for introducing proteins into membranes as part of the signal recognition particle (see Chapter 18); and several RNAs involved in mRNA, tRNA, and rRNA processing. Perhaps reflecting their varied gene products, Pol III promoters differ in sequence and in components. The promoters of tRNA genes include two segments, Box A and Box B, located a short distance apart within the tRNA-coding sequence (Figure 15-21a). The 5S rRNA gene promoter includes Box A and Box C (Figure 15-21b). Other promoters contain the TATA box to which TBP can bind directly, just as for Pol II promoters.

Figure 15-21: Pol III promoters. Pol III promoters are found within genes. (a) The Pol III tRNA promoter uses the Box A and Box B sequence elements and is bound by transcription factors TFIIIB and TFIIIC. (b) The Pol III 5S rRNA promoter uses Box A and Box C, as well as TFIIIB, TFIIIC, and TFIIIA. Together, these factors recruit Pol III to the transcription start site.

Like the other eukaryotic polymerases, Pol III requires transcription factors. The tRNA genes require TFIIIB and TFIIIC, whereas the 5S rRNA gene requires TFIIIB, TFIIIC, and TFIIIA. Transcription of tRNA genes begins when TFIIIC binds to the promoter boxes within the gene, and then recruits TFIIIB. TFIIIB includes TBP and recognizes the DNA just upstream from the transcription start site. Together, these factors recruit Pol III to the transcription start site; TFIIIC is transiently displaced as the polymerase transcribes through its binding site in the DNA. In 5S rRNA transcription, TFIIIA binds to the DNA within the transcribed region and helps recruit TFIIIC.

Pol II Transcription Parallels Bacterial RNA Transcription

Pol II–catalyzed transcription is responsible for producing all mRNAs in the eukaryotic cell, as well as transcripts, such as microRNAs, that can base-pair with mRNAs and help regulate their expression (see Chapter 22). Consisting of 12 subunits, Pol II is strikingly more complex than its bacterial counterpart, yet it has remarkable similarities in structure, function, and mechanism (Figure 15-22). The largest subunit (RPB1) exhibits a high degree of homology to the β′ subunit of bacterial RNA polymerase. Another subunit (RPB2) is structurally similar to the bacterial β subunit, and two others (RPB3 and RPB11) show some structural homology to the two bacterial α subunits (see Table 15-1). Pol II must function with genomes that have multiple chromosomes and with DNA molecules more elaborately packaged than those in bacteria. The need for protein-protein interactions with the numerous other protein factors required to navigate this labyrinth largely accounts for the added complexity of Pol II and the other eukaryotic polymerases.

Figure 15-22: Bacterial RNA polymerase and eukaryotic Pol II structural elements. Although Pol II has more subunits with additional components, it has obvious structural similarities to bacterial RNA polymerase. The numbers on the Pol II subunits indicate RPB1, RPB2, and so forth.

An overview of the Pol II transcription complex is shown in Figure 15-23. Playing a role much like that of sigma factors in helping bacterial RNA polymerase recognize and bind promoter sequences, general transcription factors associate with promoter DNA and recruit Pol II to form a preinitiation complex (Figure 15-23a, step 1). The preinitiation complex is converted to an initiation complex by unwinding the DNA (step 2). During initiation (step 3), the C-terminal domain (CTD) of Pol II is phosphorylated and some transcription factors are released (Figure 15-23b). Elongation (step 4) proceeds as in bacteria. Transcription is terminated (step 5) and the Pol II CTD is dephosphorylated. Each step is associated with characteristic proteins.

Figure 15-23: Transcription at Pol II promoters. (a) The phases of transcription by Pol II—assembly, initiation, elongation, and termination—are associated with characteristic proteins, as described in the text. The ordered assembly and dissociation of these factors drives the process forward. (b) After association of transcription factors with Pol II on DNA to form an open initiation complex, some factors dissociate to enable transcription elongation, termination, and recycling.

Transcription Factors Play Specific Roles in the Transcription Process

Photographs: Paul Sigler (1934–2000) and Stephen Burley.

The transcription initiation mechanism has been most extensively studied for Pol II. Recruitment begins with binding of the TATA box by TFIID. Like many transcription factors, TFIID is a multiprotein complex, which includes TBP and TAFs. The TAFs fine-tune TFIID by changing the affinity of TBP for DNA, helping the transcription factor bind certain promoters. Once the TBP-DNA interaction is stabilized, other transcription factors, and Pol II itself, can stably associate with the promoter to form the preinitiation complex.

The discovery of TBP and its importance as a general transcription factor required for all transcription by Pol II raised questions about how and why it binds so specifically to the TATA element. This mystery of the transcription initiation process was solved when the research groups of Paul Sigler and Stephen Burley independently determined the molecular structure of TBP bound to DNA, using x-ray crystallography. The structure revealed that TBP sits on the DNA double helix much like a saddle, with an extended β sheet and loop “stirrups” in contact with the minor groove of the TATA box sequence (Figure 15-24). This unconventional mode of DNA recognition—most DNA-binding proteins recognize DNA by inserting α helices into the major groove—bends the DNA by positioning two pairs of Phe residue side chains between base pairs at each end of the recognition sequence. The bending opens and widens the minor groove, enabling hydrogen bonding between protein side chains and the minor-groove edges of the DNA bases. The observed helical bending explains why A=T base pairs are favored in the recognition sequence: they are more easily distorted to allow opening of the minor groove. Because TBP is used by all three classes of eukaryotic polymerases, a similar mechanism may account for promoter recognition in all cases.

Figure 15-24: The crystal structure of a TBP-DNA complex. TATA-binding protein (TBP) bends the TATA box sequence, opening the minor groove to allow sequence-specific hydrogen bonding.

Photograph: Robert Roeder.

In addition to TBP, Pol II requires an array of transcription factors to form an active transcription complex. The general transcription factors required at every Pol II promoter are highly conserved in all eukaryotes. Using cell-free systems pioneered by Robert Roeder at Rockefeller University, in which purified proteins were added back to the reaction mix to reconstitute active transcription complexes, it was possible to determine the identity and order of proteins needed for transcription initiation. When TBP, as part of TFIID, binds to the TATA box, it is bound in turn by the transcription factor TFIIB, which binds a larger site on the DNA than TBP alone. A third transcription factor, TFIIA, is not always essential in experiments using purified proteins to monitor transcription. However, mutant cells that lack TFIIA are not viable, showing that this factor is essential in vivo. TFIIA, when it binds to TFIID, unmasks TBP and enables it to bind efficiently to the TATA box. Pol II is bound to the complex through a mutual interaction with TFIIF. Finally, TFIIE and TFIIH bind to create the closed complex, analogous to the closed complex described for bacterial RNA polymerase (see Section 15.2). TFIIH has DNA helicase activity that promotes unwinding of the DNA near the transcription start site. This unwinding creates an open complex that is competent to begin transcription. Counting all the subunits of the various essential factors (excluding TFIIA), this minimal active assemblage has more than 30 polypeptides!
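
The ordered assembly described above can be written down as a simple dependency chain. The Python sketch below is a deliberately simplified model under stated assumptions: assembly is treated as strictly linear, Pol II and TFIIF are treated as joining in one step, TFIIE and TFIIH are grouped in a final step, and TFIIA is omitted because it is not always required with purified proteins. The names `ASSEMBLY_STEPS` and `assemble` are hypothetical.

```python
# Steps in the order given in the text; each tuple must be fully available
# before the next step can occur (a simplifying assumption).
ASSEMBLY_STEPS = [
    ("TFIID",),            # contains TBP; binds the TATA box
    ("TFIIB",),            # binds TBP and a larger DNA site
    ("Pol II", "TFIIF"),   # polymerase joins through TFIIF
    ("TFIIE", "TFIIH"),    # complete the closed complex
]


def assemble(available):
    """Walk the steps in order and stop at the first missing factor.

    Returns (bound_factors, closed_complex_formed).
    """
    bound = []
    for step in ASSEMBLY_STEPS:
        if not all(factor in available for factor in step):
            return bound, False
        bound.extend(step)
    return bound, True


all_factors = {"TFIID", "TFIIB", "Pol II", "TFIIF", "TFIIE", "TFIIH"}
print(assemble(all_factors))
print(assemble(all_factors - {"TFIIB"}))   # (['TFIID'], False)
```

Removing a factor from the set mimics the add-back logic of a reconstitution experiment: the model reports how far assembly proceeds before it stalls.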

Once assembled on a promoter, Pol II typically produces a few abortive transcripts before entering the elongation phase of transcription—behavior similar to that of bacterial RNA polymerase. In contrast to bacterial polymerase, however, Pol II must be chemically modified by the addition of phosphate groups to its CTD to disengage from the promoter and begin elongating a transcript.

The CTD, part of the largest polymerase subunit, consists of multiple repeats of the seven-amino-acid sequence Tyr-Ser-Pro-Thr-Ser-Pro-Ser. In yeast and animals, the CTD functions mainly as a docking platform to recruit transcription and processing factors during the transcription cycle. CTD-associated factors have a variety of functions, including catalysis of mRNA 5′ capping and 3′-end processing, pre-mRNA splicing, and histone modification. To recruit all these different proteins, the CTD uses distinct chemical codes. Reversible phosphorylations of the serines in the second and fifth positions (Ser2 and Ser5) of the CTD repeat sequence are the primary CTD codes and are crucial for regulating transcription and the binding of mRNA processing factors. The enzymes responsible for these phosphorylations, the CTD kinases, are conserved from yeast to metazoans. To enable recruitment of other kinds of proteins, the CTD undergoes additional modifications, including phosphorylation of Tyr1, Ser7, and Thr4, as well as cis-trans isomerization of Pro3 and Pro6, in the CTD repeat sequence.
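
The residue numbering used for the CTD code (Ser2, Ser5, and so on) refers to positions within each seven-residue repeat, which a short script can make concrete. The Python sketch below is a minimal illustration: it assumes a toy CTD of three perfect consensus repeats written in one-letter code (YSPTSPS), skips any repeat that deviates from the consensus, and uses hypothetical names (`HEPTAD`, `ctd_sites`).

```python
HEPTAD = "YSPTSPS"   # Tyr-Ser-Pro-Thr-Ser-Pro-Ser in one-letter code
THREE_LETTER = {"Y": "Tyr", "S": "Ser", "P": "Pro", "T": "Thr"}


def ctd_sites(ctd_sequence, positions=(2, 5)):
    """List (repeat_number, residue_label, 0-based index) for the chosen
    heptad positions in every consensus repeat of `ctd_sequence`."""
    sites = []
    n = len(HEPTAD)
    for start in range(0, len(ctd_sequence) - n + 1, n):
        repeat = ctd_sequence[start:start + n]
        if repeat != HEPTAD:
            continue                      # ignore non-consensus repeats
        for pos in positions:
            label = f"{THREE_LETTER[repeat[pos - 1]]}{pos}"
            sites.append((start // n + 1, label, start + pos - 1))
    return sites


toy_ctd = HEPTAD * 3                      # three repeats, for illustration only
for site in ctd_sites(toy_ctd):
    print(site)
# First line printed: (1, 'Ser2', 1)
```

Passing `positions=(1, 4, 7)` would list the Tyr1, Thr4, and Ser7 sites instead.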

TFIIE and TFIIH are released during synthesis of the initial 60 to 70 nucleotides of RNA, as Pol II enters the elongation phase of transcription. Notably, phosphorylation of Pol II CTD also influences downstream processing of the RNA transcript, providing a mechanism for coupling transcription to RNA splicing and intracellular transport (as discussed in Chapter 16).

In the elongation phase, polymerase activity is greatly enhanced by elongation factors. They suppress pausing during transcription, enhance polymerase editing of misincorporated bases by hydrolysis (as for bacterial RNA polymerase), and recruit protein complexes involved in posttranscriptional processing of the mRNA. Once the RNA transcript is completed, transcription is terminated. In eukaryotes, termination is often triggered by endonucleases that recognize and cleave specific sequences in the newly synthesized RNA, leading to disassembly and dissociation of the transcription complex. Pol II is then dephosphorylated and recycled, readying it to initiate another transcript (see Figure 15-23).

Transcription Initiation In Vivo Requires the Mediator Complex

As we have seen, the initiation and control of eukaryotic mRNA synthesis requires a large set of evolutionarily conserved general transcription factors that function at most, if not all, genes. These include the initiation factors TFIIB, TFIID (which includes the TATA-binding protein, TBP), TFIIE, TFIIF, and TFIIH, which together constitute the minimal set of helper proteins necessary and sufficient for in vitro selective binding and accurate transcription initiation by Pol II from core promoters. In vivo, in yeast cells, the multiprotein Mediator complex is also required for the regulated transcription of nearly all Pol II–dependent genes. Its presence in humans as well implies a similar central role in Pol II–catalyzed transcription.

Mediator functions as an intermediary between specific transcription factors bound at upstream promoter elements or enhancers and the Pol II complex and general initiation factors bound at the core promoter (Figure 15-25a). First discovered and purified from yeast by Roger Kornberg and his colleagues, Mediator was found to be required for transcriptional activation by specific activators in vitro, using a reconstituted enzyme system containing purified Pol II and general initiation factors. Yeast Mediator has 20 subunits in three distinct subdomains, referred to as the head, middle, and tail modules (Figure 15-25b). An additional module, which includes a kinase enzyme complex, is associated with a subset of yeast Mediator complexes. The presence of the kinase corresponds to repression of a subset of genes, suggesting a role for Mediator in transcriptional down-regulation as well as in activation.

Figure 15-25: The Mediator complex. (a) Mediator helps to bridge distant proteins bound to enhancer sequences and Pol II and its general transcription factors, bound near the transcription start site. (b) Mediator bound to Pol II. The Mediator complex consists of 20 proteins and has three subdomains.

Human Mediator contains a set of consensus subunits similar to those in yeast. As in yeast, multiple forms of Mediator seem to function differently in the transcriptional control of different sets of genes. In particular, the kinase module can exert a repressive effect when associated with the mammalian Mediator, whereas other auxiliary proteins are associated with an activating form of Mediator.

The mechanisms by which Mediator complexes control mRNA synthesis involve direct interactions with DNA-binding transcription activators bound at upstream promoter elements and enhancers, interactions with Pol II, and interactions with one or more of the general initiation factors bound at the core promoter. Mediator supports transcriptional activation, at least in part, by increasing the rate and/or efficiency of assembly of the Pol II preinitiation complex. Mammalian Mediator complexes influence several steps during this assembly, including the recruitment of TFIID (or TBP), Pol II, and the other general initiation factors to the core promoter.

Termination Mechanisms Vary among RNA Polymerases

The three eukaryotic RNA polymerases use different strategies for terminating transcription, although these mechanisms have some aspects in common. The Pol III and Pol I termination pathways seem to be simpler than the Pol II pathway. Pol III terminates transcription at T-rich sequences in the DNA template located a short distance from the 3′ end of the mature RNA, assisted by just a few protein factors. Pol I terminates at a terminator site located downstream from the pre-rRNA sequence and requires terminator recognition by specific protein factors.

In contrast, Pol II termination does not occur at a conserved site or at a constant distance from the 3′ end of mature RNAs. In mammals, it takes place anywhere from a few base pairs to several kilobase pairs downstream from the 3′ end of the mature transcript. The 3′ end of the mature mRNA includes a stretch of A nucleotides, called a poly(A) tail, that is essential for translation into protein (see Chapter 18). A polyadenylation signal sequence (typically AAUAAA) is present in the primary transcript (and directly encoded by the DNA). Factors responsible for cleavage of the primary transcript bind to the AAUAAA sequence, resulting in cleavage somewhat downstream from that position. Only after this cleavage is the poly(A) tail added. Pol II termination is coupled to 3′-end processing of precursor mRNA transcripts, and the intact polyadenylation signal is necessary for termination of transcription of protein-coding genes in human and yeast cells.
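
Locating the polyadenylation signal in a transcript is a plain substring search. The Python sketch below is a minimal illustration with a hypothetical function name and a made-up toy transcript; it finds every occurrence of AAUAAA and accepts DNA-style input by converting T to U. It does not try to predict the cleavage site, which the text places only "somewhat downstream" of the signal.

```python
def find_polya_signals(rna, signal="AAUAAA"):
    """Return the 0-based start index of every occurrence of `signal`."""
    rna = rna.upper().replace("T", "U")   # accept DNA-style input as well
    positions = []
    index = rna.find(signal)
    while index != -1:
        positions.append(index)
        index = rna.find(signal, index + 1)
    return positions


toy_transcript = "GGCCAAUAAAGGCCGGCCGG"     # hypothetical sequence
print(find_polya_signals(toy_transcript))   # [4]
```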

Two different models have been proposed to explain how 3′-end processing contributes to Pol II transcription termination. The first, known as the allosteric or antiterminator model, proposes that transcription through the poly(A) site triggers conformational changes in the Pol II elongation complex caused by the dissociation of elongation factors and/or association of termination factors. This is analogous to the hairpin model of termination in bacteria (see Section 15.2). According to the allosteric model, these conformational changes in Pol II cause it to fall off the DNA template. The second model, the torpedo model, suggests that after mRNA synthesis is complete, Pol II remains associated with the DNA template and continues the transcription reaction to extend the 3′ end of the mRNA. Protein complexes cleave the mRNA at the polyadenylation site, producing a new 3′ end that can be recognized and extended by the enzyme poly(A) polymerase. The new 5′ end of the downstream, or residual, mRNA strand becomes a substrate for an enzyme called Xrn2, a 5′→3′ exonuclease (an exoribonuclease) that attaches to the CTD of Pol II (Figure 15-26). Xrn2 proceeds to degrade the uncapped residual RNA in the 5′→3′ direction until it reaches Pol II. Similar to the ρ factor in ρ-dependent termination in bacteria, Xrn2 triggers dissociation of Pol II by either pushing the polymerase off the DNA template or pulling the template out of the RNA polymerase.

Figure 15-26: Torpedo model for transcription termination by Pol II. The torpedo model hypothesizes that the mRNA transcript is cleaved downstream from the poly(A) addition site by the U7 snRNP. An exonuclease (Xrn2) binds the RNA remaining on the polymerase and degrades the RNA in the 5′→3′ direction, moving closer to the polymerase and eventually causing it to release the mRNA.
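
The torpedo model can be pictured as a chase: Xrn2 starts at the cleavage site and must overtake a polymerase that is still moving. The Python sketch below is a toy kinematic picture and not part of the model as stated in the text. It assumes constant speeds, and every number in it is a hypothetical placeholder rather than a measured rate. Its only purpose is to show why, under this picture, the distance Pol II travels before release need not be fixed.

```python
def torpedo_catch_up(head_start_nt, exo_rate, pol_rate):
    """Time for the exonuclease to reach Pol II, and how far Pol II
    travels in that time. Rates are in nt per unit time (hypothetical).

    Returns None if the exonuclease is not faster than the polymerase.
    """
    if exo_rate <= pol_rate:
        return None                        # the torpedo never catches up
    time = head_start_nt / (exo_rate - pol_rate)
    return time, pol_rate * time


# Purely illustrative numbers: a 100-nt head start, exonuclease at 30 and
# polymerase at 20 nt per unit time.
print(torpedo_catch_up(100, 30, 20))   # (10.0, 200.0)
print(torpedo_catch_up(100, 20, 20))   # None
```

In this toy picture, a slower or paused polymerase is caught sooner, so it travels a shorter distance past the cleavage site before release.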

Transcription Is Coupled to DNA Repair, RNA Processing, and mRNA Transport

In eukaryotes, transcription is coupled to other activities, including the repair of damaged DNA and various kinds of RNA processing and transport events. Researchers noticed that DNA damage repair and mRNA processing and transport are more efficient for genes that are actively being transcribed. Furthermore, DNA lesions in the template strand are repaired somewhat more efficiently than lesions in the coding (nontemplate) strand. For DNA repair, these remarkable observations are explained by the alternative functions of the TFIIH subunits. Not only does TFIIH participate in forming the closed complex during assembly of a transcription complex, but some of its subunits are also essential components of the separate nucleotide excision repair complex (see Chapter 12). When Pol II transcription stalls at the site of a DNA lesion, TFIIH reassociates with the DNA and transcription machinery. TFIIH can then interact with the lesion and recruit the entire nucleotide excision repair complex. Mutations causing deletion of certain TFIIH subunits produce human diseases. Two examples are xeroderma pigmentosum, with its associated photosensitivity and tumor susceptibility, and Cockayne syndrome, which is characterized by arrested growth, photosensitivity, and neurological disorders.

Eukaryotic mRNA is processed in a variety of ways before it is shipped across the nuclear membrane to the cytoplasm for translation. We discuss the mechanisms of these processing events in Chapter 16, but it is important to note here that like DNA repair, mRNA processing is naturally linked to transcription. This is possible because some of the same proteins required for elongating RNA transcripts are also required for 5′-end processing (5′ capping) of the RNA. Because these activities are coupled, transcripts can be processed as they are synthesized.

SECTION 15.3 SUMMARY

  • The RNA polymerases of eukaryotes (Pol I, II, and III) share some structural and functional features with bacterial RNA polymerase, but they are much larger and require additional proteins—transcription factors—to begin efficient transcription at promoter sequences.

  • Pol I, II, and III recognize distinct promoter sequences and require unique sets of transcription factors, with the exception of TATA-binding protein (TBP), which is used by all three polymerases.

  • As in bacteria, transcription initiation in eukaryotes is highly regulated and includes multiple steps that lead to assembly of an active polymerase complex at a promoter. Pol II transcription (the most studied) proceeds through distinct phases of assembly, initiation, elongation, and termination.

  • In eukaryotes, the Pol II C-terminal domain must be phosphorylated before transcription can proceed from initiation to elongation.

  • Transcriptional regulation in eukaryotes is enhanced by Mediator, a large protein complex that interacts simultaneously with general transcription factors associated with Pol II and specific transcription factors associated with upstream promoter elements.

  • Two hypotheses for transcription termination suggest a role for mRNA sequence elements and for an exonuclease, respectively.

  • TFIIH, a eukaryotic transcription initiation factor, can start nucleotide excision repair of DNA when Pol II encounters a lesion in the template strand. Transcription and processing of mRNA are coupled, because some Pol II transcription factors are also required for pre-mRNA processing events.

UNANSWERED QUESTIONS

Many details of transcription mechanisms are known, but future challenges include discovering how, where, and when transcripts are made and how they are used in cells for functions beyond encoding and synthesizing proteins.

  1. How does RNA polymerase coordinate with other enzymes and regulators during gene expression? The pausing of RNA polymerase during transcription is thought to help the enzyme coordinate with other steps in the protein-producing pathway. How does this work, and do proteins such as RNA-modifying enzymes recognize paused transcripts as substrates? Understanding these mechanisms and how they differ in bacteria and humans will provide basic information about transcription and help define steps that could be disrupted to block bacterial growth, thus serving as good antibacterial drug targets.

  2. What is the mechanism of promoter sequence recognition? Are there other promoter sequences that haven’t yet been identified? These questions are especially relevant given the explosion in numbers of non-protein-coding RNA transcripts produced by Pol II that are now being discovered. Pol II seems to transcribe much of the human genome at low levels. How is the polymerase recruited to the DNA for this purpose? Perhaps Pol II uses its weak, nonspecific DNA-binding affinity, or perhaps it can transcribe past termination signals at some frequency. How is such transcription controlled?

  3. How exactly is transcription terminated? This is still not well understood, especially in eukaryotes. If this process were better understood, it might be possible to exploit it for therapeutic purposes, such as inducing early termination of viral transcripts.

RNA Polymerase Is Recruited to Promoter Sequences

Dynan, W.S., and R. Tjian. 1983. Isolation of transcription factors that discriminate between different promoters recognized by RNA polymerase II. Cell 32:669–680.

Dynan, W.S., and R. Tjian. 1983. The promoter-specific transcription factor Sp1 binds to upstream sequences in the SV40 early promoter. Cell 35:79–87.

William Dynan

One of the first experiments to demonstrate how promoters are recognized by RNA polymerase II involved separating the contents of cultured human HeLa cells into the components required for accurate gene transcription in vitro. Bill Dynan and Bob Tjian used this system to find that Sp1 is a promoter-specific transcription factor required to recruit Pol II to only certain kinds of genes. Using genes from two different mammalian viruses, the monkey virus SV40 and the human adenovirus, Dynan and Tjian found that Sp1 recruited Pol II to SV40 genes, but not to adenovirus genes (Figure 1). When SV40 and adenovirus DNA templates were present together in an in vitro transcription reaction, addition of Sp1 stimulated early promoter transcription of the SV40 DNA 40-fold, whereas promoter transcription of adenovirus DNA was inhibited by 40%. This finding suggested that Sp1 is involved in promoter selection and is not merely a stimulatory general transcription factor.

FIGURE 1 Sp1 activates transcription of SV40 DNA, but not human adenovirus DNA. Purified RNA polymerase and increasing amounts of Sp1 were added to a mixture of DNA containing an SV40 promoter and an adenovirus promoter; transcripts initiated from each promoter were separated and analyzed by gel electrophoresis.
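The promoter selectivity implied by the 40-fold stimulation and 40% inhibition reported above can be put into a single number with a little arithmetic. The sketch below is illustrative only: the function name `promoter_selectivity` is hypothetical, and it assumes that transcription from each promoter is normalized to 1.0 in the absence of Sp1, which the original experiment does not necessarily imply.

```python
def promoter_selectivity(fold_stimulation, fraction_inhibited):
    """Ratio of relative transcription from a stimulated promoter to that
    from an inhibited promoter.

    Assumption (not from the source): both promoters start at a normalized
    level of 1.0 without the factor, so the stimulated promoter ends at
    `fold_stimulation` and the inhibited one at `1 - fraction_inhibited`.
    """
    stimulated_level = 1.0 * fold_stimulation
    inhibited_level = 1.0 * (1.0 - fraction_inhibited)
    return stimulated_level / inhibited_level


# Values taken from the text: 40-fold stimulation (SV40 early promoter),
# 40% inhibition (adenovirus promoter).
selectivity = promoter_selectivity(40.0, 0.40)
print(f"Sp1 shifts the SV40:adenovirus transcription ratio about {selectivity:.0f}-fold")
```

Under these assumptions, adding Sp1 changes the ratio of SV40 to adenovirus transcription by roughly 67-fold, which is why the result points to promoter selection and not general stimulation.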

Further experiments using deletion mutants of the SV40 promoter showed that transcriptional activation by Sp1 required sequences within tandem 21 bp repeats located 70 to 110 bp upstream from the transcription initiation site. DNA footprinting revealed that DNA sequences within the 21 bp repeat region were bound by Sp1 (Figure 2). In this experiment, SV40 promoter-containing DNA was incubated with increasing amounts of a cell extract enriched with the Sp1 protein. DNase I, a nuclease, was then added to digest any DNA not protected by bound protein. As the protein concentration increased, a pronounced region of the DNA around the 21 bp repeat became resistant to DNase I digestion, revealing the “footprint” left by the binding of Sp1 to the DNA.

FIGURE 2 Sp1 leaves its footprint on a promoter. (a) The Sp1 footprint, seen as bands in the gel that decrease in intensity as the Sp1 concentration increases, is visible at sites flanking positions −21 and −42. The band that increases in intensity indicates a base pair in the DNA that becomes more susceptible to cleavage by DNase I on Sp1 binding, hinting at a change in DNA structure induced by protein binding. (b) Sp1 binds SV40 DNA near positions −21 and −42.
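The logic of reading a footprinting gel can be sketched in a few lines of code: a band that fades as protein is added marks a protected position, and a band that intensifies marks a position made more accessible to DNase I. The example below is a minimal sketch, not the analysis used by Dynan and Tjian. The function name `classify_bands`, the ratio thresholds, and all intensity values and position labels are invented for illustration.

```python
def classify_bands(intensities, protected_ratio=0.5, enhanced_ratio=1.5):
    """Classify each cleavage position by how its band intensity changes
    from the lane with no protein (first value) to the lane with the most
    protein (last value).

    `intensities` maps a position label to a list of band intensities,
    ordered by increasing protein concentration. The first intensity must
    be nonzero. Thresholds are arbitrary illustrative choices.
    """
    result = {}
    for position, lanes in intensities.items():
        ratio = lanes[-1] / lanes[0]
        if ratio <= protected_ratio:
            result[position] = "protected"       # band fades: protein blocks DNase I
        elif ratio >= enhanced_ratio:
            result[position] = "hypersensitive"  # band strengthens: cleavage enhanced
        else:
            result[position] = "unchanged"
    return result


# Synthetic intensities (arbitrary units) for five hypothetical positions,
# measured in four lanes with increasing amounts of protein.
gel = {
    "site_A": [100, 98, 102, 99],
    "site_B": [100, 70, 40, 15],
    "site_C": [100, 60, 30, 10],
    "site_D": [100, 130, 170, 210],
    "site_E": [100, 101, 97, 100],
}

calls = classify_bands(gel)
footprint = [pos for pos, call in calls.items() if call == "protected"]
print(calls)
print("Footprint:", footprint)
```

In this synthetic data set, the adjacent protected positions (`site_B` and `site_C`) form the footprint, and `site_D` plays the role of the band that becomes more intense on protein binding.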

This was an exciting result, because it indicated the presence of a specific site for Sp1 binding. Furthermore, there was a correlation between this promoter-binding activity and consequent transcription stimulation. The results suggested that Sp1 activated transcription by Pol II at the SV40 early promoter by direct binding of the Sp1 to sequences in the upstream activator sequence.

RNA Polymerases Are Both Fast and Slow

Neuman, K.C., E.A. Abbondanzieri, R. Landick, J. Gelles, and S.M. Block. 2003. Ubiquitous transcriptional pausing is independent of RNA polymerase backtracking. Cell 115:437–447.

Shaevitz, J.W., E.A. Abbondanzieri, R. Landick, and S.M. Block. 2003. Backtracking by single RNA polymerase molecules observed at near-base-pair resolution. Nature 426:684–687.

Molecular biologists have noticed that rather than transcribing DNA at a constant pace, an RNA polymerase hesitates at certain sites as it moves along the template. However, because the individual polymerases in a solution are not synchronized, the kinetics of pausing are difficult to study.

To circumvent this problem, Stephen Block, Bob Landick, and their coworkers chemically attached transcription elongation complexes to polystyrene beads, one polymerase to a bead. They used antibodies to attach one end of the template DNA to the surface of a microscope stage (Figure 3a), and used a laser trap to keep the bead (and RNA polymerase) in a fixed position while moving the stage (and the DNA) away, so that the DNA was held taut under a constant force. They monitored the motion of the bead (and polymerase) with respect to the stage surface as the DNA was threaded through the elongation complex. This system was used to assess the force on the bead required to counteract the motion of the RNA polymerase. From these measurements, pause and arrest sites on the DNA could be mapped, and the maximal speed reached by RNA polymerase between two pause sites was measured.

FIGURE 3 Single-molecule analysis determines the velocity of an RNA polymerase along a DNA and monitors pausing. (a) Representation of the experimental setup (not to scale). Transcribing RNA polymerase with nascent RNA is chemically attached to a polystyrene bead, and the upstream end of the duplex DNA is attached through an antibody linkage to a moveable microscope stage. The bead is held by a laser trap at a predetermined position, which results in a restoring force exerted on the bead. (b) Representative record of position and velocity for a single polymerase molecule transcribing a 3,500 bp DNA template with <18 piconewtons of hindering force. Pausing occurs on multiple timescales, with distinct pauses of seconds-long duration and shorter pauses of about 1 second, as seen in the expanded portion of the trace (arrows point to times when the polymerase transitions from paused to elongating state).

Although the experiments were conducted on single molecules, thousands of recordings were made, enabling the investigators to compare individual polymerase complexes. The results showed that RNA polymerase molecules alternate between constant-velocity transcription and pausing. The velocities of individual polymerase molecules typically displayed a bimodal distribution, with one peak corresponding to the rate of transcription between pauses and a second peak, near 0 bp/s, corresponding to the pauses themselves (Figure 3b). This study of individual elongation complexes provided direct evidence that RNA polymerases have different intrinsic speeds. The coexistence of slower and faster polymerases might explain how regulatory proteins modulate the behavior of elongation complexes during transcription, increasing or decreasing overall transcription rates in response to the needs of the cell.
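The way a single-molecule position record separates into pauses and constant-velocity stretches can be illustrated with a short sketch. This is not the analysis pipeline used in the cited studies. The function names, the velocity threshold, the minimum pause length, and the synthetic trace are all assumptions made for illustration.

```python
from statistics import mean


def velocities(positions_bp, dt_s):
    """Finite-difference velocity (bp/s) between consecutive position samples."""
    return [(b - a) / dt_s for a, b in zip(positions_bp, positions_bp[1:])]


def find_pauses(vel, threshold_bp_s=2.0, min_samples=2):
    """Return (start, end) index pairs, end exclusive, for runs of at least
    `min_samples` consecutive velocities below `threshold_bp_s`.
    Negative velocities (backward motion) also count as paused here.
    """
    pauses = []
    start = None
    for i, v in enumerate(vel):
        if v < threshold_bp_s:
            if start is None:
                start = i
        else:
            if start is not None and i - start >= min_samples:
                pauses.append((start, i))
            start = None
    if start is not None and len(vel) - start >= min_samples:
        pauses.append((start, len(vel)))
    return pauses


def elongation_rate(vel, pauses):
    """Mean velocity (bp/s) over the samples that fall outside every pause."""
    paused = set()
    for start, end in pauses:
        paused.update(range(start, end))
    moving = [v for i, v in enumerate(vel) if i not in paused]
    return mean(moving) if moving else 0.0


# Synthetic position record (bp along the template), sampled once per second:
# steady movement, a pause of a few seconds, then steady movement again.
trace = [0, 10, 20, 30, 30, 30, 31, 41, 51, 61]
vel = velocities(trace, 1.0)
pauses = find_pauses(vel)
rate = elongation_rate(vel, pauses)
print("Velocities (bp/s):", vel)
print("Pauses (sample index ranges):", pauses)
print("Rate between pauses (bp/s):", rate)
```

A histogram of `vel` for this synthetic trace has two clusters, one near 0 bp/s and one at the rate between pauses, which is the bimodal pattern described for real polymerase records. Real traces are noisy, so in practice the positions would need smoothing before any threshold is applied.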
