37.2 RNA Polymerase II Requires Complex Regulation

✓ 3 Describe how transcription is regulated in eukaryotes.

Figure 37.4: The TATA box. Comparisons of the sequences of more than 100 eukaryotic promoters led to the consensus sequence shown. The subscripts denote the frequency (%) of the base at that position.

The elaborate regulation of RNA polymerase II accounts for cell differentiation and development in higher organisms. Consequently, we will focus our attention on this vital enzyme. Promoters for RNA polymerase II, like those for bacterial polymerases, are generally located on the 5′ side of the start site for transcription. The most commonly recognized cis-acting element for genes transcribed by RNA polymerase II is called the TATA box on the basis of its consensus sequence (Figure 37.4). The TATA box is usually centered at ∼ −25. Note that the eukaryotic TATA box closely resembles the bacterial −10 sequence (TATAAT) but is farther from the start site. The mutation of a single base in the TATA box markedly impairs promoter activity. Thus, the precise sequence, not just a high content of AT pairs, is essential.
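To make the idea of a consensus sequence centered near −25 concrete, the following Python sketch scans the region upstream of a start site for a TATA-like motif. It is only an illustration: the pattern TATA[AT]A[AT] is an assumed stand-in for the consensus of Figure 37.4, and the function names, search window, and example sequence are hypothetical.

```python
import re

# Assumed stand-in for the TATA-box consensus (Figure 37.4).
# The lookahead lets overlapping candidate motifs be reported as well.
TATA_PATTERN = re.compile(r"(?=(TATA[AT]A[AT]))")

def relative_position(index, tss_index):
    """Convert a 0-based index to promoter numbering.

    The nucleotide at tss_index is +1 and the one just upstream is -1;
    there is no position 0.
    """
    offset = index - tss_index
    return offset + 1 if offset >= 0 else offset

def find_tata_boxes(seq, tss_index, window=(-40, -15)):
    """Return (position, motif) pairs for TATA-like motifs whose center
    falls inside `window`, in coordinates relative to the start site.

    seq is the coding strand written 5' to 3'.
    """
    hits = []
    for match in TATA_PATTERN.finditer(seq.upper()):
        motif = match.group(1)
        center = match.start() + len(motif) // 2
        position = relative_position(center, tss_index)
        if window[0] <= position <= window[1]:
            hits.append((position, motif))
    return hits

# Hypothetical sequence with a TATA-like motif centered at -25.
example = "G" * 20 + "TATAAAA" + "C" * 21 + "A" + "G" * 10
print(find_tata_boxes(example, tss_index=48))
```

Changing a single base of the motif in the example (for instance, TATAAAA to TAGAAAA) makes the exact-match search return an empty list. Real promoters are impaired to varying degrees rather than in this all-or-nothing fashion, so the sketch captures only the point that the specific sequence matters, not just AT content.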

The TATA box is often paired with an initiator element (Inr), a sequence found at the transcriptional start site located at ∼ +1. This sequence defines the start site because the other promoter elements are at variable distances from that site. Its presence increases transcriptional activity.

A third element, the downstream core promoter element (DPE), is commonly found in conjunction with the Inr in promoters that lack a TATA box. In contrast with the TATA box, the DPE is found downstream of the start site, at ∼ +30.

Figure 37.5: The CAAT and GC boxes. Consensus sequences for the CAAT and GC boxes of eukaryotic promoters for mRNA precursors. N signifies that any nucleotide can occupy the position.

Additional regulatory sequences are located between −40 and −150. Many promoters contain a CAAT box, and some contain a GC box (Figure 37.5). Constitutive genes (genes that are continuously expressed rather than regulated) tend to have GC boxes in their promoters. The positions of these upstream sequences vary from one promoter to another, in contrast with the location of the −35 region in bacteria. Another difference is that the CAAT box and the GC box can be effective when present on the template (antisense) strand, unlike the −35 region, which must be present on the coding (sense) strand. These differences between bacteria and eukaryotes correspond to fundamentally different mechanisms for the recognition of cis-acting elements. The −10 and −35 sequences in bacterial promoters are binding sites for RNA polymerase and its associated σ factor. In contrast, the TATA, CAAT, and GC boxes and other cis-acting elements in eukaryotic promoters are recognized by proteins other than RNA polymerase itself.
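The observation that the CAAT and GC boxes work on either strand can be pictured as a search of both strands of a promoter. The sketch below is a minimal illustration, assuming GGNCAATCT and GGGCGG as the consensus sequences of Figure 37.5; the function names and test sequences are hypothetical.

```python
import re

# Assumed consensus sequences for Figure 37.5; N stands for any nucleotide.
CONSENSUS = {"CAAT box": "GGNCAATCT", "GC box": "GGGCGG"}

COMPLEMENT = str.maketrans("ACGTN", "TGCAN")

def reverse_complement(seq):
    """Return the opposite strand, written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]

def to_regex(consensus):
    """Turn a consensus string into a regular expression (N -> any base)."""
    return re.compile(consensus.replace("N", "[ACGT]"))

def find_elements(coding_strand):
    """List (element, strand) pairs for each consensus found on either strand."""
    coding = coding_strand.upper()
    template = reverse_complement(coding)
    found = []
    for name, consensus in CONSENSUS.items():
        pattern = to_regex(consensus)
        if pattern.search(coding):
            found.append((name, "coding"))
        if pattern.search(template):
            found.append((name, "template"))
    return found

print(find_elements("AAGGTCAATCTAA"))  # CAAT box on the coding strand
print(find_elements("TTCCGCCCTT"))     # GC box on the template strand
```

A search restricted to the coding strand alone would miss the second example, which is the situation the text describes for the bacterial −35 region.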

The Transcription Factor IID Protein Complex Initiates the Assembly of the Active Transcription Complex

Figure 37.6: Transcription initiation. Transcription factors TFIIA, B, D, E, F, and H are essential in initiating transcription by RNA polymerase II.

Cis-acting elements constitute only part of the puzzle of eukaryotic gene expression. Transcription factors that bind to these elements also are required. For example, RNA polymerase II is guided to the start site by a set of transcription factors known collectively as TFII (TF stands for transcription factor, and II refers to RNA polymerase II). Individual TFII factors are called TFIIA, TFIIB, and so on. Initiation begins with the binding of TFIID to the TATA box (Figure 37.6). These general transcription factors are found in all eukaryotes, suggesting that the fundamentals of transcription are conserved in higher organisms.

In TATA-box promoters, the key initial event is the recognition of the TATA box by the TATA-box-binding protein (TBP), a 30-kDa component of the 700-kDa TFIID complex. TBP is a saddle-shaped protein consisting of two similar domains (Figure 37.7). The TATA box binds to the concave surface of TBP, and this binding induces large conformational changes in the bound DNA. The double helix is substantially unwound to widen its minor groove, enabling it to make extensive contact with the concave side of TBP through hydrophobic interactions.

Figure 37.7: The complex formed by the TATA-box-binding protein and DNA. The saddlelike structure of the protein sits atop a DNA fragment. Notice that the DNA is significantly unwound and bent.

TBP bound to the TATA box nucleates the formation of the preinitiation complex (PIC) (Figure 37.6). The surface of the TBP saddle provides docking sites for the binding of other components. TFIIA and TFIIB bind next, stabilizing the binding of TBP to the DNA. With the arrival of TFIIB, the complex recruits RNA polymerase II, which is escorted to the promoter site by TFIIF. TFIIE then joins the complex, and binding by TFIIH completes the formation of the PIC.

TFIIH has two essential catalytic activities. First, it is an ATP-dependent helicase that unwinds the DNA as a prelude to transcription. Second, it is a kinase that phosphorylates the carboxyl-terminal domain (CTD) of the polymerase. During the formation of the PIC, the CTD is unphosphorylated and binds to other proteins that facilitate transcription. Phosphorylation of the CTD by TFIIH marks the transition from initiation to elongation. The phosphorylated CTD stabilizes transcription elongation by RNA polymerase II and recruits RNA-processing enzymes that act in the course of elongation (Chapter 38). Most of the factors are released before the polymerase leaves the promoter and can then participate in another round of initiation.
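The ordered assembly of the PIC and the switch to elongation can be summarized as a simple checklist. The following Python sketch is a bookkeeping model only, not a simulation of the biochemistry; the class name, step labels, and phase names are hypothetical, and the step list is a simplified subset of the events described above.

```python
# Simplified, ordered subset of the assembly events described in the text.
PIC_STEPS = [
    "TFIID (via TBP) binds the TATA box",
    "TFIIA and TFIIB bind and stabilize TBP on the DNA",
    "RNA polymerase II is recruited",
    "TFIIH binds, completing the PIC",
]

class Promoter:
    """Tracks PIC assembly and the phosphorylation state of the CTD."""

    def __init__(self):
        self.completed = []
        self.ctd_phosphorylated = False

    @property
    def pic_complete(self):
        return len(self.completed) == len(PIC_STEPS)

    def add_step(self, step):
        """Accept a step only if it is the next one in the assembly order."""
        if self.pic_complete:
            raise ValueError("The PIC is already complete")
        expected = PIC_STEPS[len(self.completed)]
        if step != expected:
            raise ValueError(f"Expected {expected!r} before {step!r}")
        self.completed.append(step)

    def phosphorylate_ctd(self):
        """Model the TFIIH kinase step, which requires a complete PIC."""
        if not self.pic_complete:
            raise ValueError("TFIIH is not yet part of the complex")
        self.ctd_phosphorylated = True

    @property
    def phase(self):
        if self.ctd_phosphorylated:
            return "elongation"
        return "initiation" if self.pic_complete else "assembly"

promoter = Promoter()
for step in PIC_STEPS:
    promoter.add_step(step)
print(promoter.phase)   # PIC complete, CTD unphosphorylated
promoter.phosphorylate_ctd()
print(promoter.phase)   # CTD phosphorylated
```

The model enforces two points from the text: the factors join in a defined order, and the polymerase moves from initiation to elongation only after TFIIH has phosphorylated the CTD.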

Enhancer Sequences Can Stimulate Transcription at Start Sites Thousands of Bases Away

The activities of many promoters in higher eukaryotes are greatly increased by another type of cis-acting element called an enhancer. Enhancer sequences have no promoter activity of their own yet can exert their stimulatory actions over distances of several thousand base pairs. They can be upstream, downstream, or even in the midst of a transcribed gene. Moreover, enhancers are effective when present on either DNA strand. Like promoter sequences, enhancers are bound by proteins called transcription activators that participate in the regulation of transcription.

QUICK QUIZ 1

Differentiate between a promoter and an enhancer.

CLINICAL INSIGHT: Inappropriate Enhancer Use May Cause Cancer

Enhancer sequences are important in establishing the tissue specificity of gene expression because a particular enhancer is effective only in certain cells. For example, the immunoglobulin enhancer functions in B lymphocytes but not elsewhere. Cancer can result if the relation between genes and enhancers is disrupted. In Burkitt lymphoma and B-cell leukemia, a chromosomal translocation brings the proto-oncogene myc (which encodes a transcription factor) under the control of a powerful immunoglobulin enhancer. The consequent dysregulation of the myc gene is believed to play a role in the progression of the cancer.

Multiple Transcription Factors Interact with Eukaryotic Promoters and Enhancers

The basal transcription complex described above initiates transcription at a low frequency. Additional transcription factors that bind to other sites are required to achieve a high rate of mRNA synthesis and to selectively stimulate specific genes. Indeed, many transcription factors have been isolated. These transcription factors are often expressed in a tissue-specific manner.

Figure 37.8: Mediator. A large complex of protein subunits, mediator acts as a bridge between transcription factors bearing activation domains and RNA polymerase II. These interactions help recruit and stabilize RNA polymerase II near specific genes that are then transcribed.

In contrast with bacterial transcription factors, few eukaryotic transcription factors have any effect on transcription on their own. Instead, each factor recruits other proteins to build up large complexes that interact with the transcriptional machinery to activate or repress transcription. These intermediary proteins act as a bridge between the transcription factors and the polymerase. An important target intermediary for many transcription factors is mediator, a huge complex of 25 to 30 subunits, with a mass of more than 1 MDa, that joins the transcription machinery before initiation takes place. Mediator acts as a bridge between enhancer-bound activators, other transcription factors, and promoter-bound RNA polymerase II (Figure 37.8).

DID YOU KNOW?

Pluripotent cells are stem cells that can develop into any type of fetal or adult cell. Totipotent stem cells not only can develop into any fetal or adult cell type but also can develop into extraembryonic tissues and thus grow into an entire organism.

Transcription factors and other proteins that bind to regulatory sites on DNA can be regarded as passwords that cooperatively open multiple locks, giving RNA polymerase access to specific genes. A major advantage of this mode of regulation is that a given regulatory protein can have different effects, depending on what other proteins are present in the same cell. This phenomenon, called combinatorial control, is crucial to multicellular organisms that have many different cell types. A comparison between human beings and the simple roundworm Caenorhabditis elegans provides an example of the power of combinatorial control. Human beings have only one-third more genes than the worm, but we have many more cell types. This dramatic increase in complexity without a corresponding increase in gene number may be due to the more sophisticated regulation allowed by combinatorial control.
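Combinatorial control can be sketched with a toy model in which a gene is transcribed only when every factor it requires is present in the cell. The gene names, factor names, and cell types below are invented purely for illustration, and the all-or-nothing rule is a deliberate simplification.

```python
# Hypothetical regulatory requirements: the set of factors each gene needs.
GENE_REQUIREMENTS = {
    "gene_1": {"factor_A", "factor_B"},
    "gene_2": {"factor_A", "factor_C"},
    "gene_3": {"factor_B", "factor_C"},
}

# Hypothetical cell types, each containing a different subset of factors.
CELL_TYPES = {
    "cell_type_x": {"factor_A", "factor_B"},
    "cell_type_y": {"factor_A", "factor_C"},
    "cell_type_z": {"factor_A", "factor_B", "factor_C"},
}

def expressed_genes(factors_present):
    """Return the genes whose required factors are all present."""
    return sorted(
        gene
        for gene, required in GENE_REQUIREMENTS.items()
        if required <= factors_present  # subset test
    )

for cell_type, factors in CELL_TYPES.items():
    print(cell_type, expressed_genes(factors))
```

In this model, factor_A contributes to the expression of gene_1 in one cell type and gene_2 in another, depending on its partners. Because n factors can be present or absent in 2**n combinations, a modest number of factors can in principle specify many distinct patterns of gene expression.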

CLINICAL INSIGHT: Induced Pluripotent Stem Cells Can Be Generated by Introducing Four Transcription Factors into Differentiated Cells

An important application illustrating the power of transcription factors is the development of induced pluripotent stem (iPS) cells. Pluripotent stem cells, which can be derived from embryos, have the ability to differentiate into many different cell types with appropriate treatment. Experiments have identified just four genes, encoding transcription factors, that can induce pluripotency in already-differentiated skin cells. When these four genes are introduced into skin cells called fibroblasts, the fibroblasts de-differentiate into cells that appear to have characteristics nearly identical to those of embryonic stem cells.

These iPS cells are powerful new research tools and, potentially, a new class of therapeutic agents. Ideally, a sample of a patient’s fibroblasts could be readily isolated and converted into iPS cells. These iPS cells could then be treated to differentiate into a desired cell type that could be transplanted into the patient, repairing tissue damage. Although the field of iPS-cell research is still in its very early stages, it holds great promise as a possible approach to treatment for many common and difficult-to-treat diseases.