The expression of eukaryotic genes is modulated by combinations of transcription factors, and when some of these factors are common to the regulation of multiple genes, the regulation is called combinatorial control. We learned in Chapter 20 that different bacterial genes driving sugar metabolism use a common transcription activator, cAMP receptor protein (CRP). CRP is employed in regulation of the operons involved in the metabolism of lactose and galactose, as well as other sugars. This is an example of combinatorial control.
Eukaryotes make much more extensive use of combinatorial control than do bacteria. As we have seen, eukaryotes generally require many regulatory proteins at any given promoter, increasing the combinatorial possibilities severalfold. Indeed, analysis of genome sequences reveals the use of greater numbers of transcription factors as genome size and complexity increase. For example, yeast are thought to use about 300 transcription factors, Caenorhabditis elegans and D. melanogaster more than 1,000, and humans more than 3,000. Although the number of transcription factors increases with the number of genes, there are still many fewer factors than there are genes to be regulated. Different genes must therefore use the same transcription factors, but in different ways, to achieve activation. Given the increasing complexity of promoter sequences in more complex genomes and the greater number of transcription factors, combinatorial control allows higher eukaryotes to achieve exquisite specificity in gene regulation.
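To see why a limited pool of factors can still specify many distinct regulatory inputs, the arithmetic can be sketched in a few lines of Python. This is only an illustration: it assumes that any unordered set of factors could in principle act together at a promoter, which overstates what real cells do, and it treats the "more than" figures quoted above as lower bounds. The function name and dictionary labels are made up for this example.

```python
from math import comb

# Approximate transcription factor counts quoted in the text.
# The fly/worm and human values are lower bounds ("more than ...").
FACTOR_COUNTS = {
    "yeast": 300,
    "C. elegans / D. melanogaster": 1000,
    "human": 3000,
}


def possible_factor_sets(n_factors: int, set_size: int) -> int:
    """Count the unordered sets of `set_size` factors drawn from `n_factors`.

    Illustrative assumption: every set is allowed, which real biology
    does not guarantee.
    """
    return comb(n_factors, set_size)


for organism, n in FACTOR_COUNTS.items():
    for k in (1, 2, 3):
        sets = possible_factor_sets(n, k)
        print(f"{organism}: {k} factor(s) per promoter -> {sets:,} possible sets")
```

Even with only two or three factors per promoter, the number of possible sets quickly exceeds the number of factors themselves, which is the sense in which combinations, rather than individual factors, carry the regulatory specificity.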
We begin with the relatively simple combinatorial control system that regulates the yeast GAL genes, driving the metabolism of galactose. The mechanism behind galactose metabolism is one of the best-
The enzymes required for importing and metabolizing galactose in yeast are encoded by GAL genes scattered over several chromosomes. Yeast cells have no operons like those in bacteria, and each of the GAL genes is transcribed separately. However, all the GAL genes have similar promoters and are regulated coordinately by a common set of proteins. The promoters for the GAL genes consist of the TATA box and an upstream activator sequence, which for each GAL gene is composed of one or more sequences denoted UASGAL. Each UASGAL site is recognized by a DNA-
Like the bacterial lac operon, the yeast GAL genes require more than just one protein (Gal4p) for activation. Control of gene expression by galactose depends on three proteins: the transcription activator Gal4p, the inhibitor Gal80p, and the ligand sensor Gal3p (Figure 21-9). Gal4p binds the 17mer UASGAL sites and, left to its own devices, would activate gene expression at GAL promoters. However, at low galactose concentrations, Gal80p binds to Gal4p and blocks its transcription-
Glucose is the preferred carbon source for yeast, as it is for bacteria. When glucose is present, most of the GAL genes are repressed—
Regulatory DNA sequences, such as the binding site for Gal4p in yeast, can be identified by sequence comparisons of genes that code for proteins of the same metabolic pathway. The Gal4p-
We now know that the GAL genes are activated by the protein Gal4p, which recognizes UASGAL. Early experiments demonstrated that Gal4p binds UASGAL and functions as a transcription activator. Genetic studies revealed that a single gene, when mutated, results in loss of activation of all GAL genes. These results suggested that this single gene, GAL4, was a master regulator, much like the bacterial CRP protein. GAL4 was isolated by transforming a yeast genomic library into GAL4-mutant cells and selecting for colonies in which the GAL genes were again activated in the presence of galactose. GAL4 was then cloned into an E. coli expression vector, and Gal4p was purified (see Chapter 7 for these cloning methods).
The technique of deletion analysis revealed the modular architecture of Gal4p, a structure now known to be common among many bacterial and eukaryotic transcription activators. In deletion analysis, nucleases or restriction enzymes are used to selectively delete pieces of DNA from a specified gene. The truncated protein product of this gene can be purified and tested for activity in vitro, or tested for function in vivo using a reporter assay. Studies such as these were performed with deletion constructs of Gal4p. DNA binding of the truncated proteins was measured in vitro with electrophoretic mobility shift assays, and the ability of the truncated proteins to activate transcription was tested in vivo with a reporter gene assay. In the reporter assay, deletion constructs of GAL4 were transferred into GAL4-mutant yeast cells containing a plasmid with the bacterial lacZ reporter gene, driven by a typical GAL promoter with a UASGAL sequence (Figure 3a). The ability of each Gal4p-
The in vitro DNA-
The findings suggested that the two activities inherent in Gal4p require 260 or fewer residues: 74 at the N-
Clearly, the ability of Gal4p to activate transcription is the result of two distinct and separable domains. Similar results were obtained with other transcription activators from several different eukaryotes. Furthermore, examination of some transcription activators showed that the region between the two functional domains is highly sensitive to proteases, suggesting that the two domains are linked by sections of polypeptide that are open and flexible. These experiments gave rise to a model for some transcription activators, with two functional domains joined by a flexible linker (Figure 3c, d). The flexible region may help loosen the geometric constraints imposed by the DNA loop that forms between the transcription activator at an upstream binding site and the proteins it binds at the distant promoter. That the DNA-
Saccharomyces cerevisiae (baker’s yeast) can grow as either diploid or haploid cells, both of which reproduce by mitosis (see the Model Organisms Appendix). The diploid cells contain two copies of each of the 16 yeast chromosomes, and haploid cells contain one copy of each. When stressed by starvation, diploid cells can undergo meiosis to produce four haploid spores, two each of the mating types a and α. Haploid cells of the a mating type (a cells) can mate only with α haploids (α cells), and vice versa; thus, haploid cells display a simple sexual differentiation that is readily distinguishable when tested for mating ability.
Mating type is determined by the allele present at a single genetic locus, MAT. The identity of the allele at the MAT locus can switch as often as every cell division cycle. The mating-
The transcriptional activation and repression of genes in each mating type is an example of combinatorial control, because control is achieved by combinations of regulators, at least one of which is common to the different cell types. In addition to the presence or absence of the a1, α1, and α2 proteins, specific activation and repression also involve Mcm1, which is expressed by both haploid cell types as well as by diploid cells. In a cells, Mcm1 binds the promoters of a-specific genes and activates transcription. The genes specific to α cells are turned off in a cells, because the α1 activator is not present (see Figure 21-11a). In α cells, Mcm1 and α1 interact to activate α-specific gene transcription, while α2 (in association with Mcm1) represses transcription of a-specific genes (see Figure 21-11b).
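The haploid rules in this paragraph can be written out as a small Boolean sketch. It is a simplification that covers only the two haploid cell types and only the a-specific and α-specific gene sets; the names `REGULATORS` and `gene_set_states` are hypothetical, and each regulator is treated as simply present or absent.

```python
# Regulators present in each haploid cell type.
# Mcm1 is common to both; alpha1 and alpha2 are made only in alpha cells.
# a1 is listed for completeness but does not enter the haploid rules below.
REGULATORS = {
    "a": {"Mcm1", "a1"},
    "alpha": {"Mcm1", "alpha1", "alpha2"},
}


def gene_set_states(cell_type: str) -> dict:
    """Return which haploid gene sets are transcribed in the given cell type."""
    present = REGULATORS[cell_type]

    # a-specific genes: activated by Mcm1, unless alpha2 (with Mcm1) represses them.
    a_specific_on = "Mcm1" in present and "alpha2" not in present

    # alpha-specific genes: require Mcm1 and alpha1 acting together.
    alpha_specific_on = {"Mcm1", "alpha1"} <= present

    return {"a_specific": a_specific_on, "alpha_specific": alpha_specific_on}


for cell in ("a", "alpha"):
    print(cell, gene_set_states(cell))
```

The point of the sketch is that Mcm1 appears in every rule: the outcome differs between cell types only because of which partner proteins accompany it.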
There are also genes specific to both haploid states, but on mating to produce a diploid cell, the haploid-
Like their bacterial counterparts, most eukaryotic transcription factors bind to DNA as homodimers. However, several types of eukaryotic transcription factors can form heterodimers of two different members of a family of similarly structured proteins, creating a larger number of functional transcription factors from a smaller number of individual proteins. For example, three possible dimers can form from just two similarly structured proteins: two homodimers and one heterodimer; a hypothetical family of four different but structurally related proteins could form up to 10 different dimeric species (Figure 21-12).
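The dimer counts quoted here follow from simple counting: a family of n related proteins can form n homodimers plus n(n − 1)/2 heterodimers, or n(n + 1)/2 species in total. The short sketch below enumerates them, assuming every pairing is allowed (hence "up to" in the text); the single-letter protein names are placeholders, not real family members.

```python
from itertools import combinations_with_replacement


def dimer_species(family):
    """List every unordered pair of family members, homodimers included.

    Illustrative assumption: every pair can dimerize.
    """
    return list(combinations_with_replacement(family, 2))


def dimer_count(n: int) -> int:
    """Closed-form count: n homodimers + n(n - 1)/2 heterodimers."""
    return n * (n + 1) // 2


for family in (["A", "B"], ["A", "B", "C", "D"]):
    species = dimer_species(family)
    print(f"{len(family)} proteins -> {len(species)} dimers: {species}")
```

If some pairings cannot form in a real protein family, the true number of species is smaller than this upper bound.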
An example of proteins that behave in this fashion is the mammalian AP-
The protein-
This differential DNA binding, depending on the composition of the AP-
A more complicated example of combinatorial control can be seen in body plan development in the fruit fly, D. melanogaster. Before it is released to become fertilized, the developing oocyte is surrounded by cells called nurse cells. The nurse cells secrete mRNAs encoding various transcription factors into the egg at specific locations, establishing concentration gradients of mRNA for the different transcription factors within the egg. During early embryonic development the nuclei divide quickly, producing 3,000 to 6,000 nuclei before plasma membranes form to delineate individual cells. When plasma membranes do form, the newly formed cells trap the specific mRNAs present at that particular position in the embryo. Each new cell thus produces a unique complement of transcription factors that act in a combinatorial fashion to express different proteins in the early embryo.
An example of combinatorial control by these unevenly distributed transcription factors is regulation of the eve gene, which produces a protein called even-
Expression of eve is controlled by the concentrations of four proteins translated from the original mRNAs deposited in the developing oocyte by the nurse cells. Two of these four proteins, Bicoid and Hunchback, are activators; the other two, Giant and Krüppel, are repressors. Different gradients of the mRNAs for these activators and repressors, established by the nurse cells, result in unique ratios of the four regulatory proteins in nearly every cell of the embryo. Expression of even-
The eve gene has five different enhancers, each with a complex array of binding sites for transcription activators and repressors (Figure 21-15a). Only one enhancer needs to be active for eve to be expressed in a given cell. But if eve is to be expressed normally, all five enhancers need to be active (albeit in different cells). Each enhancer is activated by a different combination of transcription factors. Some activator and repressor sites overlap and are controlled by competition, whereas some repressor sites are distinct from activator sites and repress the gene at a distance (Figure 21-15b). Seven stripes of even-
The enhancer that activates eve expression in stripe 2 has been extensively studied in Michael Levine’s laboratory. This enhancer is 500 bp long and contains binding sites for both repressors and activators (see Figure 21-15b). Both activators, Bicoid and Hunchback, must bind to their sites for gene expression to occur. Some binding sites for these activators overlap repressor-
These examples of combinatorial control of transcription illustrate a central mechanism by which eukaryotic cells govern gene expression. Through the use of a relatively small number of regulatory proteins in each case, many different genes can be regulated either in concert or differentially, depending on the immediate needs of the cell. In this way, cells can respond quickly and appropriately to changing environmental conditions or to developmental requirements, within the context of a tissue or an entire organism.
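As one concrete illustration, the eve stripe 2 enhancer discussed earlier in this section can be reduced to a Boolean sketch. Two rules come from the text: both activators (Bicoid and Hunchback) must be bound, and eve is expressed in a cell if any one of its enhancers is active. The rule that either repressor alone (Giant or Krüppel) is enough to silence the enhancer is a simplifying assumption made for this example, as is treating binding as all-or-nothing; the real enhancer integrates protein concentrations. All names in the code are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundFactors:
    """Which regulators are bound at the stripe 2 enhancer in one cell."""
    bicoid: bool
    hunchback: bool
    giant: bool
    kruppel: bool


def stripe2_enhancer_active(f: BoundFactors) -> bool:
    # From the text: both activators must be bound.
    activators_bound = f.bicoid and f.hunchback
    # Assumption for this sketch: either repressor alone blocks activation.
    repressor_bound = f.giant or f.kruppel
    return activators_bound and not repressor_bound


def eve_expressed(enhancer_states) -> bool:
    # From the text: one active enhancer is enough for expression in a cell.
    return any(enhancer_states)


cell = BoundFactors(bicoid=True, hunchback=True, giant=False, kruppel=False)
stripe2 = stripe2_enhancer_active(cell)
print("stripe 2 enhancer active:", stripe2)
print("eve expressed:", eve_expressed([False, stripe2, False, False, False]))
```

Each of the other enhancers would have its own rule built from a different combination of factors, which is how one gene can be switched on in several distinct groups of cells.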
Eukaryotic transcription activators such as Gal4p have DNA-
Eukaryotes make greater use of combinatorial control of gene expression than do bacteria. In combinatorial control, the same transcription factor is used in the regulation of more than one gene.
Combinatorial control can be achieved in a variety of ways. Some transcription factors are formed from combinations of two different subunits that form heterodimers, each of which has a different strength as an activator. Alternatively, a gene may have several enhancers, each of which uses a different combination of transcription factors.
Mating-
Body plan organization in D. melanogaster uses gradients of mRNAs for different transcription factors in the developing embryo. Different concentrations of transcription activators and repressors control where the gene eve is activated, producing seven stripes that influence cell differentiation.