12.2 Lessons from Yeast: The GAL System
To make use of extracellular galactose, yeast imports the sugar and converts it into a form of glucose that can be metabolized. Several genes—GAL1, GAL2, GAL7, and GAL10—in the yeast genome encode enzymes that catalyze steps in the biochemical pathway that converts galactose into glucose (Figure 12-5). Three additional genes—GAL3, GAL4, and GAL80—encode proteins that regulate the expression of the enzyme genes. Just as in the lac system of E. coli, the abundance of the sugar determines the level of gene expression in the biochemical pathway. In yeast cells growing in media lacking galactose, the GAL genes are largely silent. But, in the presence of galactose (and the absence of glucose), the GAL genes are induced. Just as for the lac operon, genetic and molecular analyses of mutants have been key to understanding how the expression of the genes in the galactose pathway is controlled.
Figure 12-5: The Gal pathway
Figure 12-5: Galactose is converted into glucose-1-phosphate in a series of steps. These steps are catalyzed by enzymes (Gal1, and so forth) encoded by the structural genes GAL1, GAL2, GAL7, and GAL10.
The key regulator of GAL gene expression is the Gal4 protein, a sequence-specific DNA-binding protein. Gal4 is perhaps the best-studied transcriptional regulatory protein in eukaryotes. The detailed dissection of its regulation and activity has been a source of several key insights into the control of transcription in eukaryotes.
Gal4 regulates multiple genes through upstream activation sequences
In the presence of galactose, the expression levels of the GAL1, GAL2, GAL7, and GAL10 genes are at least 1000-fold higher than in its absence. In GAL4 mutants, however, they remain silent. Each of these four genes has two or more Gal4-binding sites located at some distance 5′ (upstream) of its promoter. Consider the GAL10 and GAL1 genes, which are adjacent to each other and transcribed in opposite directions. Between the GAL1 transcription start site and the GAL10 transcription start site is a single 118-bp region that contains four Gal4-binding sites (Figure 12-6). Each Gal4-binding site is 17 base pairs long and is bound by one Gal4 protein dimer. There are two Gal4-binding sites upstream of the GAL2 gene as well, and another two upstream of the GAL7 gene. These binding sites are required for gene activation in vivo: if they are deleted, the genes are silent, even in the presence of galactose. These regulatory sequences are enhancers. The presence of enhancers located at a considerable linear distance from a eukaryotic gene’s promoter is typical. Because the Gal4-activated enhancers are located upstream (5′) of the genes they regulate, they are also called upstream activation sequences (UAS).
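To make the idea of a 17-bp binding site concrete, the following sketch scans a DNA string for candidate Gal4 sites. The pattern CGG-N11-CCG is the commonly cited Gal4 consensus, but it is not given in this section and is used here as an assumption. The function name `find_gal4_sites` and the toy sequence are invented for illustration; the sequence is not the real GAL1-GAL10 UAS.

```python
import re

# Assumed consensus (not from this section): CGG + any 11 bases + CCG = 17 bp.
# The lookahead lets overlapping candidate sites be reported as well.
GAL4_SITE = re.compile(r"(?=(CGG[ACGT]{11}CCG))")

def find_gal4_sites(seq):
    """Return (start, site) pairs for every 17-bp match to the assumed consensus."""
    seq = seq.upper()
    return [(m.start(), m.group(1)) for m in GAL4_SITE.finditer(seq)]

# Hypothetical sequence with two planted sites.
toy_uas = "AT" + "CGG" + "A" * 11 + "CCG" + "TTTT" + "CGG" + "T" * 11 + "CCG" + "GA"
sites = find_gal4_sites(toy_uas)

for start, site in sites:
    print(start, site, len(site))
```

A real search would also have to tolerate imperfect matches, because natural binding sites rarely fit a consensus exactly.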
Figure 12-6: Transcriptional activator proteins bind to UAS elements in yeast
Figure 12-6: The Gal4 protein activates target genes through upstream-activating-sequence (UAS) elements. The Gal4 protein has two functional domains: a DNA-binding domain (pink square) and an activation domain (orange oval). The protein binds as a dimer to specific sequences upstream of the promoters of Gal-pathway genes. Some of the GAL genes are adjacent (GAL1, GAL10), whereas others are on different chromosomes. The GAL1 UAS element contains four Gal4-binding sites.
Model Organism: Yeast
Saccharomyces cerevisiae, or budding yeast, has emerged in recent years as the premier eukaryotic genetic system. Humans have grown yeast for centuries because it is an essential component of beer, bread, and wine. Yeast has many features that make it an ideal model organism. As a unicellular eukaryote, it can be grown on agar plates and, with yeast’s life cycle of just 90 minutes, large quantities of it can be cultured in liquid media. It has a very compact genome with only about 12 mega-base pairs of DNA (compared with almost 3000 mega-base pairs for humans) containing approximately 6000 genes that are distributed on 16 chromosomes. It was the first eukaryote to have its genome sequenced.
The yeast life cycle makes it very versatile for laboratory studies. Cells can be grown as either diploid or haploid. In both cases, the mother cell produces a bud containing an identical daughter cell. Diploid cells either continue to grow by budding or are induced to undergo meiosis, which produces four haploid spores held together in an ascus (also called a tetrad). Haploid spores of opposite mating type (a or α) will fuse and form a diploid. Spores of the same mating type will continue growth by budding.
The life cycle of baker’s yeast. The nuclear alleles MATa and MATα determine mating type.
Yeast has been called the E. coli of eukaryotes because of the ease of forward and reverse mutant analysis. To isolate mutants by using a forward genetic approach, haploid cells are mutagenized (with X rays, for example) and screened on plates for mutant phenotypes. This procedure is usually done by first plating cells on a rich medium on which all cells grow and by copying, or replica plating, the colonies from this master plate onto replica plates containing selective media or special growth conditions. (See also Chapter 16.) For example, temperature-sensitive mutants will grow on the master plate at the permissive temperature but not on a replica plate at a restrictive temperature. Comparison of the colonies on the master and replica plates will reveal the temperature-sensitive mutants. Using reverse genetics, scientists can also replace any yeast gene (of known or unknown function) with a mutant version (synthesized in a test tube) to understand the nature of the gene product.
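The comparison of master and replica plates amounts to a set difference: the colonies of interest grow on the master plate but not on the replica. A minimal sketch, with hypothetical colony labels and an illustrative function name:

```python
def temperature_sensitive(master_growth, replica_growth):
    """Return colonies that grow on the master plate but not on the replica plate."""
    return sorted(set(master_growth) - set(replica_growth))

# Hypothetical colony IDs scored as growing on each plate.
master_permissive = ["c1", "c2", "c3", "c4", "c5"]   # rich medium, permissive temperature
replica_restrictive = ["c1", "c3", "c4"]             # same colonies, restrictive temperature

ts_mutants = temperature_sensitive(master_permissive, replica_restrictive)
print(ts_mutants)
```

The same comparison applies to any selective condition on the replica plate, not only temperature.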
Electron micrograph of budding yeast cells.
[SciMAT/Science Source.]
KEY CONCEPT
The binding of sequence-specific DNA-binding proteins to regions outside the promoters of target genes is a common feature of eukaryotic transcriptional regulation.
The Gal4 protein has separable DNA-binding and activation domains
After Gal4 is bound to the UAS element, how is gene expression induced? A distinct domain of the Gal4 protein, the activation domain, is required for regulatory activity. Thus, the Gal4 protein has at least two domains: one for DNA binding and another for activating transcription. A similar modular organization has been found to be a common feature of other DNA-binding transcription factors as well.
Figure 12-7: Transcriptional activator proteins are modular
Figure 12-7: Transcriptional activator proteins have multiple, separable domains. (a) The Gal4 protein has two domains and forms a dimer. (b) The experimental removal of the activation domain shows that DNA binding is not sufficient for gene activation. (c) Similarly, the bacterial LexA protein cannot activate transcription on its own, but, when fused to the Gal4 activation domain (d), it can activate transcription through LexA-binding sites.
The modular organization of the Gal4 protein was demonstrated in a series of simple, elegant experiments. The strategy was to test the DNA binding and gene activation of mutant forms of the protein in which parts had been either deleted or fused to other proteins. By this means, investigators could determine whether a part of the protein was necessary for a particular function. To carry out these studies, experimenters needed a simple means of assaying the expression of the enzymes encoded by the GAL genes.
The expression of GAL genes and other targets of transcription factors is typically monitored by using a reporter gene whose level of expression is easily measured. In reporter-gene constructs, the reporter gene is linked to the regulatory sequences that govern the expression of the gene being investigated, so the expression of the reporter reflects the activity of that regulatory element. Often, the reporter gene is the lacZ gene of E. coli, which is an effective reporter because the activity of its product, β-galactosidase, is easily measured. Another common reporter gene is the gene that encodes the green fluorescent protein (GFP) of jellyfish. As its name suggests, the concentration of this reporter protein is easily measured by the amount of light that it emits. To investigate the control of GAL gene expression, the coding region of one of these reporter genes and a promoter are placed downstream of a UAS element from a GAL gene. Reporter expression is then a readout of Gal4 activity in cells (Figure 12-7a).
Let’s see what happens when a form of the Gal4 protein lacking the activation domain is expressed in yeast. In this case, the binding sites of the UAS element are occupied, but no transcription is stimulated (Figure 12-7b). The same is true when other regulatory proteins lacking activation domains, such as the bacterial repressor LexA, are expressed in cells bearing reporter genes with their respective binding sites. The more interesting result is obtained when the activation domain of the Gal4 protein is grafted to the DNA-binding domain of the LexA protein; the hybrid protein now activates transcription from LexA binding sites (Figure 12-7d). Further “domain-swap” experiments have revealed that the transcriptional activation function of the Gal4 protein resides in two small regions about 50 to 100 amino acids in length. These two regions form a separable activation domain that helps recruit the transcriptional machinery to the promoter, as we will see later in this section. This highly modular arrangement of activity-regulating domains is found in many transcription factors.
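The logic of these deletion and domain-swap experiments can be summarized in a small model. It treats each regulator as two independent parts, a DNA-binding specificity and the presence or absence of an activation domain. This is a simplified sketch, not a description of how the experiments were done; the class and function names and the site labels are made up for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Regulator:
    name: str
    binds: str               # binding site recognized by the DNA-binding domain
    activation_domain: bool  # True if a Gal4-type activation domain is present

def site_occupied(protein, site):
    """DNA binding depends only on the DNA-binding domain."""
    return protein.binds == site

def reporter_on(protein, site):
    """The reporter is activated only if the site is bound AND an activation domain is present."""
    return site_occupied(protein, site) and protein.activation_domain

gal4 = Regulator("Gal4", "UAS", True)
gal4_no_ad = Regulator("Gal4 lacking activation domain", "UAS", False)
lexa = Regulator("LexA", "LexA site", False)
lexa_gal4_ad = Regulator("LexA DNA-binding domain + Gal4 activation domain", "LexA site", True)

for protein, site in [(gal4, "UAS"), (gal4_no_ad, "UAS"),
                      (lexa, "LexA site"), (lexa_gal4_ad, "LexA site")]:
    print(protein.name, "| bound:", site_occupied(protein, site),
          "| reporter on:", reporter_on(protein, site))
```

The hybrid protein activates the reporter only through LexA-binding sites, because the DNA-binding domain, not the activation domain, determines where the protein acts.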
KEY CONCEPT
Many eukaryotic transcriptional regulatory proteins are modular proteins, having separable domains for DNA binding, activation or repression, and interaction with other proteins.
Gal4 activity is physiologically regulated
Figure 12-8: Transcriptional activator proteins may be activated by an inducer
Figure 12-8: Gal4 activity is regulated by the Gal80 protein. (Top) In the absence of galactose, the Gal4 protein is inactive, even though it can bind to sites upstream of the GAL1 target gene. Gal4 activity is suppressed by the binding of the Gal80 protein. (Bottom) In the presence of galactose and the Gal3 protein, Gal80 undergoes a conformational change and is released, allowing the Gal4 activation domain to activate target gene transcription.
How does Gal4 become active in the presence of galactose? Key clues came from analyzing mutations in the GAL80 and GAL3 genes. In GAL80 mutants, the GAL structural genes are active even in the absence of galactose. This result suggests that the normal function of the Gal80 protein is to somehow inhibit GAL gene expression. Conversely, in GAL3 mutants, the GAL structural genes are not active in the presence of galactose, suggesting that Gal3 normally promotes expression of the GAL genes.
Extensive biochemical analyses have revealed that the Gal80 protein binds to the Gal4 protein with high affinity and directly inhibits Gal4 activity. Specifically, Gal80 binds to a region within one of the Gal4 activation domains, blocking its ability to promote the transcription of target genes. The Gal80 protein is expressed continuously, so it is always acting to repress transcription of the GAL structural genes unless stopped. The role of the Gal3 protein is to release the GAL structural genes from their repression by Gal80 when galactose is present.
Gal3 is thus both a sensor and inducer. When Gal3 binds galactose and ATP, it undergoes an allosteric change that promotes binding to Gal80, which in turn causes Gal80 to release Gal4, which is then able to interact with other transcription factors and RNA pol II to activate transcription of its target genes. Thus, Gal3, Gal80, and Gal4 are all part of a switch whose state is determined by the presence or absence of galactose (Figure 12-8). In this switch, DNA binding by the transcriptional regulator is not the physiologically regulated step (as is the case in the lac operon and bacteriophage λ); rather, the activity of the activation domain is regulated.
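The switch can be written as a short Boolean model, which also reproduces the mutant phenotypes described above. This is a deliberately simplified sketch: it assumes that ATP is available, ignores the effect of glucose, and treats each protein as simply present or absent. The function and parameter names are illustrative.

```python
def gal_genes_on(galactose, gal3=True, gal80=True, gal4=True):
    """Boolean sketch of the Gal3/Gal80/Gal4 switch (True = functional protein present)."""
    if not gal4:
        return False                      # no activator, so no expression
    gal3_active = gal3 and galactose      # Gal3 acts only when bound to galactose
    gal80_inhibits = gal80 and not gal3_active
    return not gal80_inhibits             # Gal4 activates once it is free of Gal80

for label, kwargs in [("wild type", {}),
                      ("GAL80 mutant", {"gal80": False}),
                      ("GAL3 mutant", {"gal3": False}),
                      ("GAL4 mutant", {"gal4": False})]:
    print(label,
          "| no galactose:", gal_genes_on(False, **kwargs),
          "| galactose:", gal_genes_on(True, **kwargs))
```

Note that Gal4 DNA binding does not appear as a variable: in this model, as in the cell, Gal4 is assumed to occupy its sites in both states, and only its activating function is switched.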
KEY CONCEPT
The activity of eukaryotic transcriptional regulatory proteins is often controlled by interactions with other proteins.
Gal4 functions in most eukaryotes
In addition to its action in yeast cells, Gal4 has been shown to be able to activate transcription in insect cells, human cells, and many other eukaryotic species. This versatility suggests that biochemical machinery and mechanisms of gene activation are common to a broad array of eukaryotes and that features revealed in yeast are generally present in other eukaryotes and vice versa. Furthermore, because of their versatility, Gal4 and its UAS elements have become favored tools in genetic analysis for manipulating gene expression and function in a wide variety of model systems.
KEY CONCEPT
The ability of Gal4, as well as other eukaryotic regulators, to function in a variety of eukaryotes indicates that eukaryotes generally have the transcriptional regulatory machinery and mechanisms in common.
Now we look at how activators and other regulatory proteins interact with the transcriptional machinery to control gene expression.
Activators recruit the transcriptional machinery
In bacteria, activators commonly stimulate transcription by interacting directly with DNA and with RNA polymerase. In eukaryotes, activators generally work indirectly. Eukaryotic activators recruit RNA polymerase II to gene promoters through two major mechanisms. First, activators can interact with subunits of the protein complexes having roles in transcription initiation and then recruit them to the promoter. Second, activators can recruit proteins that modify chromatin structure, allowing RNA polymerase II and other proteins access to the DNA. Many activators, including Gal4, have both activities. We’ll examine the recruitment of parts of the transcriptional initiation complex first.
Figure 12-9: Transcriptional activator proteins recruit the transcriptional machinery
Figure 12-9: Gal4 recruits the transcriptional machinery. The Gal4 protein, and many other transcriptional activators, binds to multiple protein complexes, including the TFIID and mediator complexes shown here (dotted arrows), that recruit RNA polymerase II to gene promoters. The interactions facilitate gene activation through binding sites that are distant from gene promoters.
Recall from Chapter 8 that the eukaryotic transcriptional machinery contains many proteins that are parts of various subcomplexes within the transcriptional apparatus that is assembled on gene promoters. One subcomplex, transcription factor IID (TFIID), binds to the TATA box of eukaryotic promoters through the TATA-binding protein (TBP; see Figure 8-12). One way that Gal4 works to activate gene expression is by binding to TBP at a site in its activation domain. Through this binding interaction, it recruits the TFIID complex and, in turn, RNA polymerase II to the promoter (Figure 12-9). The strength of this interaction between Gal4 and TBP correlates well with Gal4’s potency as an activator.
A second way that Gal4 works to activate gene expression is by interacting with the mediator complex, a large multiprotein complex that, in turn, directly interacts with RNA polymerase II to recruit it to gene promoters. The mediator complex is an example of a co-activator, a term applied to a protein or protein complex that facilitates gene activation by a transcription factor but that itself is neither part of the transcriptional machinery nor a DNA-binding protein.
The ability of transcription factors to bind to upstream DNA sequences and to interact with proteins that bind directly or indirectly to promoters helps to explain how transcription can be stimulated from more distant regulatory sequences (see Figure 12-9).
KEY CONCEPT
Eukaryotic transcriptional activators often work by recruiting parts of the transcriptional machinery to gene promoters.
The control of yeast mating type: combinatorial interactions
Thus far, we have focused in this chapter on the regulation of single genes or a few genes in one pathway. In multicellular organisms, distinct cell types differ in the expression of hundreds of genes. The expression or repression of sets of genes must therefore be coordinated in the making of particular cell types. One of the best-understood examples of cell-type regulation in eukaryotes is the regulation of mating type in yeast. This regulatory system has been dissected by an elegant combination of genetics, molecular biology, and biochemistry. Mating type serves as an excellent model for understanding the logic of gene regulation in multicellular animals.
The yeast Saccharomyces cerevisiae can exist in any of three different cell types known as a, α, and a/α. The two cell types a and α are haploid and contain only one copy of each chromosome. The a/α cell is diploid and contains two copies of each chromosome. Although the two haploid cell types cannot be distinguished by their appearance in the microscope, they can be differentiated by a number of specific cellular characteristics, principally their mating type (see the Model Organism box). An α cell mates only with an a cell, and an a cell mates only with an α cell. An α cell secretes an oligopeptide pheromone, or sex hormone, called α factor that arrests a cells in the cell cycle. Similarly, an a cell secretes a pheromone, called a factor, that arrests α cells. Cell arrest of both participants is necessary for successful mating. The diploid a/α cell does not mate, is larger than the α and a cells, and does not respond to the mating hormones.
Genetic analysis of mutants defective in mating has shown that cell type is controlled by a single genetic locus, the mating-type locus, MAT. There are two alleles of the MAT locus: haploid a cells have the MATa allele, and the haploid α cells have the MATα allele. The a/α diploid has both alleles. Although mating type is under genetic control, certain strains switch their mating type, sometimes as frequently as every cell division. We will examine the basis of switching later in this chapter, but first, let’s see how each cell type expresses the right set of genes. We will see that different combinations of DNA-binding proteins regulate the expression of sets of genes specific to different cell types.
How does the MAT locus control cell type? Genetic analyses of mutants that cannot mate have identified a number of structural genes that are separate from the MAT locus but whose protein products are required for mating. One group of structural genes is expressed only in the α cell type (α-specific genes), and another set is expressed only in the a cell type (a-specific genes). The MAT locus controls which of these sets of structural genes is expressed in each cell type. The MATa allele causes the structural genes of the a-type cell to be expressed, whereas the MATα allele causes the structural genes of the α-type cell to be expressed. These two alleles activate different sets of genes because they encode different regulatory proteins. In addition, a regulatory protein not encoded by the MAT locus, called MCM1, plays a key role in regulating cell type.
The simplest case is the a cell type (Figure 12-10a). The MATa allele encodes a single regulatory protein, a1. However, a1 has no effect in haploid cells, only in diploid cells. In a haploid a cell, the regulatory protein MCM1 turns on the expression of the structural genes needed by an a cell, by binding to regulatory sequences within promoters for a-specific genes.
Figure 12-10: Combinations of regulatory proteins control cell types
Figure 12-10: Control of cell-type-specific gene expression in yeast. The three cell types of S. cerevisiae are determined by the regulatory proteins a1, α1, and α2, which regulate different subsets of target genes. The MCM1 protein acts in all three cell types and interacts with α1 and α2.
In an α cell, the α-specific structural genes must be transcribed, but, in addition, the MCM1 protein must be prevented from activating the a-specific genes. The DNA sequence of the MATα allele encodes two proteins, α1 and α2, that are produced by separate transcription units. These two proteins have different regulatory roles in the cell, as can be demonstrated by analyzing their DNA-binding properties in vitro (Figure 12-10b). The α1 protein is an activator of α-specific gene expression. It binds in concert with the MCM1 protein to a discrete DNA sequence controlling several α-specific genes. The α2 protein represses transcription of the a-specific genes. It binds as a dimer, with MCM1, to sites in DNA sequences located upstream of a group of a-specific genes and acts as a repressor.
In a diploid yeast cell, the regulatory proteins encoded by both MAT alleles are expressed (Figure 12-10c). What is the result? All the structural genes involved in cell mating are shut down, as are a separate set of genes, called haploid specific, that are expressed in haploid cells but not diploid cells. How does this happen? The a1 protein encoded by MATa has a part to play at last. The a1 protein can bind to some of the α2 protein present and alter its binding specificity such that the a1–α2 complex does not bind to a-specific genes. Rather, the a1–α2 complex binds to a different sequence found upstream of the haploid-specific genes. In diploid cells, then, the α2 protein exists in two forms: (1) as an α2–MCM1 complex that represses a-specific genes and (2) in a complex with the a1 protein that represses haploid-specific genes. Moreover, the a1–α2 complex also represses expression of the α1 gene, so the α1 protein is no longer present to turn on α-specific genes. The different binding partners determine which specific DNA sequences are bound and which genes are regulated by each α2-containing complex. The regulation of different sets of target genes by the association of the same transcription factor with different binding partners plays a major role in the generation of different patterns of gene expression in different cell types within multicellular eukaryotes.
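The combinatorial logic for the three cell types can be condensed into a few Boolean rules. The sketch below is a simplification under stated assumptions: MCM1 is taken to be present in every cell type, each gene set is either on or off, and haploid-specific genes are treated as on unless repressed by the a1–α2 complex. The function name and dictionary keys are illustrative.

```python
def cell_type_expression(mat_alleles):
    """Return which gene sets are expressed, given the MAT alleles present in the cell."""
    alleles = set(mat_alleles)
    a1 = "MATa" in alleles                 # a1 is encoded by MATa
    alpha2 = "MATalpha" in alleles         # alpha2 is encoded by MATalpha
    a1_alpha2 = a1 and alpha2              # complex forms only when both are present
    # alpha1 is encoded by MATalpha, but its gene is repressed by the a1-alpha2 complex
    alpha1 = "MATalpha" in alleles and not a1_alpha2
    mcm1 = True                            # assumed present in all three cell types
    return {
        "a_specific": mcm1 and not alpha2,     # MCM1 activates unless alpha2 represses
        "alpha_specific": mcm1 and alpha1,     # requires alpha1 acting together with MCM1
        "haploid_specific": not a1_alpha2,     # repressed by the a1-alpha2 complex
    }

for name, alleles in [("a", ["MATa"]),
                      ("alpha", ["MATalpha"]),
                      ("a/alpha", ["MATa", "MATalpha"])]:
    print(name, cell_type_expression(alleles))
```

In this model the α2 protein contributes to two different outcomes depending on its partner, which is the central point of the combinatorial scheme.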
KEY CONCEPT
In yeast and in multicellular eukaryotes, cell-type-specific patterns of gene expression are governed by combinations of interacting transcription factors.