21.3 TRANSCRIPTIONAL REGULATION MECHANISMS UNIQUE TO EUKARYOTES

Gene regulation is necessarily more complex in eukaryotes than in bacteria, as a consequence of some of the key differences between these two domains of organisms. The larger eukaryotic genomes entail more nonspecific DNA binding and more genes to regulate. And the multicellular nature of most eukaryotes requires mechanisms for development and intercellular communication, the formation and function of intracellular compartments, and the speedy control of gene expression as cells grow and change. We now turn to a discussion of some regulatory processes that are necessary to deal with the complexity of the eukaryotic genome. Such gene control mechanisms provide for situations that arise only in eukaryotes, such as the need to regulate gene dosage in diploid cells.

Insulators Separate Adjacent Genes in a Chromosome

The use of multiple transcription factors to control eukaryotic genes requires dispersed binding sites. Some enhancers are located well over 1,000 bp from the promoters they regulate, or they can be within the gene or at the noncoding 3′ end of the gene. This is quite different from the situation in bacteria, where control elements are almost always located close to, or overlap with, the promoter. As discussed earlier, DNA looping accommodates the large distances between enhancer and promoter elements in eukaryotes. Indeed, the large size of DNA loops provides the flexibility that enables enhancers to function in either orientation. However, this raises a new question: what prevents the enhancer for one gene from forming a loop to interact with the promoter of another gene, thereby activating the wrong gene? In part, misregulation of this type is prevented by insulators (sometimes referred to as boundary elements), DNA sequences that form boundaries between genes or groups of genes in eukaryotes.

Insulators are relatively short sequences, sometimes fewer than 50 bp. Exactly how insulators function is still unknown, and it is likely that they have a variety of functions. An example of an insulator in T cells (T lymphocytes, white blood cells involved in the immune response) is shown in Figure 21-16. T cells must express either the α chain or the δ chain of the T-cell receptor, but not both. The promoters for these genes are adjacent in human cells, but an insulator sequence between them prevents the enhancer region for one gene from activating transcription from the promoter of the other gene. Insulators can also prevent packaging of a gene into heterochromatin. When a gene is experimentally inserted into a heterochromatic region of a chromosome, it is typically repressed. But if the gene also contains an adjacent insulator, the gene remains active even after insertion into a heterochromatic region. Conversely, an insulator can have a repressive effect if it is located between the promoter and the enhancer of a gene, because it then shields the promoter from that gene’s own enhancer.

Figure 21-16: Insulator regulation of the expression of T-cell receptors. With DNA looping over long distances, binding at enhancers could activate the wrong promoter. The insulator confines the action of enhancers to their matching promoter. In the regulatory regions of the genes shown here, for the α and δ chains of the T-cell receptor, the insulator prevents activation of the α promoter by the δ enhancer, and of the δ promoter by the α enhancer. CCCTC-binding factor (CTCF) binds to insulators, but how it functions is still uncertain.

All insulator sequences in higher eukaryotes seem to require CCCTC-binding factor (CTCF) to function (see Figure 21-16). CTCF was first identified as a protein that binds to a site containing a 5′-CCCTC sequence, from which the protein derives its name. The binding site for CTCF is much longer than this sequence, but the flanking sequences are divergent and not as easy to recognize. CTCF contains 11 zinc fingers and binds a diverse set of DNA sequences up to 50 bp long. Although CTCF binding appears to be a common requirement, the mechanism of insulator function and the precise role of CTCF are currently unknown. CTCF might recruit other proteins to the insulator.
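The positional logic of insulators can be made concrete with a toy model. The sketch below is only an illustration, not a description of the real mechanism (which, as noted, is unknown): it assumes that an enhancer can act on a promoter unless an insulator lies between them on the chromosome. The function name and all coordinates are hypothetical.

```python
def reachable_promoters(enhancer_pos, promoters, insulators):
    """Return the promoters that no insulator separates from the enhancer.

    enhancer_pos: position of the enhancer (bp, hypothetical coordinates)
    promoters:    dict mapping promoter name -> position (bp)
    insulators:   list of insulator positions (bp)
    """
    reachable = []
    for name, pos in promoters.items():
        lo, hi = sorted((enhancer_pos, pos))
        # Toy rule (an assumption): any insulator between the two elements blocks looping.
        blocked = any(lo < ins < hi for ins in insulators)
        if not blocked:
            reachable.append(name)
    return reachable


# Illustrative layout loosely inspired by Figure 21-16; the numbers are made up.
promoters = {"delta": 1000, "alpha": 9000}
delta_enhancer = 3000

print(reachable_promoters(delta_enhancer, promoters, [5000]))  # ['delta']
print(reachable_promoters(delta_enhancer, promoters, []))      # ['delta', 'alpha']
```

The same rule reproduces the repressive case: placing an insulator between an enhancer and its own promoter leaves that promoter unreachable.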

Some Activators Assemble into Enhanceosomes

Exquisite transcriptional regulation can be achieved through the interaction of multiple activators at a single gene. In some cases, cooperating activators form a stable, tightly folded nucleoprotein complex called an enhanceosome, which integrates regulatory information from multiple signaling cascades and generates a single transcriptional outcome at the target promoter. An example of an enhanceosome can be seen in the regulation of the gene for interferon beta (IFNβ). Interferons are produced in response to a viral infection and lead to programmed cell death, thereby halting further production of viral particles and infection of surrounding cells. At the IFNβ promoter, multiple activators can present their activation domains together to simultaneously interact with the cofactor protein complex CBP-p300 (Figure 21-17). Recruitment of this cofactor is efficient only when all of the activators in the enhanceosome are present together. The placement of each activator-binding site in the DNA is critical, because of the three-dimensional structure required for enhanceosome function. For example, the experimental insertion of 5 bp (i.e., a half-turn of the helix) between regulatory elements inactivates the function of the IFNβ enhanceosome.

Figure 21-17: Hypothetical structure of the IFNβ enhanceosome. The IFNβ enhancer is referred to as an enhanceosome because, unlike other enhancers, it requires accurate spacing and helical phasing of the DNA between several protein-binding sites. This requirement indicates that a specific tertiary structure is formed with the various regulatory proteins. HMG proteins, while not part of the completed complex, facilitate enhanceosome formation by helping to bend the DNA (see Figure 21-6). Individual regulatory proteins in the complex are shown in different colors.
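The helical phasing argument is simple arithmetic, sketched below. The calculation assumes about 10.5 bp per helical turn, a typical value for B-form DNA that is not stated in this section, and the function name is hypothetical. It shows why a 5 bp insertion rotates a downstream binding site to nearly the opposite face of the helix, whereas an insertion of whole turns restores the original phase.

```python
BP_PER_TURN = 10.5  # assumed helical repeat of B-form DNA


def rotation_degrees(inserted_bp, bp_per_turn=BP_PER_TURN):
    """Angular offset (0-360 degrees) that an insertion imposes on a downstream site."""
    return (inserted_bp / bp_per_turn * 360.0) % 360.0


for n in (5, 10, 21):
    print(n, "bp ->", round(rotation_degrees(n), 1), "degrees")
# 5 bp  -> 171.4 degrees (close to a half-turn, so the site faces the other side)
# 10 bp -> 342.9 degrees (nearly back in phase)
# 21 bp -> 0.0 degrees   (two full turns, phase restored)
```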

Similar activator clusters can also function together to repress transcription, and it is possible that an enhanceosome can switch to a repressor function under different cellular conditions. Enhanceosomes tend to form at genes that need to be tightly regulated in pathways important to the organism’s defense system, such as wound healing and antipathogen mechanisms.

Gene Silencing Can Inactivate Large Regions of Chromosomes

Thus far we have focused on the activation or repression of gene expression through the action of activator or repressor proteins at single promoters or enhancers. In some cases, however, the position of a gene within the chromatin, or its location on a particular chromosome, leads to almost complete repression of gene expression. This gene silencing, a powerful mode of regulation in eukaryotes, is the absence of gene expression due to the location of the gene, rather than its response to the presence or absence of a regulatory factor or complex. As a result, silencing can encompass relatively large segments of a chromosome, or an entire chromosome, thus controlling the expression of many genes at once.

Recall that chromatin is organized into heterochromatin and euchromatin. The loosely packed euchromatin is often transcriptionally active, whereas the densely packed heterochromatin is transcriptionally silent. Heterochromatin is often found at centromeres and telomeres, as well as other inactive parts of the genome. Experiments have shown that genes normally active in a region of euchromatin become transcriptionally silent when moved into heterochromatic regions. These observations led to the conclusion that a primary function of heterochromatin is to maintain certain parts of the genome in a transcriptionally inactive state by preventing access of the transcription machinery to these chromatin regions. Indeed, burying a gene within heterochromatin may be a preferred mechanism for long-term silencing.

The formation of heterochromatin requires many different factors, depending on the specific region of the chromosome. For example, a mechanism known as gene dosage compensation, occurring in the cells of female mammals, involves the formation of heterochromatin over an entire X chromosome, inactivating it (as we discuss in more detail below). Recent studies of heterochromatin in other regions of a chromosome show that small interfering RNAs (siRNAs) are required for heterochromatin formation, along with certain proteins and histone modifications. Highlight 21-3 describes studies of heterochromatin in the centromere region of yeast chromosomes that reveal a role of the silencing machinery mediated by small RNAs.

Imprinting Allows Selective Gene Expression from One Allele Only

In most diploid cells, both homologous genes (i.e., both alleles of a gene) are expressed equally. One allele may be dominant over the other, as Mendel found in his work on the garden pea, or the two alleles may deviate from Mendelian behavior and both may contribute to the phenotype. For example, we learned in Chapter 2 that both alleles of the genes responsible for human blood type, when expressed together, can lead to type AB (see Figure 2-5). But regardless of the phenotype, both alleles of a gene are usually expressed in the diploid cell. Some higher eukaryotes, however, have mechanisms to completely shut down the expression of an allele derived from one parent, in a process called imprinting. Because only one parental allele of an imprinted gene is repressed, the usual rule of equal expression of each allele in a diploid cell is violated.

Imprinting is typically restricted to mammals, although some examples have been found in flowering plants. About 80 genes in the human genome are currently known to be imprinted. Imprinting occurs during development of the gametes; a set of genes is imprinted during oocyte development, and a different set of genes is imprinted during sperm cell development. Thus, all gametes from the mother have one set of genes imprinted, and all gametes from the father have a different set of genes imprinted. Nearly every cell of the offspring has the gene expression pattern dictated by the imprinted genes from the egg and sperm. The sole exception is in cells that will give rise to gametes. All imprinting information from the parents is lost during development of the germ cells. Developing gametes adopt the imprinting pattern specific to the sex of that individual organism.

HIGHLIGHT 21-3 A CLOSER LOOK: Gene Silencing by Small RNAs

Danesh Moazed

Large tracts of DNA are encased in heterochromatin, a form of chromatin so compact that its genes are silenced by the exclusion of RNA polymerase. The nucleosomes in heterochromatin have a distinctive histone modification pattern of methylation and hypoacetylation compared with actively transcribed regions of DNA (euchromatin). These epigenetic marks (marks that are heritable but are not encoded in the DNA sequence) result in stable inheritance of the silent heterochromatin state.

Research from Danesh Moazed’s laboratory shows that the formation of heterochromatin in fission yeast (Schizosaccharomyces pombe) requires the RNA silencing machinery, which localizes to heterochromatin nucleation sites. To study the process of heterochromatin formation, the researchers purified the RNA silencing complex of S. pombe called RITS (RNA-induced transcriptional silencing), and also identified a second complex that interacts and functions with RITS. The second complex is referred to as RDRC (RNA-directed RNA polymerase complex), because it contains an RNA-directed RNA polymerase, Rdp1. RDRC also contains a helicase known as Hrr1, and a putative poly(A) polymerase known as Cid12. A hypothesis for the process of heterochromatin formation at the centromere, based on the possible functions of these complexes, is shown in Figure 1.

FIGURE 1 Proposed model for heterochromatin formation at the centromere in S. pombe. RITS is the RNA-induced transcriptional silencing complex, RDRC is the RNA-directed RNA polymerase complex, and ARC is an Argonaute chaperone complex that contains duplex siRNA. Synthesis of dsRNA and generation of siRNA occur in association with specific chromosome regions. The nascent transcript model proposes that the RITS complex mediates heterochromatin formation by associating with nascent transcripts via siRNA base pairing, and with methylated H3K9 (histone H3 with methylated Lys9). RNA-directed association of RITS with specific transcripts (step 1) leads to RNA-dependent RNA copying (step 2) and processing (step 3) to increase the level of small RNAs that can bind to ARC and associate with RITS to continue the cycle (step 4).

The RITS complex contains siRNAs, derived from heterochromatic regions of the chromatin, and the protein Argonaute, which belongs to a family of proteins implicated in RNA-induced silencing pathways. RITS matches the siRNAs to complementary RNA sites known as cenRNA sequences, which are noncoding transcripts produced at repeat sequences (cen DNA) near the centromeres (see Figure 1, step 1). The RITS complex also contains a chromodomain protein, which binds methylated histones of heterochromatin and probably helps target RITS to heterochromatin. When RITS associates with a growing cenRNA transcript, Rdp1 within the RDRC complex uses the siRNA-cenRNA as a primer to synthesize double-stranded RNA (step 2). Dicer then chops the double-stranded RNA into multiple siRNAs (step 3). Moazed and his colleagues identified another complex, called ARC (Argonaute chaperone), that contains Argonaute and two other protein components. ARC binds individual siRNAs and is thought to chaperone them into the RITS complex (step 4). During this step, the siRNA base-pairs to complementary sequences in heterochromatin, recruiting RITS and completing a self-propagating cycle that expands the heterochromatic region.

Methylation of histone H3 at Lys9, initiated by RNA silencing complexes, can help expand the heterochromatin. The methylation is associated with recruitment of HP-1 protein, which is required for heterochromatin formation. The poly(A) polymerase Cid12 belongs to a family of proteins that target RNAs to the exosome, a large ribonuclease complex that catalyzes RNA degradation. Hence the Cid12 component of RDRC is thought to add another layer of RNA surveillance to ensure that heterochromatic regions are completely silenced.

Although this system was elucidated in a single-celled eukaryote, we know that the S. pombe proteins identified in the processes described here are conserved in higher eukaryotes. Perhaps a similar mechanism of siRNA-mediated heterochromatin formation occurs in mammals.

Imprinting of a gene is not based on the DNA sequence; it is an epigenetic process (see Chapter 10). Epigenetic marks are created by nucleosome modification patterns and DNA methylation. Figure 21-18 shows imprinting based on DNA methylation for the mammalian gene for insulin-like growth factor 2 (IGF2). In this case, DNA near the paternally inherited IGF2 allele becomes methylated (imprinted) and the allele is active, but DNA near the maternally inherited allele is unmethylated and the allele is not expressed in the offspring. Imprinting of this gene is important, because expression of both alleles tends to result in cancer. The mechanism of imprinting in this case involves an insulator sequence. The IGF2 gene contains an insulator sequence between its promoter and an enhancer, rather than between the gene and an adjacent gene. Thus, when the insulator is active, IGF2 is repressed because the activating effect of the enhancer is insulated from the promoter.

Figure 21-18: Imprinting of the mammalian IGF2 gene. When CCCTC-binding factor (CTCF) binds the insulator of IGF2 (top), this enables insulator function and turns off IGF2, because the insulator is located between the promoter and the enhancer. When the insulator sequence is methylated (bottom), CTCF can no longer bind the DNA, and IGF2 can be activated by its enhancer.

The biochemical explanation for imprinting in IGF2 is thought to lie in 5-methylation of C residues of CpG sequences (see Figure 6-34a). Cytosine methylation in eukaryotic DNA is generally associated with gene repression, whereas transcriptionally active regions of DNA tend to be undermethylated. IGF2 is an instructive exception, because here methylation silences an insulator rather than the gene itself. Cytosines in CpG sequences recognized by CTCF in the insulator sequence that regulates the paternal IGF2 gene become methylated when the gene is imprinted, thereby inactivating the insulator. When CpG sites are methylated, CTCF can no longer bind the insulator. Hence, when the insulator in IGF2 is methylated during sperm development, the paternally inherited copy of IGF2 is expressed because CTCF no longer binds the insulator and the enhancer activates the promoter. The insulator site is not methylated in the egg, so CTCF binds the insulator and represses the maternally inherited copy of IGF2.

Imprinting is essential for development in mammals. Studies in mice have shown that a genetically engineered egg containing two complete sets of maternally inherited chromosomes will not develop past the blastula stage. The same is observed for eggs containing two complete sets of chromosomes from sperm. In either of these situations, the alleles of the genes that are normally differentially marked will have identical imprinting patterns in the developing egg—leading to either no expression or double expression of those genes. For example, if an embryo were to develop from a fully diploid cell in which the two chromosome sets were derived from a female, both IGF2 alleles would be inactive. This explains why mammals are not capable of parthenogenesis—the development of an embryo with a diploid genome that is entirely maternally derived.
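The chain of cause and effect at the IGF2 insulator can be written as a few lines of Boolean logic. This is a deliberately minimal sketch that treats every step as all-or-none; the function and variable names are hypothetical, and only the relationships described above (methylation blocks CTCF binding, bound CTCF activates the insulator, an active insulator blocks the enhancer) are encoded.

```python
def igf2_expressed(insulator_methylated):
    """Toy all-or-none model of one IGF2 allele."""
    ctcf_bound = not insulator_methylated  # methylated CpG sites prevent CTCF binding
    insulator_active = ctcf_bound          # the insulator needs CTCF to function
    return not insulator_active            # the enhancer reaches the promoter only if unblocked


def active_copies(methylation_by_allele):
    """Count expressed alleles, given the insulator methylation state of each allele."""
    return sum(igf2_expressed(m) for m in methylation_by_allele.values())


# Normal embryo: paternal insulator methylated, maternal insulator unmethylated.
print(active_copies({"paternal": True, "maternal": False}))       # 1
# Two maternally derived chromosome sets: no IGF2 expression.
print(active_copies({"maternal_1": False, "maternal_2": False}))  # 0
# Two paternally derived chromosome sets: double expression.
print(active_copies({"paternal_1": True, "paternal_2": True}))    # 2
```

The three cases correspond to the normal embryo and to the two uniparental situations, which give either no expression or double expression.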

It is possible that imprinting evolved by natural selection to increase the fitness of the organism. Imprinting in mammals is thought to confer certain behavioral traits that enhance fitness. One hypothesis proposes that imprinting reflects the different interests of the mother and father in the growth and development of offspring. The father is more interested in seeing that the offspring grow rapidly, regardless of whether this occurs at the expense of the mother. The mother is more interested in balancing her own survival with sufficient nourishment of the offspring and thus tends toward growth-limiting measures that conserve resources. In support of this “parental conflict” hypothesis, male-expressed imprinted genes tend to promote growth, and female-expressed imprinted genes tend to limit growth. The model is also supported by the lack of imprinting in animals, such as birds, that have lower requirements for raising offspring.

Dosage Compensation Balances Gene Expression from Sex Chromosomes

Diploid organisms carry two copies of each autosomal chromosome, but the sex chromosomes are unequal in copy number in females and males. In mammals, females have two copies of the X chromosome, and males have one X chromosome and a Y chromosome. The X chromosome carries genes that are required by both males and females, and the gene products are required in the same amounts in both sexes. Dosage compensation mechanisms have evolved to control the level of gene expression from these chromosomes so that the levels are similar in males and females. We can imagine three different ways in which gene dosage compensation could occur. (1) Total inactivation of one X chromosome in the female would make gene expression equal to that from the single X chromosome in the male. (2) Expression of the single X chromosome in the male could be doubled. (3) Expression of the two X chromosomes in the female could be halved. All three of these mechanisms are employed in one species or another (Figure 21-19).

Figure 21-19: Different dosage compensation mechanisms control the level of gene expression from the X chromosomes. (a) In mammals, one X chromosome of the female (XX) is inactivated, forming a compact structure called a Barr body. (b) In Drosophila, the single X chromosome of the male (XY) is transcribed at twice the level of each X chromosome in the female (XX). (c) In C. elegans, the two X chromosomes of the hermaphrodite (XX) are transcribed at half the level of the single X chromosome of the male (X0).
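The three strategies can be compared with simple bookkeeping. In the sketch below, each X chromosome is assigned a relative transcription level in arbitrary units; these idealized numbers (exactly 0, 0.5, 1, or 2) are assumptions chosen to mirror Figure 21-19, not measured values, and the names are hypothetical. In every case the summed X-linked output is the same in the two sexes.

```python
# Relative transcription level of each X chromosome, in arbitrary units (idealized).
strategies = {
    "mammal":     {"female": [1.0, 0.0], "male": [1.0]},          # one X inactivated
    "Drosophila": {"female": [1.0, 1.0], "male": [2.0]},          # male X doubled
    "C. elegans": {"hermaphrodite": [0.5, 0.5], "male": [1.0]},   # both X halved
}


def totals_by_sex(sexes):
    """Sum the per-chromosome levels to get total X-linked output for each sex."""
    return {sex: sum(levels) for sex, levels in sexes.items()}


for organism, sexes in strategies.items():
    totals = totals_by_sex(sexes)
    balanced = len(set(totals.values())) == 1
    print(organism, totals, "balanced" if balanced else "unbalanced")
```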

In mammals, one X chromosome of female cells is inactivated, compacted into a tightly condensed structure called a Barr body. This process of X chromosome inactivation starts at the X inactivation center (XIC), a region of about 10⁶ bp near the middle of the X chromosome, which condenses into heterochromatin. Condensation spreads from this nucleation point in both directions until the entire X chromosome is compacted (Figure 21-20). The process involves XIST, an RNA produced from the XIC DNA in the inactivated chromosome. XIST is not translated into protein; instead, it coats the chromosome without sequence specificity, beginning at the XIST locus and then spreading in both directions. In addition to XIST, X chromosome inactivation in mammals also involves a histone variant, macroH2A (see Highlight 10-1).

Figure 21-20: X chromosome inactivation in mammals. Inactivation of one X chromosome in each cell of the female mammal begins at the XIC locus, which encodes XIST (a non-protein-coding RNA), and also requires the histone variant macroH2A and other chromatin regulatory factors. XIST coats the X chromosome, starting at the nucleation site and then spreading in both directions until the entire chromosome is coated, resulting in repression of nearly all the genes on the chromosome.

In humans, inactivation occurs on only one of the two X chromosomes in each cell and thus seems to be a type of imprinting. However, the selection of which X chromosome becomes inactive is random; either the maternal or the paternal X chromosome is inactivated in any given cell. Therefore, this X chromosome inactivation is not strictly imprinting. There are some tissues in which X chromosome inactivation is a true example of imprinting in humans: in the umbilical cord and placental tissue, only the paternal chromosome is inactivated.

In Drosophila, male cells contain one X chromosome and solve the gene dosage problem in a different way: genes on the male’s X chromosome are transcribed at twice the level of genes on each of the female’s two X chromosomes. This overactivation arises from coating of the chromosome with an X-encoded RNA-protein complex called the dosage compensation complex (DCC), which contains five different proteins and two noncoding RNAs. Two of the proteins have HAT and phosphokinase activities. The DCC may enhance the transcription of genes on the male’s X chromosome by modifying chromatin through these two enzymatic activities.

The nematode C. elegans solves the gene dosage problem in the opposite way. The two sexes in C. elegans are the hermaphrodite (XX) and the male (X0). Expression from each of the two X chromosomes in the hermaphrodite is reduced by half relative to gene expression from the single X chromosome of the male. Like Drosophila, C. elegans has a dosage compensation complex, but here its function is to repress gene transcription instead of activating it. The DCC is expressed only in cells with two X chromosomes (i.e., in the hermaphrodite), where it coats both X chromosomes. The C. elegans DCC is composed of proteins that resemble the condensin complex of mitotic and meiotic chromosomes. Thus, evolution has modified and recruited these chromosome-condensing proteins to the task of gene dosage compensation. The DCC partially condenses both X chromosomes such that transcription is reduced by half, balancing the gene dosage with that of the male cell with its single X chromosome. Targeting of the C. elegans DCC to X chromosomes is accomplished by DNA sequence elements dispersed along the X chromosome that nucleate the complex. Nucleation is followed by spreading of the complex across the entire chromosome, maintaining the repressed epigenetic state throughout the life of the hermaphrodite.

Steroid Hormones Bind Nuclear Receptors That Regulate Gene Expression

Whereas dosage compensation provides for inherited control of gene expression levels, transient changes in transcription are often triggered in response to hormones. Intercellular communication is essential in multicellular organisms; tissues, organs, and organ systems need to work together and must be able to respond to external signals. One group of molecular signals is the steroid hormones, which operate in the nucleus to activate transcription of particular genes in response to tissue or system requirements.

The effects of steroid hormones (and thyroid and retinoid hormones, which have the same mode of action) provide well-studied examples of the modulation of eukaryotic regulatory proteins by direct interaction with molecular signals. Steroid hormones too hydrophobic to dissolve readily in the blood (e.g., estrogen, progesterone, and cortisol) travel on specific carrier proteins from their point of release to their target tissues, where the hormone enters cells by simple diffusion and binds to its specific nuclear receptor protein, which is a transcription activator.

There are two major types of steroid-binding nuclear receptors: those initially located in the cytoplasm (type I) and those always located in the nucleus, bound to DNA (type II). Examples of steroid hormones that bind type I nuclear receptors are estrogen, progesterone, androgens, and glucocorticoids. The action of type I nuclear receptors is shown in Figure 21-21a. The nuclear receptor (NR) is initially bound to a heat shock protein (Hsp70) in the cytoplasm, which keeps the receptor in its monomeric state. When the receptor binds the steroid hormone, Hsp70 dissociates, the receptor dimerizes and exposes a nuclear import signal, and the receptor-hormone complex migrates into the nucleus, where it acts as a transcription factor.

Figure 21-21: Steroid hormone receptor action. Steroid hormones diffuse across the plasma membrane and associate with a type I or type II nuclear receptor. (a) The type I nuclear receptor (NR), located in the cytoplasm, is complexed with a heat shock protein (Hsp70). Hormone binding releases Hsp70, and the NR dimerizes and exposes a nuclear import signal sequence. The NR-hormone complex then enters the nucleus and binds to a hormone response element (HRE) to activate transcription. (b) Type II nuclear receptors are bound to DNA whether or not the hormone signal is present. For example, the thyroid hormone receptor (TR) forms a heterodimer with the protein RXR to bind the HRE, but it is inactive without thyroid hormone. When the hormone enters the cell and the nucleus and binds the complex at the HRE site, it activates gene transcription.

Type II receptors also require binding of the hormone before activating transcription, but these receptors are already bound to the DNA, whether their molecular signal (the hormone) is present or not. In addition, type II receptors typically bind DNA as a heterodimer. Thyroid hormone receptor (TR) is an example of a type II nuclear receptor (Figure 21-21b). TR forms a heterodimer with a protein known as the retinoid X receptor (RXR, another type II receptor) and, in the absence of thyroid hormone, also binds a third protein, a corepressor. This complex does not activate transcription. When thyroid hormone enters the nucleus and binds to TR, the corepressor is released and a coactivator binds in its place, resulting in recruitment of Pol II. Expression of the thyroid hormone–induced genes produces proteins involved in metabolism and in regulation of heart rate.

Steroid hormone–nuclear receptor complexes act by binding to highly specific DNA sequences known as hormone response elements (HREs). The bound hormone-receptor complex can either enhance or suppress the expression of adjacent genes. The HREs for the various steroid hormones are similar in length and organization in the genome, but they differ in sequence. Each receptor has a consensus HRE sequence to which the hormone-receptor complex binds well (Figure 21-22). The consensus sequence consists of two six-nucleotide sequences, either contiguous or separated by three nucleotides, in tandem or inverted with respect to each other. The steroid hormone receptors have a highly conserved DNA-binding domain with two zinc fingers. The hormone-receptor complex binds to the DNA as a dimer, and the zinc finger domains of each monomer recognize the six-nucleotide HRE sequences. The ability of a given hormone to act through its hormone-receptor complex to alter the expression of a specific gene depends on the exact sequence of the HRE, its position relative to the gene, and the number of HREs associated with the gene.

Figure 21-22: Structural organization of steroid hormone receptors and hormone response elements. Nuclear receptors are multidomain proteins (top, showing the three domains) that bind steroid hormones and DNA to activate transcription. As shown in the enlarged structure of the DNA-binding region (middle), two adjacent zinc fingers bind to the HRE in the DNA (bottom; the binding regions are indicated by dashed lines). The receptors bind the DNA as dimers. The HREs of several steroid hormone receptors are inverted repeats (highlighted in yellow and green).
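The organization of an inverted-repeat HRE (two six-nucleotide half-sites separated by three nucleotides) lends itself to a simple sequence search. The sketch below is illustrative only: it uses AGGTCA, the half-site of the commonly cited estrogen response element consensus, which is not given in this section, and it requires exact matches, whereas real response elements tolerate deviations from the consensus. The function names and the test sequence are hypothetical.

```python
import re

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_inverted_repeat_hre(dna, half_site, spacer=3):
    """Return 0-based start positions of half_site + N{spacer} + revcomp(half_site)."""
    pattern = re.compile(
        "(?=" + half_site + "[ACGT]{" + str(spacer) + "}"
        + reverse_complement(half_site) + ")"
    )  # lookahead so that overlapping sites would also be reported
    return [m.start() for m in pattern.finditer(dna.upper())]


ERE_HALF_SITE = "AGGTCA"  # assumed consensus half-site for the estrogen receptor
sequence = "ttgcAGGTCAcagTGACCTgaat"  # made-up sequence containing one perfect site

print(reverse_complement(ERE_HALF_SITE))                  # TGACCT
print(find_inverted_repeat_hre(sequence, ERE_HALF_SITE))  # [4]
```

Because each monomer of the receptor dimer recognizes one half-site, the second half-site of an inverted repeat is the reverse complement of the first, which is what the search pattern encodes.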

The ligand-binding region of the steroid hormone receptor protein—always located at the C-terminus—is specific to the particular receptor. For example, the ligand-binding region of the glucocorticoid receptor shares only 30% sequence similarity with the estrogen receptor, and only 17% similarity with the thyroid hormone receptor. The size of the ligand-binding region also varies dramatically: the vitamin D receptor has only 25 residues, whereas the mineralocorticoid receptor has 603 residues in this region. Mutations that change one amino acid in the ligand-binding region can result in loss of responsiveness to a specific hormone. In humans, medical conditions resulting from the inability to respond to cortisol, testosterone, vitamin D, or thyroid hormone are caused by mutations of this type.

Responsiveness to a steroid hormone is tissue-specific. The specificity is due in part to transcriptional regulation of the gene encoding the hormone receptor. Cells that are not responsive to a particular steroid hormone do not seem to express its receptor. This can be seen experimentally by using radioactive steroid hormone to examine different tissues for accumulation of the hormone in the nucleus. For example, radioactive progesterone accumulates in the nuclei of endometrial cells, which prepare the uterus for pregnancy, but not in the nuclei of other tissues such as muscle.

Nonsteroid Hormones Control Gene Expression by Triggering Protein Phosphorylation

Nonsteroid hormones, grouped together as chemically distinct from steroid hormones, cannot cross the plasma membrane. Instead, they deliver their regulatory message via a cell surface receptor. We saw in Chapter 19 how the effects of insulin on gene expression are mediated by a series of steps leading ultimately to activation of a protein kinase that phosphorylates specific DNA-binding proteins. Phosphorylation alters the ability of the proteins to act as transcription factors (see Highlight 19-1). This general mechanism mediates the effects of many nonsteroid hormones on gene regulation.

A widely used mechanism of signal transduction for many nonsteroid hormones and other ligands involves the action of G protein–coupled receptors (GPCRs) that span the plasma membrane. In this pathway, a transmembrane GPCR binds the signal molecule on the outside of the cell, and binding activates a guanine nucleotide–binding protein (G protein) on the cytoplasmic side of the membrane. G proteins function as molecular switches: when bound to GTP they are active; on hydrolysis of the GTP to GDP they are inactive. When activated, G proteins promote the phosphorylation of proteins that activate gene transcription.

The many different types of nonsteroid ligands that function through GPCRs include olfactory molecules such as odorants and pheromones; peptide hormones such as calcitonin, follicle-stimulating hormone, and oxytocin; neurotransmitters such as dopamine, epinephrine (adrenaline), and acetylcholine; glucagon; prostaglandins; and leukotrienes. Ligand binding to GPCRs triggers a wide variety of physiological processes. For example, serotonin and dopamine act through GPCRs in the mammalian brain to regulate mood and behavior. Glucagon and prostaglandins bind to GPCRs to trigger changes in metabolism and contraction of smooth muscle. The malfunctioning of GPCRs is associated with a range of human diseases, and GPCRs are the target of more than 25% of pharmaceuticals used in medicine.

A G protein–coupled pathway is shown in Figure 21-23. First, a signal ligand binds to and activates a surface receptor (the GPCR) that spans the plasma membrane. The signal is then transduced through a G protein to activate adenylyl cyclase, the enzyme that converts ATP to cyclic AMP (cAMP), leading to elevated levels of cytosolic cAMP. Recall from Chapter 20 that bacteria also use cAMP as a signal molecule; the cAMP binds directly to a transcription activator such as CRP. Eukaryotic cells use cAMP in a very different way. Instead of binding directly to a transcription factor, cAMP acts as a second messenger that carries a message received from outside the cell (from the first messenger) to proteins inside the cell. The target of cAMP is a kinase called cyclic AMP–dependent protein kinase A (PKA). It is held in the cytoplasm by a regulatory subunit of the PKA holoenzyme that inhibits its kinase activity. When cAMP binds to the regulatory subunit, the subunit dissociates, releasing active PKA.

Figure 21-23: Gene expression regulated by protein phosphorylation and cAMP. Cyclic AMP–dependent protein kinase A (PKA) is inhibited by a regulatory subunit of the PKA holoenzyme and becomes active only on binding of cAMP to this subunit. The cAMP is produced when a signal molecule binds a transmembrane receptor and induces it to activate adenylyl cyclase. (Several steps are omitted here.) Once active, PKA catalytic subunits enter the nucleus and phosphorylate various target proteins, such as CREB, which then recruits RNA polymerase to DNA.

PKA acts on many different target proteins, leading to the activation or repression of various sets of genes. Figure 21-23 shows the activation of CREB (cAMP-responsive element–binding protein) by phosphorylation. CREB is a transcription activator that is inactive when unphosphorylated, but when phosphorylated and activated by PKA, CREB binds its CRE (cAMP-response element) site in the DNA. It then activates transcription through a coactivator, CBP (CREB-binding protein) of the CBP-p300 complex (shown as part of an enhanceosome in Figure 21-17). CBP is a coactivator for numerous genes, including genes encoding other transcription activators, and it functions in many organs. Most of its effects are still unknown. The most widely studied CREB functions are related to the brain, where CREB is implicated in the formation of long-term memories.
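The order of events in the cAMP pathway can be made concrete with a small sketch. The Python below is only an illustration of the dependency chain described above, not a quantitative model: the step labels, the function name `run_pathway`, and the `blocked` parameter are all hypothetical, and each step is treated as simply on or off.

```python
# Ordered steps of the cAMP pathway as described in the text.
# The labels are illustrative shorthand, not standard nomenclature.
PATHWAY = (
    "ligand binds GPCR",
    "G protein activated (GTP-bound)",
    "adenylyl cyclase makes cAMP",
    "cAMP releases active PKA",
    "PKA phosphorylates CREB",
    "CREB binds CRE and recruits CBP",
    "target gene transcribed",
)


def run_pathway(ligand_present, blocked=()):
    """Return the steps completed, stopping before the first blocked step.

    `blocked` is a hypothetical way to mimic a defect (for example, a
    regulatory subunit that cannot release PKA).
    """
    completed = []
    if not ligand_present:
        return completed  # no first messenger, so nothing downstream happens
    for step in PATHWAY:
        if step in blocked:
            break  # every later step depends on this one
        completed.append(step)
    return completed


if __name__ == "__main__":
    for step in run_pathway(True):
        print(step)
```

Calling `run_pathway(True, blocked={"cAMP releases active PKA"})` returns only the first three steps, which illustrates why a defect anywhere upstream of CREB phosphorylation prevents activation of the target gene.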

SECTION 21.3 SUMMARY

  • Insulators are DNA sequences that prevent transcription factors bound at distant enhancers from activating the wrong promoters.

  • Enhanceosomes are stable, tightly folded nucleoprotein complexes in which cooperating activators integrate regulatory information from multiple signals to produce a single transcriptional outcome at the target promoter.

  • Some genes are blocked from active transcription within regions of densely packed heterochromatin. The formation of heterochromatin requires small RNAs, as well as proteins that condense the DNA.

  • In imprinting, which occurs in the genes of some higher eukaryotes, the expression of an allele derived from one parent is shut down. Imprinting is an epigenetic process based on nucleosome modification patterns and DNA methylation.

  • Gene dosage compensation, required because of the different number of X chromosomes in males and females, is achieved in one of three ways, depending on the organism. A protein-RNA complex covers the X chromosome(s) to inactivate one female X chromosome, double the expression of the single male X chromosome, or halve gene expression from each female X chromosome.

  • Steroid hormones control the transcription of specific genes by interacting with intracellular receptors that are transcription activators. Hormone binding triggers interaction of receptor proteins with additional transcription factors. Hormone-receptor complexes bind hormone response elements in the DNA, altering gene expression.

  • Nonsteroid hormones and other signal molecules regulate genes through binding to cell surface receptors, triggering second-messenger signaling and protein phosphorylation that lead to modulation of gene expression.
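The arithmetic behind the three dosage compensation strategies in the summary can be checked with a short sketch. It assumes two X chromosomes in females and one in males, and it uses an arbitrary expression level of 1.0 per uncompensated X; the strategy names and the function names are hypothetical labels chosen for this illustration.

```python
# For each strategy: (number of active X chromosomes, relative expression per active X).
# A baseline of 1.0 per X is an arbitrary unit assumed for illustration.
STRATEGIES = {
    "inactivate one female X": {"female": (1, 1.0), "male": (1, 1.0)},
    "double the male X": {"female": (2, 1.0), "male": (1, 2.0)},
    "halve each female X": {"female": (2, 0.5), "male": (1, 1.0)},
}


def x_linked_output(active_x, rate_per_x):
    """Total X-linked expression: active X chromosomes times expression per X."""
    return active_x * rate_per_x


def compare(strategy):
    """Return (female total, male total) for the named strategy."""
    female = x_linked_output(*STRATEGIES[strategy]["female"])
    male = x_linked_output(*STRATEGIES[strategy]["male"])
    return female, male


if __name__ == "__main__":
    for name in STRATEGIES:
        female, male = compare(name)
        print(f"{name}: female={female}, male={male}")
```

In each case the female and male totals come out equal, although the absolute level differs between strategies. That is the sense in which all three mechanisms achieve the same end by different routes.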

UNANSWERED QUESTIONS

Eukaryotic cells contain more DNA and more genes than do bacteria, in keeping with their larger size, intracellular compartmentation, and cooperation within multicellular organisms. Eukaryotes also differ from bacteria in that they have nucleosomes, which compact the DNA and form different types of chromatin structure, depending on epigenetic alterations of the DNA and histone subunits. All of these differences necessitate greater complexity in gene regulation in eukaryotes. Although research has taught us much about eukaryotic gene expression, numerous questions have yet to be answered.

  1. Why do eukaryotic genes need so many different regulatory protein–binding sites? Given their greater genomic complexity, we might expect eukaryotes to need more gene-regulatory elements than bacteria. But some eukaryotic genes have so many regulatory sites that it is hard to understand what they all do. Some coactivators even have enzymatic activity that modifies proteins, such as RNA polymerase (see Chapter 16) or histones (see Chapter 10). For genes regulated by enhanceosomes, why are so many proteins required to come together to activate a single gene? An exciting area of study will be to gain a deeper understanding of how transcription modulators function.

  2. Do different gene-regulatory processes intertwine? Transcription, mRNA processing, replication, recombination, and repair all occur in the nucleus. It seems possible that additional levels of gene regulation might be achieved by interconnections among these different processes. There is also evidence that transcription and mRNA splicing are coordinated. Future studies are likely to reveal increasingly complex regulatory networks among these diverse processes.

  3. How is heterochromatin assembled and regulated? The role of RNA-mediated silencing machinery in assembling heterochromatic regions of a chromosome is a fascinating topic. Studies suggest that different areas of heterochromatin may have unique mechanisms of formation, depending on their location along the chromosome. Indeed, heterochromatin formation in X chromosome inactivation occurs by a different process than heterochromatin formation at a centromere. Understanding the generation of this important epigenetic silencing mechanism and how heterochromatin formation is regulated during differentiation are important avenues of future research.

Transcription Factors Bind Thousands of Sites in the Fruit Fly Genome

Li, X., S. MacArthur, R. Bourgon, D. Nix, D.A. Pollard, V.N. Iyer, A. Hechmer, L. Simirenko, M. Stapleton, C.L. Luengo Hendriks, et al. 2008. Transcription factors bind thousands of active and inactive regions in the Drosophila blastoderm. PLoS Biol. 6(2):e27, doi: 10.1371/journal.pbio.0060027.

Mark Biggin

Until recently, much of the research on transcription factor binding to DNA focused on experiments with purified proteins and short DNA sequences in vitro. Mark Biggin, of the Lawrence Berkeley National Laboratory, wondered how transcription factors might interact with DNA in living cells. Using Drosophila embryos (undergoing a transcriptionally controlled program of anterior-posterior segmentation) and the ChIP-Chip method (see Figure 10-21), Biggin and his colleagues set out to identify the binding sites for six transcription factors known to be active at this stage of fruit fly development. In the ChIP (chromatin immunoprecipitation) part of the experiment, chromatin from the embryos was chemically cross-linked to bound proteins, then purified by precipitation with antibodies to the six transcription factors. Next, in the Chip (DNA microarray chip analysis) part, the DNA in the immunoprecipitated samples was identified using microarray chips containing short DNA segments corresponding to every sequence in the fruit fly genome.

The results were surprising (Figure 1). The six transcription factors were bound to several thousand DNA segments located near half of all the protein-coding genes in the Drosophila genome! These binding sites corresponded to many more sequences, and many more genes, than the transcription factors were thought to regulate, based on DNA-binding preferences determined in vitro. However, only some of the in vivo binding sites showed up repeatedly in the data analysis, indicating that these sites are frequently occupied by the transcription factors. These high-occupancy sites correspond to DNA targets that are almost certainly regulatory, given their proximity to genes that are activated during fruit fly development. The remainder of the in vivo binding sites are less frequently bound, suggesting that they may not be used to regulate transcription. Instead, these may represent sites where transcription factors can bind nonproductively, perhaps as part of their search for higher-affinity sites along the chromatin. Biggin and coworkers’ study should open the way for further investigation of the binding and regulatory roles of transcription factors in the context of chromatin-packaged DNA, as well as the myriad other proteins and regulatory factors found in vivo.
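The distinction between sites that show up repeatedly and sites that are bound only occasionally can be sketched in a few lines of Python. This is not the analysis used in the study; the region names, the replicate data, and the 75% cutoff are all invented for illustration, and the function `classify_sites` is hypothetical.

```python
from collections import Counter


def classify_sites(replicates, min_fraction=0.75):
    """Split detected regions into high- and low-occupancy groups.

    `replicates` is a list of sets, each holding the regions detected in one
    experiment. A region counts as high-occupancy if it appears in at least
    `min_fraction` of the replicates (an assumed cutoff, chosen arbitrarily).
    """
    counts = Counter(site for rep in replicates for site in set(rep))
    n = len(replicates)
    high = sorted(s for s, c in counts.items() if c / n >= min_fraction)
    low = sorted(s for s, c in counts.items() if c / n < min_fraction)
    return high, low


if __name__ == "__main__":
    # Invented example data: four replicate experiments.
    replicates = [
        {"region_A", "region_B", "region_C"},
        {"region_A", "region_B"},
        {"region_A", "region_D"},
        {"region_A", "region_B"},
    ]
    high, low = classify_sites(replicates)
    print("high occupancy:", high)
    print("low occupancy:", low)
```

With this toy data, regions A and B are classed as high-occupancy and regions C and D as low-occupancy. The real experiment faces the same interpretive question at genome scale: which of the many detected sites reflect regulatory binding, and which reflect transient, nonproductive contacts.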

FIGURE 1 The patterns of mRNA expression for six transcription factors in the Drosophila embryo show that each factor is expressed in a unique subset of cells. The fruit fly embryos are shown with the anterior end to the left and the dorsal surface at the top.

Muscle Tissue Differentiation Reveals Surprising Plasticity in the Basal Transcription Machinery

Deato, M.D.E., and R. Tjian. 2007. Switching of the core transcription machinery during myogenesis. Genes Dev. 21:2137–2149.

Hu, P., K.G. Geles, J.H. Paik, R.A. DePinho, and R. Tjian. 2008. Codependent activators direct myoblast-specific MyoD transcription. Dev. Cell 15:534–546.

Like many tissue-development processes, muscle differentiation begins with the development of progenitor cells into cells with more specialized functions. In mammalian muscle, myoblasts, the precursor cells, differentiate into myotubes, which subsequently form the muscle fibers of skeletal muscle tissue. The transformation of myoblasts to myotubes involves both selective gene silencing and gene activation pathways. Transcriptional regulation in these cells has long been known to require cell type–specific basic helix-loop-helix activator proteins, and many researchers suspected that these activators somehow modify the function of the basal transcription machinery in developing muscle.

To test this idea directly, Robert Tjian and his colleagues examined mouse myoblasts to determine which transcription factors are important for the differentiation process. The researchers used the Western blot method (see Figure 7-25), treating cell extracts with antibodies that recognize specific transcription factors, including the TBP-associated factors (TAFs) and TATA-binding protein (TBP) components of the TFIID general transcription factor complex, as well as TAFs present only in certain cell types. Because TFIID is part of the basal transcription machinery thought to be common to all cells, Tjian and coworkers expected to find it in muscle cells harvested at all stages of differentiation, along with variable levels of muscle-specific TAFs.

What they discovered instead was that an alternative form of general transcription factor complex, containing the proteins TAF3 and TRF3 in place of TFIID, initially coexists with the TFIID-containing core complex in myoblasts. As the cells differentiate into myotubes, however, TFIID decreases to undetectable levels, while TAF3 and TRF3 levels are maintained and eventually become dominant (Figure 2a). When Tjian and colleagues used short interfering RNAs (siRNAs) to reduce the amount of either TAF3 or TRF3 in myoblasts (in experiments not shown here), the expression of the muscle-specific protein MyoD also dropped, and muscle differentiation was compromised. These effects could be reversed by supplying fresh TAF3 and TRF3 to depleted myoblast cells.

FIGURE 2 (a) Gels resulting from Western blot analysis of TFIID components (TAFs and TBP) involved in differentiation of myoblasts to myotubes in mouse muscle tissue. TFIID is represented by its component TBP; the TAF3-TRF3 complex is represented by TAF3. (b) A proposed model for cell differentiation from myoblast to myotube cells. A core transcription initiation complex including TAF3 and TRF3 functionally replaces the canonical TFIID complex in myotube cells, switching on the unique transcription pattern required during cell type–specific terminal differentiation.

These findings implicate TAF3-TRF3 complexes in the transcription of genes encoding proteins central to the muscle cell differentiation pathway (Figure 2b). More importantly, they suggest that previously unexpected changes in the basal transcription machinery are required for the widespread changes in transcription patterns responsible for cellular differentiation in higher eukaryotes.
