Regulatory Elements in Eukaryotic DNA Are Found Both Close to and Many Kilobases Away from Transcription Start Sites

Direct measurements of the transcription rates of multiple genes in different cell types have shown that regulation of transcription, either at the initiation step or during elongation in the promoter-proximal region, is the most widespread form of gene control in eukaryotes, as it is in bacteria. In eukaryotes, as in bacteria, a DNA sequence that specifies where RNA polymerase binds and initiates transcription of a gene is called a promoter. Transcription from a particular promoter is controlled by DNA-binding proteins that are functionally equivalent to bacterial repressors and activators. However, eukaryotic transcriptional regulatory proteins can often function either to activate or to repress transcription, depending on their associations with other proteins. Consequently, they are more generally called transcription factors.

The DNA control elements in eukaryotic genomes to which transcription factors bind are often located much farther from the promoter they regulate than is the case in bacterial genomes. In some cases, transcription factors bind at regulatory sites tens of thousands of base pairs either upstream (opposite to the direction of transcription) or downstream (in the same direction as transcription) from the promoter. As a result of this arrangement, transcription of a single gene may be regulated by the binding of multiple different transcription factors to alternative control elements, which direct expression of the same gene in different types of cells and at different times during development.

For example, several separate transcription-control regions regulate expression of the mammalian gene encoding the transcription factor Pax6. As mentioned in Chapter 1, Pax6 protein is required for development of the eye. Pax6 is also required for the development of certain regions of the brain and spinal cord, and of the cells in the pancreas that secrete hormones such as insulin. As also mentioned in Chapter 1, heterozygous humans with only one functional Pax6 gene are born with aniridia, a lack of irises in the eyes (see Figure 1-30d). In mammals, the Pax6 gene is expressed from at least three alternative promoters that function in different cell types and at different times during embryogenesis (Figure 9-9a).


FIGURE 9-9 Transcription-control regions of the mouse Pax6 gene and the orthologous human PAX6 gene. (a) Three alternative Pax6 promoters are used at distinct times during embryogenesis in different tissues of the developing mouse embryo. Transcription-control regions regulating expression of Pax6 in different tissues are indicated by colored rectangles. These control regions are some 200–500 bp in length. (b) Expression of a β-galactosidase reporter transgene fused to the 8 kb of mouse DNA upstream from exon 0. A transgenic mouse embryo 10.5 days after fertilization was stained with X-gal to reveal β-galactosidase. Lens pit (LP) is the tissue that will develop into the lens of the eye. Expression was also observed in tissue that will develop into the pancreas (P). (c) Expression in a mouse embryo at 13.5 days after fertilization of a β-galactosidase reporter gene linked to the sequence in part (a) between exons 4 and 5 marked Retina. Arrow points to nasal and temporal regions of the developing retina. (d) Human PAX6 control regions identified in the 600-kb region of human DNA between the upstream gene RCN1 and the promoter of the downstream ELP4 gene. RCN1 and ELP4 are transcribed in the opposite direction from PAX6, as represented by the leftward-pointing arrows associated with their first exons. RCN1 and ELP4 exons are shown as black rectangles below the line representing this region of human DNA. PAX6 exons are diagrammed as red rectangles above the line. The three PAX6 promoters first characterized in the mouse are shown by rightward arrowheads, and the control regions shown in (a) are represented by gray rectangles. Regions flanking the gene where the sequence is partially conserved in most vertebrates (as in Figure 9-10a) are shown as ovals. Colored ovals represent sequences that cause expression of a reporter transgene in specific neuroanatomical locations in the zebrafish central nervous system. Ovals with the same color stimulated expression in the same region. Gray ovals represent conserved sequences that did not stimulate reporter-gene expression in the developing zebrafish embryo, or were not tested. Such conserved regions may function only in combination, or they may have been conserved for some reason other than regulation of transcription, such as proper folding of the chromosome into topological domains (see Figure 8-34).
[Part (a) data from B. Kammandel et al., 1999, Devel. Biol. 205:79. Part (b) republished with permission of Elsevier, B. Kammandel et al., “Distinct cis-essential modules direct the time-space pattern of the Pax6 gene activity,” Developmental Biology, 1999, 205(1): 79–97; permission conveyed through Copyright Clearance Center, Inc. Part (c) courtesy of Peter Gruss and Birgitta Kammandel. Part (d) data from S. Bhatia et al., 2014, Devel. Biol. 387:214.]

Researchers often analyze transcription-control regions by preparing recombinant DNA molecules that combine a fragment of DNA to be tested with the coding region for a reporter gene whose expression is easily assayed. Typical reporter genes include the gene that encodes luciferase, an enzyme that generates light that can be assayed with great sensitivity and over many orders of magnitude of intensity using a luminometer. Other frequently used reporter genes encode green fluorescent protein (GFP), which can be visualized by fluorescence microscopy (see Figures 4-9d and 4-16), and E. coli β-galactosidase, which generates an intensely blue insoluble precipitate when incubated with the colorless soluble lactose analog X-gal. When transgenic mice (see Figure 6-40) containing a β-galactosidase reporter gene fused to 8 kb of DNA upstream from Pax6 exon 0 were produced, β-galactosidase was observed in the developing lens, cornea, and pancreas of the embryo halfway through gestation (Figure 9-9b). Analysis of transgenic mice with smaller fragments of DNA from this region allowed the mapping of the separate transcription-control regions regulating transcription in the pancreas, and in both the lens and cornea. Transgenic mice with other reporter gene constructs revealed additional transcription-control regions (see Figure 9-9a). These regions control transcription in the developing retina and in different regions of the developing brain (encephalon). Some of these transcription-control regions are in introns between exons 4 and 5 and between exons 7 and 8. For example, a reporter gene under control of the region labeled Retina in Figure 9-9a between exons 4 and 5 led to reporter-gene expression specifically in the retina (Figure 9-9c).


Control regions for many genes are found hundreds of kilobases away from the coding exons of the gene. One method for identifying such distant control regions is to compare the sequences of distantly related organisms. Transcription-control regions for a conserved gene are also often conserved and can be recognized in the background of nonfunctional sequences that diverge during evolution.
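The sequence comparison described above can be illustrated with a toy calculation. The following is a minimal sketch, assuming the two sequences have already been aligned (with `-` marking gaps); it scores percent identity in sliding windows and reports stretches that stay above a cutoff. The function names, the window size, the 80% threshold, and the example sequences are all hypothetical choices made for illustration. Real comparative-genomics tools work on genome-scale multi-species alignments with more sophisticated scoring.

```python
def window_identity(aligned_a, aligned_b, window=10):
    """Percent identity in each sliding window of a gapped pairwise alignment."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    # 1 where the two aligned bases are identical and not a gap, else 0
    matches = [int(a == b and a != "-") for a, b in zip(aligned_a, aligned_b)]
    scores = []
    for start in range(len(matches) - window + 1):
        scores.append(100.0 * sum(matches[start:start + window]) / window)
    return scores


def conserved_blocks(scores, window, threshold=80.0):
    """Merge overlapping high-scoring windows into (start, end) alignment intervals."""
    blocks = []
    for start, score in enumerate(scores):
        if score < threshold:
            continue
        end = start + window  # end is exclusive
        if blocks and start <= blocks[-1][1]:
            # Window overlaps or touches the previous block: extend it
            blocks[-1] = (blocks[-1][0], end)
        else:
            blocks.append((start, end))
    return blocks


# Hypothetical toy alignment: a shared 14-bp core flanked by diverged sequence.
human = "AAAAAAAAAA" + "GATTACAGATTACA" + "TTTTTTTTTT"
fish = "CCCCCCCCCC" + "GATTACAGATTACA" + "GGGGGGGGGG"

scores = window_identity(human, fish, window=10)
blocks = conserved_blocks(scores, window=10, threshold=80.0)
print(blocks)
```

Because each window tolerates a few mismatches, the reported interval extends slightly beyond the identical core. A stricter threshold or a smaller window tightens the boundaries at the cost of missing partially conserved elements.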

For example, a human DNA sequence about 500 kb downstream of the SALL1 gene is highly conserved among humans, mice, chickens, frogs, and fish (Figure 9-10a). SALL1 encodes a transcription factor required for normal development of the limbs. When transgenic mice were produced containing this conserved DNA sequence linked to a β-galactosidase reporter gene (Figure 9-10b), the transgenic embryos expressed a very high level of β-galactosidase in the developing limb buds (Figure 9-10c). Human patients with deletions in this region of the genome develop limb abnormalities. These results indicate that this conserved region directs transcription of the SALL1 gene in the developing limb. Presumably, other transcription-control regions control expression of this gene in other types of cells, where it functions in the normal development of the ears, the lower intestine, and kidneys.

FIGURE 9-10 The human SALL1 enhancer activates expression of a reporter gene in limb buds of the developing mouse embryo. (a) Graphic representation of the conservation of DNA sequence in a region of the human genome (in the interval of chromosome 16 from 50214 kb to 50220.5 kb) about 500 kb downstream from the SALL1 gene, which encodes a zinc-finger transcription repressor. A region of roughly 500 bp of nonprotein-coding sequence is conserved from zebrafish to human. Nine hundred base pairs of human DNA including this conserved region were inserted into a plasmid next to the coding region for E. coli β-galactosidase. (b) The plasmid was microinjected into a pronucleus of a fertilized mouse egg, which was then implanted in the uterus of a pseudopregnant mouse to generate a transgenic mouse embryo with the reporter-gene-containing plasmid incorporated into its genome (see Figure 5-43). (c) After 11.5 days of development, at the time when limb buds develop, the fixed and permeabilized embryo was incubated in X-gal, which is converted by β-galactosidase into an insoluble, intensely blue compound. The results showed that the conserved region contains an enhancer that stimulates strong transcription of the β-galactosidase reporter gene specifically in limb buds.
[Part (a) data from A. Visel et al., 2007. VISTA Enhancer Browser—a database of tissue-specific human enhancers. Nucleic Acids Res. 35:D88–92. Part (b) ©Deco/Alamy. Part (c) republished with permission of Nature, from Pennacchio, L.A., et al., “In vivo enhancer analysis of human conserved non-coding sequences”, Nature, 444, 499–506, 2006; permission conveyed through Copyright Clearance Center, Inc.]

Because the sequences and functions of transcription-control regions are often conserved through evolution, the transcription factors that bind to these transcription-control regions to regulate gene expression in specific cell types are presumably conserved during evolution as well. This has made it possible to assay control regions in human DNA by reporter-gene expression in transgenic zebrafish, a procedure that is far simpler, faster, and less expensive than preparing transgenic mice (Figure 9-9d). After discussing the proteins that function with RNA polymerase to carry out transcription in eukaryotic cells and eukaryotic promoters, we will return to a discussion of how such distant transcription-control regions, called enhancers, are thought to function.
