A major theme of molecular genetics is the central dogma, which states that genetic information flows from DNA to RNA to proteins (see Figure 10.16). Although the central dogma provided a molecular basis for the connection between genotype and phenotype, it failed to address a critical question: how is the flow of information along the molecular pathway regulated?
Consider E. coli, a bacterium that resides in your large intestine. Your eating habits completely determine the nutrients available to this bacterium: it can neither seek out nourishment when nutrients are scarce nor move away when confronted with an unfavorable environment. E. coli makes up for its inability to alter the external environment by being internally flexible. For example, if glucose is present, E. coli uses it to generate ATP; if there’s no glucose, it utilizes lactose, arabinose, maltose, xylose, or any of a number of other sugars. When amino acids are available, E. coli uses them to synthesize proteins; if a particular amino acid is absent, E. coli produces the enzymes needed to synthesize that amino acid. Thus, E. coli responds to environmental changes by rapidly altering its biochemistry. This biochemical flexibility, however, has a high price. Producing all the enzymes necessary for every environmental condition would be energetically expensive. So how does E. coli maintain biochemical flexibility while optimizing energy efficiency?
The answer is through gene regulation. Bacteria carry the genetic information for synthesizing many proteins, but only a subset of this genetic information is expressed at any time. When the environment changes, new genes are expressed, and proteins appropriate for the new environment are synthesized. For example, if a carbon source appears in the environment, genes encoding enzymes that take up and metabolize this carbon source are quickly transcribed and translated. When this carbon source disappears, the genes that encode these enzymes are shut off.
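The on-and-off switching described above can be sketched as a small toy model. This is only an illustration under simplifying assumptions: the gene-set names (such as `lactose_utilization`) are hypothetical labels rather than real E. coli loci, and expression is treated as all-or-nothing, with glucose taking priority over the other sugars as the passage describes.

```python
# Toy model of environment-dependent gene expression.
# Assumption: gene-set names are hypothetical placeholders, not real loci.
ALTERNATIVE_SUGARS = ["lactose", "arabinose", "maltose", "xylose"]


def expressed_gene_sets(available_nutrients):
    """Return the set of sugar-utilization gene sets that are switched on."""
    nutrients = set(available_nutrients)
    if "glucose" in nutrients:
        # Glucose is used when present, so the alternative-sugar genes stay off.
        return {"glucose_utilization"}
    # No glucose: express genes only for the sugars that are actually present.
    return {
        f"{sugar}_utilization"
        for sugar in ALTERNATIVE_SUGARS
        if sugar in nutrients
    }


if __name__ == "__main__":
    for environment in (["glucose", "lactose"], ["lactose", "xylose"], []):
        print(environment, "->", sorted(expressed_gene_sets(environment)))
```

Only the genes that match the current environment are expressed, which is how the cell avoids the energetic cost of producing every enzyme at once.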
Multicellular eukaryotic organisms face a different dilemma. Individual cells in a multicellular organism are specialized for particular tasks. The proteins produced by a nerve cell, for example, are quite different from those produced by a white blood cell. Although they differ in shape and function, a nerve cell and a blood cell still carry the same genetic instructions.
A multicellular organism’s challenge is to bring about, through the process of development, the specialization of cells that share a common set of genetic instructions. This challenge is met through gene regulation: all of an organism’s cells carry the same genetic information, but only a subset of genes is expressed in each cell type. Genes needed for other cell types are not expressed. Gene regulation is therefore the key to both unicellular flexibility and multicellular specialization, and it is critical to the success of all living organisms.
In bacteria, gene regulation maintains internal flexibility, turning genes on and off in response to environmental changes. In multicellular eukaryotic organisms, gene regulation brings about cellular differentiation.
The mechanisms of gene regulation were first investigated in bacterial cells, in which the availability of mutants and the ease of laboratory manipulation made it possible to unravel the mechanisms. When the study of these mechanisms in eukaryotic cells began, bacterial gene regulation seemed to clearly differ from eukaryotic gene regulation. However, as more and more information has accumulated about gene regulation, a number of common themes have emerged. Today, many aspects of gene regulation in bacterial and eukaryotic cells are recognized to be similar. Before examining specific elements of bacterial gene regulation (this chapter) and eukaryotic gene regulation (Chapter 17), we will briefly consider some themes of gene regulation common to all organisms.
In considering gene regulation in both bacteria and eukaryotes, we must distinguish between the DNA sequences that are transcribed and the DNA sequences that regulate the expression of other sequences. Structural genes encode proteins that are used in metabolism or biosynthesis or that play a structural role in the cell. Regulatory genes are genes whose products, either RNA or proteins, interact with other DNA sequences and affect the transcription or translation of those sequences. In many cases, the products of regulatory genes are DNA-binding proteins (although RNA molecules also affect gene expression). Bacteria and eukaryotes use regulatory genes to control the expression of many of their structural genes. However, a few structural genes, particularly those that encode essential cellular functions, are expressed continually and are said to be constitutive. Constitutive genes are therefore not regulated.
We will also encounter DNA sequences that are not transcribed at all but still play a role in regulating genes and other DNA sequences. These regulatory elements affect the expression of sequences to which they are physically linked. Regulatory elements are common in both bacterial and eukaryotic cells, and much of gene regulation in both types of organisms takes place through the action of proteins produced by regulatory genes that recognize and bind to regulatory elements.
Gene expression can be regulated through processes that stimulate it, termed positive control, or through processes that inhibit it, termed negative control. Bacteria and eukaryotes use both positive and negative control mechanisms to regulate their genes. However, negative control is more important in bacteria, whereas eukaryotes are more likely to use positive control mechanisms.
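The difference between the two modes of control can be expressed as simple logic. The sketch below is a deliberately reduced illustration: it assumes a single regulatory protein per gene and all-or-nothing expression, and the function and parameter names are hypothetical.

```python
def is_transcribed(control, regulator_bound):
    """Return True if the gene is expressed in this toy model.

    control: "positive" or "negative".
    regulator_bound: whether the regulatory protein is bound to the DNA.
    """
    if control == "positive":
        # Positive control: the bound regulator stimulates expression.
        return regulator_bound
    if control == "negative":
        # Negative control: the bound regulator inhibits expression.
        return not regulator_bound
    raise ValueError("control must be 'positive' or 'negative'")


if __name__ == "__main__":
    for control in ("positive", "negative"):
        for bound in (True, False):
            print(control, "regulator bound:", bound,
                  "-> expressed:", is_transcribed(control, bound))
```

Under positive control the gene is off by default and the regulator turns it on; under negative control the gene is on by default and the regulator turns it off.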
Regulatory elements are DNA sequences that are not transcribed but affect the expression of genes. Positive control mechanisms stimulate gene expression, whereas negative control inhibits gene expression.
CONCEPT CHECK 1
What is a constitutive gene?
In both bacteria and eukaryotes, genes can be regulated at a number of levels along the pathway of information flow from genotype to phenotype (Figure 16.1). First, regulation can take place through the alteration of DNA or chromatin structure; this type of gene regulation occurs primarily in eukaryotes. Modifications to DNA or its packaging can help to determine which sequences are available for transcription or the rate at which sequences are transcribed. DNA methylation and changes in chromatin structure are two processes that play a pivotal role in gene regulation.
A second point at which a gene can be regulated is at the level of transcription. For the sake of cellular economy, limiting the production of a protein early in the process makes sense, and transcription is an important point of gene regulation in both bacterial and eukaryotic cells. A third potential point of gene regulation is mRNA processing. Eukaryotic mRNA is extensively modified before it is translated: a 5′ cap is added, the 3′ end is cleaved and polyadenylated, and introns are removed (see Chapter 14). These modifications determine the stability of the mRNA, the movement of the mRNA into the cytoplasm, whether the mRNA can be translated, the rate of translation, and the amino acid sequence of the protein produced. There is growing evidence that a number of regulatory mechanisms in eukaryotic cells operate at the level of mRNA processing.
A fourth point for the control of gene expression is the regulation of RNA stability. The amount of protein produced depends not only on the amount of mRNA synthesized, but also on the rate at which the mRNA is degraded. A fifth point of gene regulation is at the level of translation, a complex process requiring a large number of enzymes, protein factors, and RNA molecules (see Chapter 15). All of these factors, as well as the availability of amino acids, affect the rate at which proteins are produced and therefore provide points at which gene expression can be controlled. Translation can also be affected by sequences in mRNA.
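The dependence of mRNA abundance on both synthesis and degradation can be made concrete with a simple model. The sketch below assumes constant synthesis and first-order degradation, so the mRNA level m changes at a rate of synthesis_rate - degradation_rate * m and settles at synthesis_rate / degradation_rate. The model, the parameter names, and the numbers are illustrative assumptions, not values from the text.

```python
def steady_state_mrna(synthesis_rate, degradation_rate):
    """Steady-state mRNA level for constant synthesis and first-order decay."""
    return synthesis_rate / degradation_rate


def simulate_mrna(synthesis_rate, degradation_rate, t_end=200.0, dt=0.01):
    """Integrate dm/dt = synthesis_rate - degradation_rate * m from m = 0 (Euler steps)."""
    m = 0.0
    for _ in range(int(t_end / dt)):
        m += (synthesis_rate - degradation_rate * m) * dt
    return m


if __name__ == "__main__":
    # Same synthesis rate, two different mRNA stabilities (arbitrary units).
    for degradation_rate in (0.5, 0.1):
        print("degradation rate", degradation_rate,
              "steady state", steady_state_mrna(10.0, degradation_rate),
              "simulated", round(simulate_mrna(10.0, degradation_rate), 3))
```

With the synthesis rate held fixed, a more stable mRNA (lower degradation rate) accumulates to a higher level, so a cell can change how much protein is made without changing transcription at all.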
Finally, many proteins are modified after translation (see Chapter 15), and these modifications affect whether the proteins become active; genes can be regulated through processes that affect posttranslational modification. Gene expression can be affected by regulatory activities at any or all of these points.
Gene expression can be controlled at any of a number of levels along the molecular pathway from DNA to protein, including DNA or chromatin structure, transcription, mRNA processing, RNA stability, translation, and posttranslational modification.
CONCEPT CHECK 2
Why is transcription a particularly important level of gene regulation in both bacteria and eukaryotes?
Much of gene regulation in bacteria and eukaryotes is accomplished by proteins that bind to DNA sequences and affect their expression. These regulatory proteins generally have discrete functional parts, called domains, that are responsible for binding to DNA; a domain typically consists of 60 to 90 amino acids. Within a domain, only a few amino acids actually make contact with the DNA. These amino acids (most commonly asparagine, glutamine, glycine, lysine, and arginine) often form hydrogen bonds with the bases or interact with the sugar-phosphate backbone of the DNA. Many regulatory proteins have additional domains that can bind other molecules, such as other regulatory proteins. By physically attaching to DNA, these proteins can affect the expression of a gene. Most DNA-binding proteins bind dynamically: they transiently bind to and release DNA and other regulatory proteins. Thus, although they may spend most of their time bound to DNA, they are never permanently attached. This dynamic nature means that other molecules can compete with DNA-binding proteins for regulatory sites on the DNA.
DNA-binding proteins can be grouped into several distinct types on the basis of a characteristic structure, called a motif, found within the binding domain. Motifs are simple structures, such as alpha helices, that can fit into the major groove of the DNA. For example, the helix-turn-helix motif (Figure 16.2a), consisting of two alpha helices connected by a turn, is common in bacterial regulatory proteins. The zinc-finger motif (Figure 16.2b), common to many eukaryotic regulatory proteins, consists of a loop of amino acids containing a zinc ion. The leucine zipper (Figure 16.2c) is another motif found in a variety of eukaryotic binding proteins. These common DNA-binding motifs and others are summarized in Table 16.1.
Motif | Location | Characteristics | Binding Site in DNA |
---|---|---|---|
Helix-turn-helix | Bacterial regulatory proteins; related motifs in eukaryotic proteins | Two alpha helices | Major groove |
Zinc-finger | Eukaryotic regulatory and other proteins | Loop of amino acids with zinc at base | Major groove |
Steroid receptor | Eukaryotic proteins | Two perpendicular alpha helices with zinc surrounded by four cysteine residues | Major groove and DNA backbone |
Leucine-zipper | Eukaryotic transcription factors | Helix of leucine residues and a basic arm; two leucine residues interdigitate | Two adjacent major grooves |
Helix-loop-helix | Eukaryotic proteins | Two alpha helices separated by a loop of amino acids | Major groove |
Homeodomain | Eukaryotic regulatory proteins | Three alpha helices | Major groove |
Regulatory proteins that bind DNA have common motifs that interact with sequences in the DNA.
CONCEPT CHECK 3
How do amino acids in DNA-binding proteins interact with DNA?