31.1 Many DNA-Binding Proteins Recognize Specific DNA Sequences

How do regulatory systems distinguish the genes that need to be activated or repressed from genes that are constitutive? After all, the DNA sequences of genes themselves do not have any distinguishing features that would allow regulatory systems to recognize them. Instead, gene regulation depends on other sequences in the genome. In prokaryotes, these regulatory sites are close to the region of the DNA that is transcribed. Regulatory sites are usually binding sites for specific DNA-binding proteins, which can stimulate or repress gene expression. These regulatory sites were first identified in E. coli in studies of changes in gene expression. In the presence of the sugar lactose, the bacterium starts to express a gene encoding β-galactosidase, an enzyme that can process lactose for use as a carbon and energy source. The sequence of the regulatory site for this gene is shown in Figure 31.1. The nucleotide sequence of this site shows a nearly perfect inverted repeat, indicating that the DNA in this region has an approximate twofold axis of symmetry. Recall that cleavage sites for restriction enzymes such as EcoRV have similar symmetry properties (Section 9.3). Symmetry in such regulatory sites usually corresponds to symmetry in the protein that binds the site. Symmetry matching is a recurring theme in protein–DNA interactions.

Figure 31.1: Sequence of the lac regulatory site. The nucleotide sequence of this regulatory site shows a nearly perfect inverted repeat, corresponding to twofold rotational symmetry in the DNA. Parts of the sequences that are related by this symmetry are shown in the same color.
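The idea of a "nearly perfect inverted repeat" can be made concrete with a short calculation. A sequence has exact twofold symmetry when it reads the same as its own reverse complement, and the degree of approximate symmetry can be estimated by counting the positions at which the two agree. The following Python sketch illustrates this under stated assumptions: the function names are hypothetical, and the 14-base test sequences are invented toy examples, not the lac site of Figure 31.1. GATATC is the EcoRV recognition sequence.

```python
# Minimal sketch: scoring twofold (inverted-repeat) symmetry in a DNA site.
# Function names and the toy 14-base sequences are illustrative assumptions.

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def symmetry_matches(seq):
    """Count positions at which seq agrees with its reverse complement.

    A perfect inverted repeat agrees at every position. In an odd-length
    site the central base can never match, because no base is its own
    complement.
    """
    seq = seq.upper()
    return sum(a == b for a, b in zip(seq, reverse_complement(seq)))


def symmetry_fraction(seq):
    """Fraction of positions consistent with twofold symmetry."""
    return symmetry_matches(seq) / len(seq)


perfect_site = "TGTGAGCGCTCACA"   # toy sequence, exactly symmetric
nearly_perfect = "TGAGAGCGCTCACA"  # same toy sequence with one substitution

print(reverse_complement("GATATC") == "GATATC")  # EcoRV site is palindromic
print(symmetry_fraction(perfect_site))
print(symmetry_matches(nearly_perfect), "of", len(nearly_perfect))
```

Note that a single base substitution removes two matches, because it disturbs both the altered position and its symmetry-related partner.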

To understand these protein–DNA interactions in detail, scientists examined the structure of the complex between an oligonucleotide that includes this site and the DNA-binding unit that recognizes it (Figure 31.2). The DNA-binding unit comes from a protein called the lac repressor, which represses the expression of the lactose-processing gene. As expected, this DNA-binding unit binds as a dimer, and the twofold axis of symmetry of the dimer matches the symmetry of the DNA. An α helix from each monomer of the protein is inserted into the major groove of the DNA, where amino acid side chains make specific contacts with exposed edges of the base pairs. For example, the side chain of an arginine residue of the protein forms a pair of hydrogen bonds with a guanine residue of the DNA, which would not be possible with any other base. This interaction and similar ones allow the lac repressor to bind more tightly to this site than to the wide range of other sites present in the E. coli genome.

Figure 31.2: The lac repressor–DNA complex. The DNA-binding domain from a gene-regulatory protein, the lac repressor, binds to a DNA fragment containing its preferred binding site (referred to as operator DNA) by inserting an α helix into the major groove of operator DNA. Notice that a specific contact forms between an arginine residue of the repressor and a G–C base pair in the binding site.
[Drawn from 1EFA.pdb.]

The helix-turn-helix motif is common to many prokaryotic DNA-binding proteins

Are similar strategies utilized by other prokaryotic DNA-binding proteins? The structures of many such proteins have now been determined, and amino acid sequences are known for many more. Strikingly, the DNA-binding surfaces of many, but not all, of these proteins consist of a pair of α helices separated by a tight turn (Figure 31.3). In complexes with DNA, the second of these two helices (often called the recognition helix) lies in the major groove, where amino acid side chains make contact with the edges of base pairs. In contrast, residues of the first helix participate primarily in contacts with the DNA backbone. Helix-turn-helix motifs are present on many proteins that bind DNA as dimers, and thus two of the units will be present, one on each monomer.

Figure 31.3: Helix-turn-helix motif. These structures show three sequence-specific DNA-binding proteins that interact with DNA through a helix-turn-helix motif (highlighted in yellow). Notice that, in each case, the helix-turn-helix units within a protein dimer are approximately 34 Å apart, corresponding to one full turn of DNA.
[Drawn from 1EFA, 1RUN, and 1TRO.pdb.]
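The correspondence between 34 Å and one full turn of DNA follows from simple arithmetic on the geometry of B-form DNA. The Python sketch below works through it, assuming the commonly quoted idealized values of about 3.4 Å of axial rise per base pair and about 10 base pairs per turn (closer to 10.5 in solution). These values and the function names are assumptions introduced for illustration and are not stated in this section.

```python
# Minimal sketch: relating a spacing along the DNA axis to helical turns.
# The constants are idealized B-DNA values assumed for illustration.

RISE_PER_BP_ANGSTROM = 3.4  # assumed axial rise per base pair
BP_PER_TURN = 10.0          # assumed idealized value; about 10.5 in solution


def helical_pitch(rise=RISE_PER_BP_ANGSTROM, bp_per_turn=BP_PER_TURN):
    """Length of one full helical turn along the DNA axis, in angstroms."""
    return rise * bp_per_turn


def turns_spanned(distance_angstrom):
    """Number of helical turns covered by a distance along the axis."""
    return distance_angstrom / helical_pitch()


def on_same_face(distance_angstrom, tolerance_turns=0.1):
    """True if two points this far apart lie on roughly the same face
    of the helix, that is, a near-integral number of turns apart."""
    turns = turns_spanned(distance_angstrom)
    return abs(turns - round(turns)) <= tolerance_turns


print(helical_pitch())      # pitch of one turn, in angstroms
print(turns_spanned(34.0))  # spacing of the two helix-turn-helix units
print(on_same_face(34.0), on_same_face(17.0))
```

Under these assumptions, two recognition helices about 34 Å apart are positioned to enter the major groove at symmetry-related sites on the same face of the double helix, whereas units half a turn apart would face opposite sides.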

Although the helix-turn-helix motif is the most commonly observed DNA-binding unit in prokaryotes, not all regulatory proteins bind DNA through such units. A striking example is provided by the E. coli methionine repressor (Figure 31.4). This protein binds DNA through the insertion of a pair of β strands into the major groove.

Figure 31.4: DNA recognition through β strands. A methionine repressor is shown bound to DNA. Notice that residues in β strands, rather than in α helices, participate in the crucial interactions between the protein and the DNA.
[Drawn from 1CMA.pdb.]