4.3 Tertiary Structure: Water-Soluble Proteins Fold into Compact Structures
As already discussed, primary structure is the sequence of amino acids, and secondary structure consists of the simple repeating structures formed by hydrogen bonds between hydrogen and oxygen atoms of the peptide backbone. Another level of structure, tertiary structure, refers to the spatial arrangement of amino acid residues that are far apart in the sequence and to the pattern of disulfide bonds. This level of structure is the result of interactions between the R groups of the peptide chain. To explore the principles of tertiary structure, we will examine myoglobin, the first protein to be seen in atomic detail.
Myoglobin Illustrates the Principles of Tertiary Structure
Myoglobin is an example of a globular protein (Figure 4.25). In contrast with fibrous proteins such as keratin, globular proteins have a compact three-dimensional structure and are water soluble. Globular proteins, with their more intricate three-dimensional structure, perform most of the chemical transactions in the cell.
Figure 4.25: The three-dimensional structure of myoglobin. (A) A ribbon diagram shows that the protein consists largely of α helices. (B) A space-filling model in the same orientation shows how tightly packed the folded protein is. Notice that the heme group is nestled into a crevice in the compact protein with only an edge exposed. One helix is blue to allow comparison of the two structural depictions.
Myoglobin, a single polypeptide chain of 153 amino acids, is an oxygen-binding protein found predominantly in heart and skeletal muscle; it appears to facilitate the diffusion of oxygen from the blood to the mitochondria, the primary site of oxygen utilization in the cell. The capacity of myoglobin to bind oxygen depends on the presence of heme, a prosthetic (helper) group containing an iron atom. Myoglobin is an extremely compact molecule. Its overall dimensions are 45 × 35 × 25 Å, an order of magnitude less than if it were fully stretched out. About 70% of the main chain is folded into eight α helices, and much of the rest of the chain forms turns and loops between helices.
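The "order of magnitude" comparison can be checked with a quick back-of-the-envelope calculation. The sketch below assumes a rise of about 3.5 Å per residue for a fully extended chain, a typical textbook value that is not stated in this section; the constant and function names are illustrative only.

```python
# Rough check of how compact folded myoglobin is.
# ASSUMPTION: ~3.5 Å per residue for a fully extended chain
# (a typical value; not given in this section).
EXTENDED_RISE_A = 3.5

N_RESIDUES = 153               # chain length, from the text
FOLDED_DIMS_A = (45, 35, 25)   # folded dimensions in Å, from the text
HELICAL_FRACTION = 0.70        # "about 70%" of the main chain, from the text


def extended_length(n_residues, rise=EXTENDED_RISE_A):
    """Approximate end-to-end length (Å) of a fully stretched chain."""
    return n_residues * rise


length = extended_length(N_RESIDUES)
ratio = length / max(FOLDED_DIMS_A)
helical_residues = round(HELICAL_FRACTION * N_RESIDUES)

print(f"Extended length: about {length:.0f} Å")
print(f"Ratio to longest folded dimension: about {ratio:.0f}x")
print(f"Residues in alpha helices: about {helical_residues}")
```

Under this assumption the stretched chain would be several hundred ångströms long, roughly ten times the longest dimension of the folded molecule, which is consistent with the statement above.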
Myoglobin, like most other proteins, is asymmetric because of the complex folding of its main chain. A unifying principle emerges from the distribution of side chains. The striking fact is that the interior consists almost entirely of nonpolar residues (Figure 4.26). The only polar residues on the interior are two histidine residues, which play critical roles in binding the heme iron and oxygen. The outside of myoglobin, on the other hand, consists of both nonpolar and polar residues; the polar residues can interact with water and thus render the molecule water soluble. The space-filling model shows that there is very little empty space inside.
Figure 4.26: The distribution of amino acids in myoglobin. (A) A space-filling model of myoglobin, with hydrophobic amino acids shown in yellow, charged amino acids shown in blue, and others shown in white. Notice that the surface of the molecule has many charged amino acids, as well as some hydrophobic amino acids. (B) In this cross-sectional view, notice that mostly hydrophobic amino acids are found on the inside of the structure, whereas the charged amino acids are found on the protein surface.
This contrasting distribution of polar and nonpolar residues reveals a key facet of protein architecture. In an aqueous environment such as the interior of a cell, protein folding is driven by the hydrophobic effect, the strong tendency of hydrophobic residues to avoid contact with water. The polypeptide chain therefore folds so that its hydrophobic side chains are buried and its polar and charged side chains are on the surface. Similarly, an unpaired peptide NH or CO group of the main chain markedly prefers water to a nonpolar milieu. The only way to bury a segment of main chain in a hydrophobic environment is to pair all the NH and CO groups by hydrogen bonding. This pairing is neatly accomplished in an α helix or β sheet. Van der Waals interactions between tightly packed hydrocarbon side chains also contribute to the stability of proteins. We can now understand why the set of 20 amino acids contains several that differ subtly in size and shape. They provide a palette of shapes that can fit together tightly to fill the interior of a protein neatly and thereby maximize van der Waals interactions, which require intimate contact.
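The tendency described above can be illustrated numerically with a hydropathy scale. The sketch below uses the Kyte-Doolittle scale, which is not part of this text but is one commonly used choice, to score short peptides. The two peptides are invented for illustration and are not taken from myoglobin, and the function names are hypothetical. A positive average only suggests that a segment would tend to be buried; it does not determine where the segment sits in a real structure.

```python
# Kyte-Doolittle hydropathy values (positive = more hydrophobic).
# ASSUMPTION: this scale is not from the text; it is one common choice.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def mean_hydropathy(sequence):
    """Average hydropathy of a one-letter amino acid sequence."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    return sum(scores) / len(scores)


def likely_location(sequence):
    """Crude guess: a positive average suggests a buried segment."""
    return "interior" if mean_hydropathy(sequence) > 0 else "surface"


# Hypothetical peptides, invented for illustration only.
core_like = "LIVFAVLL"      # rich in nonpolar side chains
surface_like = "KDEERKSD"   # rich in charged and polar side chains

for peptide in (core_like, surface_like):
    print(peptide, round(mean_hydropathy(peptide), 2), likely_location(peptide))
```

A simple average is used here for brevity; analyses of real sequences usually slide a window along the chain and compare the resulting profile with a known structure.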
Some proteins that span biological membranes are “the exceptions that prove the rule” because they have the reverse distribution of hydrophobic and hydrophilic amino acids. For example, consider porins, proteins found in the outer membranes of many bacteria. Membranes are built largely of the hydrophobic hydrocarbon chains of lipids (Chapter 12). Thus, porins are covered on the outside largely by hydrophobic residues that interact with the hydrophobic environment. In contrast, the center of the protein contains many charged and polar amino acids that surround a water-filled channel running through the middle of the protein. Thus, because porins function in hydrophobic environments, they are “inside out” relative to proteins that function in aqueous solution.
The Tertiary Structure of Many Proteins Can Be Divided into Structural and Functional Units
Figure 4.27: The helix-turn-helix motif, a supersecondary structural element. Helix-turn-helix motifs are found in many DNA-binding proteins.
Certain combinations of secondary structure are present in many proteins and frequently exhibit similar functions. These combinations are called motifs or supersecondary structures. For example, an α helix separated from another α helix by a turn, called a helix-turn-helix unit, is found in many proteins that bind DNA (Figure 4.27).
Some polypeptide chains fold into two or more compact regions that may be connected by a flexible segment of polypeptide chain, rather like pearls on a string. These compact globular units, called domains, range in size from about 30 to 400 amino acid residues. For example, the extracellular part of CD4, a cell-surface protein on certain cells of the immune system, comprises four similar domains of approximately 100 amino acids each (Figure 4.28). Different proteins may have domains in common even if their overall tertiary structures are different.
Figure 4.28: Protein domains. The cell-surface protein CD4 consists of four similar domains.