Domains Are Modules of Tertiary Structure

Distinct regions of protein structure are often referred to as domains. There are three main classes of protein domains: functional, structural, and topological. A functional domain is a region of a protein that exhibits a particular activity characteristic of that protein, usually even when isolated from the rest of the protein. For instance, a particular region of a protein may be responsible for its catalytic activity (e.g., a kinase domain that covalently adds a phosphate group to another molecule) or its binding ability (e.g., a DNA-binding domain or a membrane-binding domain). Functional domains are often identified experimentally by whittling down a protein to its smallest active fragment with the aid of proteases, enzymes that cleave one or more peptide bonds in a target polypeptide. Alternatively, the DNA encoding a protein can be modified so that when the modified DNA is used to generate a protein, only a particular region, or domain, of the full-length protein is made. Thus it is possible to determine if specific parts of a protein are responsible for particular activities exhibited by the protein. Indeed, functional domains are often also associated with corresponding structural domains.
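
The whittling-down logic described above can be sketched in a few lines of Python. This is only an illustrative sketch: the `Fragment` record, the `smallest_active_fragment` helper, and the residue numbers in `assay` are all hypothetical and are not data from any real experiment. The sketch assumes that each truncated fragment has been tested once and scored simply as active or inactive.

```python
from typing import NamedTuple, Optional, Sequence


class Fragment(NamedTuple):
    """One truncated form of the protein and its assay result (hypothetical)."""
    start: int    # first residue retained, 1-based, inclusive
    end: int      # last residue retained, inclusive
    active: bool  # True if the isolated fragment retained the activity


def smallest_active_fragment(fragments: Sequence[Fragment]) -> Optional[Fragment]:
    """Return the shortest fragment that retained activity, or None if none did."""
    active = [f for f in fragments if f.active]
    if not active:
        return None
    # Fragment length in residues is end - start + 1.
    return min(active, key=lambda f: f.end - f.start + 1)


# Invented results for an imaginary 500-residue protein.
assay = [
    Fragment(1, 500, True),     # full-length protein
    Fragment(1, 250, False),    # N-terminal half alone is inactive
    Fragment(200, 500, True),
    Fragment(220, 420, True),
    Fragment(260, 420, False),  # trimming further abolishes activity
]

best = smallest_active_fragment(assay)
print(best)
```

In practice, the boundaries of a functional domain are refined over many rounds of truncation, and activity is rarely a simple yes-or-no measurement, so a real analysis would be considerably more involved than this sketch.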

A structural domain is a region of about 40 or more amino acids, arranged in a single, stable, and distinct structure that often comprises one or more secondary structures. Many structural domains can fold into their characteristic structures independently of the rest of the protein in which they are embedded. As a consequence, distinct structural domains can be linked together, sometimes by short or long spacers, to form a large multidomain protein. Each of the polypeptide chains in the trimeric flu virus hemagglutinin, for example, contains a globular domain and a fibrous domain (Figure 3-11a). Structural domains can be incorporated as modules into different proteins. The modular approach to protein architecture is particularly easy to recognize in large proteins, which tend to be mosaics of different domains that confer distinct activities and thus can perform different functions simultaneously. As many as 75 percent of the proteins in eukaryotes have multiple structural domains. Structural domains are frequently also functional domains, in that they can have an activity independent of the rest of the protein.

FIGURE 3-11 Tertiary and quaternary levels of structure. The protein pictured here, hemagglutinin (HA), is found on the surface of the influenza virus. This long multimeric molecule has three identical subunits, each composed of two polypeptide chains, HA1 and HA2. (a) The tertiary structure of each HA subunit comprises the folding of its helices and strands into a compact structure that is 13.5 nm long and divided into two domains. The membrane-distal domain (silver) is folded into a globular conformation. The membrane-proximal domain (gold) has a fibrous, stemlike conformation owing to the alignment of two long α helices (cylinders) of HA2 with β strands in HA1. Short turns and longer loops, many of them at the surface of the molecule, connect the helices and strands in each chain. (b) The quaternary structure of HA is stabilized by lateral interactions between the long helices (cylinders) in the fibrous domains of the three subunits (gold, blue, and green), forming a triple-stranded coiled-coil stalk. Each of the distal globular domains in HA binds sialic acid (red) on the surface of target cells. Like many membrane proteins, HA contains several covalently linked carbohydrate chains (not shown).
[Data from S. J. Gamblin et al., 2004, Science 303:1838–1842, PDB ID 1ruz.]

The epidermal growth factor (EGF) domain is a structural domain that is present in several proteins (Figure 3-12). EGF is a small, soluble peptide hormone that binds to cells in the embryo and in skin and connective tissue in adults, causing them to divide. It is generated by proteolytic cleavage (breaking of a peptide bond) between repeated EGF domains in the EGF precursor protein, which is anchored in the plasma membrane by a membrane-spanning domain. EGF domains with sequences similar to, but not identical to, that of the EGF peptide hormone are present in other proteins and can be liberated by proteolysis. These proteins include tissue plasminogen activator (TPA), a protease that is used to dissolve blood clots in heart attack victims; Neu protein, which takes part in embryonic differentiation; and Notch protein, a receptor protein in the plasma membrane that functions in developmentally important signaling (see Chapter 16). Besides the EGF domain, these proteins have other domains in common with other proteins. For example, TPA possesses a trypsin domain, a functional domain found in some proteases. It is estimated that there are about a thousand different types of structural domains in all proteins. Some of these are not very common, whereas others are found in many different proteins. Indeed, by some estimates, only nine major types of structural domains account for as much as a third of all the structural domains in all proteins. Structural domains can be recognized in proteins whose structures have been determined by x-ray crystallography or nuclear magnetic resonance (NMR) analysis or in images captured by electron microscopy.

FIGURE 3-12 Modular nature of protein domains. Epidermal growth factor (EGF) is generated by proteolytic cleavage of a precursor protein containing multiple EGF domains (green) and a membrane-spanning domain (blue). An EGF domain is also present in the Neu protein and in tissue plasminogen activator (TPA). These proteins also contain other widely distributed domains, indicated by shape and color. See I. D. Campbell and P. Bork, 1993, Curr. Opin. Struc. Biol. 3:385.

Regions of proteins that are defined by their distinctive spatial relationships to the rest of the protein are topological domains. For example, some proteins associated with cell-surface membranes have a part extending inward into the cytoplasm (cytoplasmic domain), a part embedded within the phospholipid bilayer (membrane-spanning domain), and a part extending outward into the extracellular space (extracellular domain). Each of these parts can comprise one or more structural and functional domains.

In Chapter 8, we will consider the mechanism by which the gene segments that correspond to domains became shuffled in the course of evolution, resulting in their appearance in many proteins. Once a functional, structural, or topological domain has been identified and characterized in one protein, it is possible to use that information to search for similar domains in other proteins and to suggest potentially similar functions for those domains in those proteins.
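
As a rough illustration of how a known domain sequence might be used to search another protein, the following Python sketch slides the query along a target sequence and reports windows that exceed a chosen percent identity. It is a toy under stated assumptions: the function names, the 60 percent threshold, and both sequences are invented for illustration, and the comparison is ungapped. Real domain searches generally rely on alignment tools that permit gaps and score conservative substitutions, such as BLAST or profile-based methods.

```python
from typing import List, Tuple


def percent_identity(a: str, b: str) -> float:
    """Percentage of positions at which two equal-length sequences match."""
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    matches = sum(x == y for x, y in zip(a, b))
    return 100.0 * matches / len(a)


def find_similar_regions(
    query: str, target: str, min_identity: float = 60.0
) -> List[Tuple[int, int, float]]:
    """Return (start, end, identity) for each ungapped window of the target
    whose identity to the query meets the threshold. Positions are 1-based
    and inclusive."""
    width = len(query)
    hits = []
    for i in range(len(target) - width + 1):
        window = target[i:i + width]
        identity = percent_identity(query, window)
        if identity >= min_identity:
            hits.append((i + 1, i + width, identity))
    return hits


# Invented one-letter amino acid sequences, not taken from any real protein.
query = "CPLGYC"
target = "MKTACPLGFCAAGG"

hits = find_similar_regions(query, target)
for start, end, identity in hits:
    print(f"residues {start}-{end}: {identity:.1f}% identical to query")
```

A hit of this kind only suggests that a similar domain, and perhaps a similar function, may be present; the suggestion still has to be tested experimentally.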