9.1 Protein Structure

When a primary transcript has been fully processed into a mature mRNA molecule, translation into protein can take place. Before considering how proteins are made, we need to understand protein structure. Proteins are the main determinants of biological form and function. These molecules heavily influence the shape, color, size, behavior, and physiology of organisms. Because genes function by encoding proteins, understanding the nature of proteins is essential to understanding gene action.

A protein is a polymer composed of monomers called amino acids. In other words, a protein is a chain of amino acids. Because amino acids were once called peptides, the chain is sometimes referred to as a polypeptide. Amino acids all have the general formula

H2N–CH(R)–COOH

All amino acids have two functional groups (the carboxyl and amino, shown above) bonded to the same carbon atom (called the α carbon). Also attached to the α carbon are an H atom and a side chain, or R (reactive) group. There are 20 amino acids known to exist in proteins, each having a different R group that gives the amino acid its unique properties. The side chain can be anything from a hydrogen atom (as in the amino acid glycine) to a complex ring (as in the amino acid tryptophan). In proteins, the amino acids are linked together by covalent bonds called peptide bonds. A peptide bond is formed by the linkage of the amino end (NH2) of one amino acid with the carboxyl end (COOH) of another amino acid (Figure 9-2). One water molecule is removed during the reaction. Because of the way in which the peptide bond forms, a polypeptide chain always has an amino end (NH2) and a carboxyl end (COOH), as shown in Figure 9-2a.

Figure 9-2: The peptide bond
Figure 9-2: (a) A polypeptide is formed by the removal of water between amino acids to form peptide bonds. Each aa indicates an amino acid. R1, R2, and R3 represent R groups (side chains) that differentiate the amino acids. (b) The peptide bond is a rigid planar unit with the R groups projecting out from the C–N backbone. Standard bond distances (in angstroms) are shown.
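
The bookkeeping behind peptide-bond formation can be sketched in a few lines of Python. This is only an illustration, not part of the original text: the function names are hypothetical, and the molecular masses are approximate standard reference values that the chapter does not give. A chain of n amino acids has n - 1 peptide bonds, and each bond releases one water molecule, so the chain weighs less than the sum of its free amino acids.

```python
# Approximate average molecular masses (g/mol) of three free amino acids.
# Assumption: these reference values are not taken from the chapter.
FREE_AMINO_ACID_MASS = {"Gly": 75.07, "Ala": 89.09, "Ser": 105.09}
WATER_MASS = 18.015  # one water molecule is removed per peptide bond


def peptide_bond_count(residues):
    """A linear chain of n amino acids contains n - 1 peptide bonds."""
    return max(len(residues) - 1, 0)


def polypeptide_mass(residues):
    """Sum of the free amino acid masses minus one water per peptide bond."""
    total = sum(FREE_AMINO_ACID_MASS[name] for name in residues)
    return total - peptide_bond_count(residues) * WATER_MASS


def describe(residues):
    """Write the chain from its amino end to its carboxyl end."""
    return "H2N-" + "-".join(residues) + "-COOH"


tripeptide = ["Gly", "Ala", "Ser"]
print(describe(tripeptide))
print(peptide_bond_count(tripeptide), "peptide bonds")
print(round(polypeptide_mass(tripeptide), 2), "g/mol")
```

Whatever the length of the chain, the written form always begins with a free amino end and finishes with a free carboxyl end, which is the polarity shown in Figure 9-2a.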

Proteins have a complex structure that has four levels of organization, illustrated in Figure 9-3. The linear sequence of the amino acids in a polypeptide chain constitutes the primary structure of the protein. Local regions of the polypeptide chain fold into specific shapes, called the protein’s secondary structure. Each shape arises from the bonding forces between amino acids that are close together in the linear sequence. These forces include several types of weak bonds, notably hydrogen bonds, electrostatic forces, and van der Waals forces. The most common secondary structures are the α helix and the β-pleated sheet. Different proteins show either one or the other or sometimes both within their structures. Tertiary structure is produced by the folding of the secondary structure. Some proteins have quaternary structure: such a protein is composed of two or more separate folded polypeptides, also called subunits, joined by weak bonds. The quaternary association can be between different types of polypeptides (resulting in a heterodimer if there are two subunits) or between identical polypeptides (making a homodimer). Hemoglobin is an example of a heterotetramer, a four-subunit protein; it is composed of two copies each of two different polypeptides, shown in green and purple in Figure 9-3d.

Figure 9-3: Levels of protein structure
Figure 9-3: A protein can have four levels of structure. (a) Primary structure. The sequence of amino acids defined by their R groups. (b) Secondary structure. The polypeptide can form a helical structure (an α helix) or a zigzag structure (a β-pleated sheet). The β-pleated sheet has two polypeptide segments arranged in opposite polarity, as indicated by the arrows. (c) Tertiary structure. The heme group is a nonprotein ring structure with an iron atom at its center. (d) Quaternary structure illustrated by hemoglobin, which is composed of four polypeptide subunits: two α subunits and two β subunits.
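
The naming of quaternary structures follows a simple rule that can be written as a short Python sketch: the prefix says whether the subunits are identical (homo) or not (hetero), and the suffix gives the number of subunits. The function name and the subunit labels here are hypothetical, and the term "trimer" is a standard extension of the dimer and tetramer examples named in the text.

```python
# Suffixes for small assemblies; larger ones fall back to a generic "n-mer".
OLIGOMER_SUFFIX = {2: "dimer", 3: "trimer", 4: "tetramer"}


def quaternary_label(subunits):
    """Name an assembly from a list of subunit labels, one per polypeptide."""
    count = len(subunits)
    if count < 2:
        # A single polypeptide has no quaternary structure.
        return "monomer (no quaternary structure)"
    prefix = "homo" if len(set(subunits)) == 1 else "hetero"
    suffix = OLIGOMER_SUFFIX.get(count, f"{count}-mer")
    return prefix + suffix


# Hemoglobin: two alpha and two beta subunits.
print(quaternary_label(["alpha", "alpha", "beta", "beta"]))
# Two identical subunits.
print(quaternary_label(["x", "x"]))
```

The first call reproduces the hemoglobin example from the text, a heterotetramer built from two copies each of two different polypeptides.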

Many proteins are compact structures; they are called globular proteins. Enzymes and antibodies are among the best-known globular proteins. Proteins with linear shape, called fibrous proteins, are important components of such structures as skin, hair, and tendons.

Shape is all-important to a protein because a protein’s specific shape enables it to do its specific job in the cell. A protein’s shape is determined by its primary amino acid sequence and by conditions in the cell that promote the folding and bonding necessary to form higher-level structures. The folding of proteins into their correct conformation will be discussed at the end of this chapter. The amino acid sequence also determines which R groups are present at specific positions and thus available to bind with other cellular components. The active sites of enzymes are good illustrations of the precise interactions of R groups. Each enzyme has a pocket called the active site into which its substrate or substrates can fit. Within the active site, the R groups of certain amino acids are strategically positioned to interact with a substrate and catalyze a specific chemical reaction.

At present, the rules by which primary structure is converted into higher-level structure are imperfectly understood. However, from knowledge of the primary amino acid sequence of a protein, the functions of specific regions can be predicted. For example, some characteristic protein sequences are the contact points with membrane phospholipids that position a protein in a membrane. Other characteristic sequences act to bind the protein to DNA. Amino acid sequences or protein folds that are associated with particular functions are called domains. A protein may contain one or more separate domains.
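
As an illustration of predicting function from primary sequence, the Python sketch below scans a sequence for stretches hydrophobic enough to be candidate membrane-contacting regions. The text does not name a method; this sketch assumes one classic approach, a sliding-window average on the Kyte-Doolittle hydropathy scale. The window length of 19, the threshold of 1.6, the function names, and the test sequence are all assumptions chosen for illustration.

```python
# Kyte-Doolittle hydropathy values, keyed by one-letter amino acid code.
# Positive values are hydrophobic; negative values are hydrophilic.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def hydropathy_profile(sequence, window=19):
    """Average hydropathy of every window of the given length."""
    scores = [KYTE_DOOLITTLE[residue] for residue in sequence.upper()]
    last_start = len(scores) - window
    return [sum(scores[i:i + window]) / window for i in range(last_start + 1)]


def candidate_membrane_windows(sequence, window=19, threshold=1.6):
    """Start positions (0-based) of windows whose average exceeds the threshold."""
    profile = hydropathy_profile(sequence, window)
    return [i for i, average in enumerate(profile) if average > threshold]


# Made-up sequence: a run of 19 leucines flanked by charged residues.
toy_sequence = "DEKR" * 3 + "L" * 19 + "DEKR" * 3
print(candidate_membrane_windows(toy_sequence))
```

A flagged window is only a hint that a region may sit in a membrane. Predictions of this kind are suggestions to be tested, in keeping with the text's point that the rules linking sequence to higher-level structure are imperfectly understood.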