CHAPTER 3

Protein Structure and Function

Molecular ribbon model of a protein “needle” used by pathogenic bacteria to inject proteins into human cells to initiate infection. Many disease-causing bacteria, including Salmonella typhimurium (food poisoning) and Yersinia pestis (bubonic plague), use a syringe-like protein complex called a type III secretion system to inject proteins into their mammalian target cells. The structure of the needle portion of the syringe used by Salmonella typhimurium, determined using a combination of nuclear magnetic resonance (NMR), electron microscopy, and computational methods, is a long tube with many α helices (illustrated as coiled ribbons) forming the walls of the needle.
[Data from A. Loquet et al., 2012, Nature 486:276, PDB ID 2lpz.]

OUTLINE

3.1 Hierarchical Structure of Proteins

3.2 Protein Folding

3.3 Protein Binding and Enzyme Catalysis

3.4 Regulating Protein Function

3.5 Purifying, Detecting, and Characterizing Proteins

3.6 Proteomics

Proteins, which are polymers of amino acids, come in many sizes and shapes. Their three-dimensional diversity principally reflects variations in their lengths and amino acid sequences. In general, the linear, unbranched polymer of amino acids composing any protein will fold into only one or a few closely related three-dimensional shapes—called conformations. The conformation of a protein, together with the distinctive chemical properties of its amino acid side chains, determines its function. In some cases, the conformation, and thus the function, of a protein can change when that protein noncovalently or covalently associates with other molecules. Because of their many different shapes and chemical properties, proteins can perform a dazzling array of distinct functions inside and outside cells that either are essential for life or provide a selective evolutionary advantage to the cell or organism that contains them. It is, therefore, not surprising that characterizing the structures and activities of proteins is a fundamental prerequisite for understanding how cells work. Much of this textbook is devoted to examining how proteins act together to allow cells to live and function properly.

Although their structures are diverse, most proteins can be grouped into one of a few broad functional classes. Structural proteins, for example, determine the shapes of cells and their extracellular environments and serve as guide wires or rails to direct the intracellular movement of molecules and organelles. They are usually formed by the assembly of multiple protein subunits into very large, long structures. Scaffold proteins bring other proteins together into ordered arrays to perform specific functions more efficiently than those proteins would if they were not assembled together. Enzymes are proteins that catalyze chemical reactions. Membrane transport proteins permit the flow of ions and molecules across cellular membranes. Regulatory proteins act as signals, sensors, and switches to control the activities of cells by altering the functions of other proteins and genes. Regulatory proteins include signaling proteins, such as the hormones and cell-surface receptors that transmit extracellular signals to the cell interior. Motor proteins are responsible for moving other proteins, organelles, cells, and even whole organisms. Any one protein can be a member of more than one protein class, as is the case with some cell-surface signaling receptors that are both enzymes and regulatory proteins because they transmit signals from outside to inside cells by catalyzing chemical reactions. To accomplish their diverse missions efficiently, some proteins assemble into large complexes, often called molecular machines.

How do proteins perform so many diverse functions? They do so by exploiting a few simple activities. Most fundamentally, proteins bind—to one another, to other macromolecules such as DNA, and to small molecules and ions. In many cases, such binding induces a conformational change (a change in the three-dimensional structure) in the protein and thus influences its activity. Binding is based on molecular complementarity between a protein and its binding partner, as described in Chapter 2. A second key activity is enzymatic catalysis. Appropriate folding of a protein will place some amino acid side chains and some carboxyl and amino groups of its backbone into positions that permit the catalysis of covalent bond rearrangements. A third activity is folding into a channel or pore within a membrane through which molecules and ions can flow. Although these are especially crucial protein activities, they are not the only ones. For example, fish that live in frigid waters—the Antarctic borchs and Arctic cods—have antifreeze proteins in their circulatory systems to prevent water crystallization.

A complete understanding of how proteins permit cells to live and thrive requires the identification and characterization of all the proteins used by a cell. In a sense, molecular cell biologists want to compile a complete protein “parts list” and construct a “user’s manual” that describes how these proteins work. Compiling a comprehensive inventory of proteins has become feasible in recent years with the sequencing of the entire genomes—complete sets of genes—of more and more organisms. From a computer analysis of a genome’s sequence, researchers can deduce the amino acid sequences and approximate number of the proteins it encodes (see Chapter 6). The term proteome was coined to refer to the entire protein complement of an organism. The human genome contains some 20,000–23,000 genes that encode proteins. However, variations in mRNA production, such as alternative splicing (see Chapter 10), and more than a hundred types of protein modifications may generate hundreds of thousands of distinct human proteins. By comparing the sequences and structures of proteins of unknown function with those of proteins of known function, scientists can often deduce much about what the unknown proteins do. In the past, characterization of protein function by genetic, biochemical, or physiological methods often preceded the identification of particular proteins. In the modern genomic and proteomic era, a protein is usually identified before its function is determined.
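The idea of deducing an amino acid sequence from a genome's sequence can be illustrated with a minimal sketch. It assumes the standard genetic code, a single reading frame starting at the first ATG on the given strand, and no introns; real gene-prediction software must also handle both strands, splicing, and alternative start sites. The names `CODON_TABLE` and `translate` are hypothetical and chosen only for this illustration.

```python
from itertools import product

# Standard genetic code, listed in the conventional T, C, A, G order
# (first base varies slowest, third base fastest); '*' marks a stop codon.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon.

    Returns one-letter amino acid codes, or an empty string if the
    sequence contains no ATG. Codons with unrecognized bases become 'X'.
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(start, len(dna) - 2, 3):
        aa = CODON_TABLE.get(dna[i:i + 3], "X")
        if aa == "*":  # stop codon ends the polypeptide
            break
        protein.append(aa)
    return "".join(protein)


if __name__ == "__main__":
    print(translate("ATGGCCAAATAA"))
```

Here the codons ATG, GCC, and AAA are read as methionine, alanine, and lysine before the TAA stop codon ends translation. The predicted sequence is only a starting point: as noted above, alternative splicing and protein modifications mean that one gene can give rise to more than one distinct protein.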

In this chapter, we begin our study of how the structure of a protein gives rise to its function, a theme that recurs throughout this book (Figure 3-1). The first section examines how linear chains of amino acid building blocks are arranged in a three-dimensional structural hierarchy. The next section discusses how proteins fold into these structures. We then turn to protein function, focusing on enzymes, those proteins that catalyze chemical reactions. Various mechanisms that cells use to control the activities and life spans of proteins are covered next. The chapter concludes with a discussion of commonly used techniques for identifying, isolating, and characterizing proteins, and a discussion of the burgeoning field of proteomics.

FIGURE 3-1 Overview of protein structure and function. (a) Proteins have a hierarchical structure. A polypeptide’s linear sequence of amino acids linked by peptide bonds (primary structure) folds into local helices or sheets (secondary structure) that pack into a complex three-dimensional shape (tertiary structure). Some individual polypeptides associate into multichain complexes (quaternary structure), which in some cases can be very large, consisting of tens to hundreds of subunits (supramolecular complexes). (b) Proteins perform numerous functions, including organizing the genome, organelles, cytoplasm, protein complexes, and membranes in three-dimensional space (structure); controlling protein activity (regulation); monitoring the environment and transmitting information (signaling); moving small molecules and ions across membranes (transport); catalyzing chemical reactions (via enzymes); and generating force for movement (via motor proteins). These functions and others arise from specific binding interactions and conformational changes in the structure of a properly folded protein.