18.6 PROTEIN FOLDING, COVALENT MODIFICATION, AND TARGETING

To achieve their biologically active forms, new polypeptides must fold into the correct three-dimensional conformation. Before or after folding, the new polypeptide may undergo several kinds of processing: enzymatic removal of one or more amino acids (usually from the N-terminus); addition of acetyl, phosphoryl, methyl, carboxyl, or other groups to certain amino acid residues; proteolytic cleavage; and attachment of oligosaccharides or prosthetic groups. In this way, the linear, or one-dimensional, genetic message in the mRNA is converted into the three-dimensional structure of the protein.

Protein Folding Sometimes Requires the Assistance of Chaperones

Most proteins fold during translation (after emerging from the ribosome) or immediately after translation, typically beginning with the formation of local secondary structures, including α helices and β sheets. In a cooperative process, these secondary structural elements then interact, often through hydrophobic interactions, to produce the stable three-dimensional structure of the active protein. In many cases, proteins called chaperones catalyze local unfolding and refolding of polypeptide chains to enhance the rate and accuracy of overall folding. By mechanisms sometimes coupled to ATP hydrolysis, chaperones bind transiently to hydrophobic protein segments during folding to ensure that interactions occur in the proper order (see Figures 4-23 and 4-24).

After folding into their native conformations, some proteins form intrachain or interchain disulfide bonds, or bridges, between Cys residues. In eukaryotes, disulfide bonds are common in proteins to be exported from cells. The cross-links formed in this way help protect the molecule’s native conformation from denaturation in the extracellular environment, which can differ greatly from the intracellular environment and is generally oxidizing (see the How We Know section at the end of Chapter 4).

Covalent Modifications Are Common in Newly Synthesized Proteins

Some newly made proteins, both bacterial and eukaryotic, do not attain their final biologically active conformation until they have been altered by one or more processing reactions known as posttranslational modifications. As we have seen, the first residue inserted in all polypeptides is N-formylmethionine (in bacteria) or methionine (in eukaryotes). The formyl group, the N-terminal Met residue, and often additional N-terminal (and, in some cases, C-terminal) residues may be removed enzymatically in forming the final functional protein. In as many as 50% of eukaryotic proteins, the amino group of the N-terminal residue is N-acetylated after translation. Carboxyl-terminal residues are also sometimes modified. Scientists are still learning the rules that determine which proteins will have which amino acids removed or modified.

The 15 to 30 residues at the N-terminal end of some proteins play a role in directing the protein to its ultimate destination in the cell. After this protein trafficking, these residues are often removed by specific peptidases.

Individual amino acid residues can be modified, either permanently or transiently, with significant effects on functionality, increasing or decreasing the protein’s ability to bind other molecules (Figure 18-34). The hydroxyl groups of certain Ser, Thr, and Tyr residues of some proteins are enzymatically phosphorylated, with ATP as the phosphoryl donor. Extra carboxyl groups may be added to Glu residues of some proteins, and Lys, Arg, or Glu residues can be methylated.

Figure 18-34: Modified amino acid residues. Amino acid residues can be phosphorylated, carboxylated, or methylated to alter protein function.

The carbohydrate side chains of glycoproteins are attached covalently during or after synthesis of the polypeptide. In some glycoproteins, the carbohydrate side chain is attached enzymatically to Asn residues (N-linked oligosaccharides), in others to Ser or Thr residues (O-linked oligosaccharides). Many proteins that function extracellularly, as well as the proteoglycans that coat and lubricate mucous membranes, contain oligosaccharide side chains.
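
As a rough illustration of the residue-specific modifications described above, the following sketch lists, for each position in a sequence, the modifications that this section associates with that residue type. The function name, the lookup table, and the example peptide are hypothetical. Real modification sites depend on sequence context and on the enzymes present, so the output marks candidates only.

```python
# Residue types and the modifications this section associates with them.
# Illustrative lookup only: it ignores sequence context, which real enzymes require.
CANDIDATE_MODIFICATIONS = {
    "S": ("phosphorylation", "O-linked glycosylation"),
    "T": ("phosphorylation", "O-linked glycosylation"),
    "Y": ("phosphorylation",),
    "E": ("carboxylation", "methylation"),
    "K": ("methylation",),
    "R": ("methylation",),
    "N": ("N-linked glycosylation",),
}


def candidate_modifications(sequence):
    """Map 1-based residue positions to the modifications possible for that residue type."""
    return {
        position: CANDIDATE_MODIFICATIONS[residue]
        for position, residue in enumerate(sequence.upper(), start=1)
        if residue in CANDIDATE_MODIFICATIONS
    }


if __name__ == "__main__":
    # "ASNK" is a made-up peptide used only to exercise the lookup.
    for position, mods in candidate_modifications("ASNK").items():
        print(position, ", ".join(mods))
```

Positions are 1-based to match the usual numbering of residues from the N-terminus.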

Other covalent modifications to proteins include the addition of isoprenyl groups or prosthetic groups and cleavage by proteases. Many bacterial and eukaryotic proteins require covalently bound prosthetic groups for their activity. Finally, many proteins are initially synthesized as large, inactive precursor polypeptides that are proteolytically trimmed to their smaller, active forms.

Proteins Are Targeted to Correct Locations during or after Synthesis

Cells are made up of many structures and compartments, and, in the case of eukaryotic cells, they contain organelles, each with specific functions that require distinct sets of proteins and enzymes. These proteins (with the exception of those produced in mitochondria and chloroplasts) are synthesized on ribosomes in the cytosol or on the endoplasmic reticulum (ER). How are proteins directed to their final cellular destinations? Proteins destined for secretion, integration into the plasma membrane, or inclusion in lysosomes generally share the first few steps of a pathway that begins in the ER. Proteins destined for mitochondria, chloroplasts, or the nucleus use three separate mechanisms. And proteins destined for the cytosol simply remain where they are synthesized.

The most important element in many of these targeting pathways is a short sequence of amino acids called a signal sequence, the function of which was first postulated by Günter Blobel and his colleagues in 1970. The signal sequence directs a protein to its appropriate location in the cell and, for many proteins, is removed during transport or after arrival at its final destination. In proteins directed to mitochondria, chloroplasts, or the ER, the signal sequence is at the N-terminus of the newly synthesized polypeptide. Some signal sequences promote transport to the ER or mitochondria, others to the nucleus. The strength of a signal sequence, like that of a promoter sequence, depends on how similar it is to an unknown, hypothetical ideal sequence. In many cases, the targeting capacity of a particular signal sequence has been confirmed by fusing the sequence from one protein to a second protein and showing that the signal directs the second protein to the location where the first protein is normally found.

Some Chemical Modifications of Eukaryotic Proteins Take Place in the Endoplasmic Reticulum

The best-characterized targeting system begins in the ER. Most lysosomal, membrane, or secreted proteins are synthesized on ribosomes attached to the ER. These proteins have an N-terminal signal sequence that marks them for translocation into the ER or its lumen; hundreds of such signal sequences have been determined (Figure 18-35). Signal sequences vary in length from 13 to 36 amino acid residues, but all contain 10 to 15 hydrophobic residues, one or more positively charged residues near the N-terminus, and a short polar sequence near the cleavage site (for eventual removal of the signal) at the C-terminus.
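
The general features listed above can be turned into a crude screening rule. The sketch below is a minimal heuristic, not a validated predictor: the hydrophobic residue set, the width of the "near the N-terminus" window, and the test peptide are all assumptions made for illustration, and the polar region near the cleavage site is not checked.

```python
# Assumed residue classes; the text does not specify which residues count as hydrophobic.
HYDROPHOBIC = set("AVLIMFW")
POSITIVE = set("KR")


def looks_like_er_signal(sequence, n_terminal_window=5):
    """Apply three features from the text: overall length, hydrophobic content,
    and a positively charged residue near the N-terminus."""
    seq = sequence.upper()
    length_ok = 13 <= len(seq) <= 36
    hydrophobic_count = sum(residue in HYDROPHOBIC for residue in seq)
    hydrophobic_ok = 10 <= hydrophobic_count <= 15
    # "Near the N-terminus" is interpreted here as the first few residues (assumption).
    positive_ok = any(residue in POSITIVE for residue in seq[:n_terminal_window])
    return length_ok and hydrophobic_ok and positive_ok


if __name__ == "__main__":
    # Made-up peptide with a charged N-terminal region and a hydrophobic core.
    print(looks_like_er_signal("MKRSLLLALLVALLAFSTQA"))
```

A rule this simple will misclassify many real sequences; it only makes the stated ranges concrete.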

Figure 18-35: Signal sequences. Just a few of the hundreds of known signal sequences are shown here. Hemagglutinin is a transmembrane protein on the surface of the influenza virus. Lipoproteins are components of plasma membranes and organelle membranes; they also transport fats in the bloodstream. Proteins destined for secretion include growth hormone, the bee venom toxin melittin (as its precursor promelittin), a Drosophila glue protein used in forming the pupa, and insulin (as its precursor, preproinsulin). Hydrophobic residues are shown in yellow; charged residues are in blue.

The signal sequence itself helps direct the ribosome to the ER, as shown in Figure 18-36. The targeting pathway begins with initiation of protein synthesis on cytosolic ribosomes (step 1). The signal sequence appears early in synthesis because it is at the N-terminus, which is synthesized first. As the signal sequence emerges from the ribosome (step 2), the signal and the ribosome itself are bound by the large signal recognition particle (SRP) (step 3). The SRP, a multisubunit complex that includes the essential 7SL RNA component, binds GTP (step 4) and halts elongation of the polypeptide when it is about 70 amino acids long and the signal sequence has completely emerged from the ribosome. The GTP-bound SRP directs the ribosome (still bound to the mRNA) and the incomplete polypeptide to GTP-bound SRP receptors on the cytosolic face of the ER. GTP is hydrolyzed, enabling delivery of the nascent polypeptide to a peptide translocation complex in the ER, which may interact directly with the ribosome (step 5). SRP dissociates from the ribosome, accompanied by hydrolysis of the GTP bound to the SRP and to the SRP receptor. Elongation of the polypeptide resumes (step 6), with the ATP-driven translocation complex feeding the growing polypeptide into the ER lumen until the complete protein has been synthesized. The signal sequence is removed by a signal peptidase in the ER lumen (step 7). The ribosome dissociates (step 8) and is recycled (step 9).

Figure 18-36: Trafficking of proteins from the cytosol into the ER. The signal sequence of the nascent polypeptide is bound by SRP, which targets the elongating protein to the ER lumen. After the polypeptide has been synthesized, the ribosomal subunits dissociate and are recycled.

Glycosylation Plays a Key Role in Eukaryotic Protein Targeting

In the ER lumen, newly synthesized proteins are further modified in several ways. Following removal of signal sequences, polypeptides are folded, disulfide bonds are formed, and many proteins are glycosylated. In many glycoproteins, Asn residues are N-linked to a wide variety of oligosaccharides, but the pathways by which they form share common steps. Several antibiotics, including tunicamycin, act by interfering with this process and have aided in elucidating the steps of protein glycosylation. A few proteins are O-glycosylated in the ER, but most O-glycosylation occurs in the Golgi complex (Golgi apparatus) or in the cytosol (for proteins that do not enter the ER).

Once a protein is suitably modified, it can move to its final intracellular destination. Proteins travel from the ER to the Golgi complex in transport vesicles (Figure 18-37). In the Golgi complex, some proteins, as noted above, are O-glycosylated, and some N-linked oligosaccharides are further modified. By mechanisms not yet fully understood, the Golgi complex also sorts proteins and sends them to their final destinations. The processes that segregate proteins targeted for secretion from those targeted for the plasma membrane or lysosomes must distinguish among these proteins on the basis of structural features other than signal sequences, which were removed in the ER lumen.

Figure 18-37: Movement of proteins destined for membranes or secretion. After synthesis on ribosomes of the rough ER (see Figure 18-1) and targeting to the ER lumen, proteins travel in transport vesicles through the Golgi complex. Within the Golgi complex, the proteins may be further modified before they are sorted and shipped to their final destinations in secretory granules (for export from the cell) or transport vesicles (for further transport within the cell).

The pathways that target proteins to mitochondria and chloroplasts also rely on N-terminal signal sequences. Although mitochondria and chloroplasts contain DNA, most of their proteins are encoded by nuclear DNA and must be targeted to the appropriate organelle. Unlike other targeting pathways, however, the mitochondrial and chloroplast pathways begin only after protein synthesis is complete. Cytosolic chaperone proteins bind to precursor proteins and deliver them to receptors on the exterior surface of the target organelle. Specialized translocation mechanisms then transport each protein to its final destination in the organelle, after which the signal sequence is removed.

Signal Sequences for Nuclear Transport Are Not Removed

Many proteins and nucleic acids move into and out of the nucleus through nuclear pores. RNA molecules synthesized in the nucleus are exported to the cytosol for translation (see Chapter 16). Ribosomal proteins synthesized on cytosolic ribosomes are imported into the nucleus and assembled into 60S and 40S ribosomal subunits in the nucleolus, where the rRNAs are produced. Completed subunits are then exported back to the cytosol. A variety of nuclear proteins are synthesized in the cytosol and imported into the nucleus (e.g., RNA and DNA polymerases, histones, topoisomerases, and proteins that regulate gene expression). All of this traffic is modulated by a complex system of molecular signals and transport proteins.

In multicellular eukaryotes, cell division poses a problem for nuclear proteins. At each cell division, the nuclear envelope breaks down, and after division is completed and the nuclear envelope re-forms, the dispersed nuclear proteins must be reimported. To allow this repeated nuclear importation, the signal sequence that targets a protein to the nucleus—the nuclear localization sequence (NLS)—must remain on the protein after it arrives at its destination. An NLS, unlike other signal sequences, may be located almost anywhere along the primary sequence of the protein. The amino acid sequences of NLSs can vary considerably, but many consist of 4 to 8 residues and include several consecutive basic residues (Arg or Lys).
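
Because many NLSs contain a run of consecutive basic residues, a simple scan for such runs gives a first-pass list of candidate sites. The sketch below assumes a minimum run length of four, which is one reading of "several consecutive basic residues"; the function name and threshold are illustrative. The example string embeds PKKKRKV, the widely cited NLS of the SV40 large T antigen, between made-up flanking residues.

```python
import re


def find_basic_clusters(sequence, min_run=4):
    """Return (1-based start, run) for each stretch of at least min_run consecutive Lys/Arg."""
    pattern = re.compile(r"[KR]{%d,}" % min_run)
    return [(match.start() + 1, match.group()) for match in pattern.finditer(sequence.upper())]


if __name__ == "__main__":
    for start, run in find_basic_clusters("MAPKKKRKVED"):
        print(start, run)
```

A basic cluster is neither necessary nor sufficient for nuclear import, so hits from a scan like this are only candidates for experimental testing.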

Nuclear import is mediated by proteins that cycle between the cytosol and the nucleus, including importins α and β and the Ran GTPase (Figure 18-38), in a mechanism like that discussed for nuclear RNA export and import in Chapter 16. A heterodimer of importins α and β functions as a carrier for cargo proteins targeted to the nucleus, with the α subunit binding cargo proteins in the cytosol. The importin-cargo complex docks at a nuclear pore and is translocated through the pore by an energy-dependent mechanism that requires the Ran GTPase. Once inside the nucleus, interaction with Ran-GTP triggers a change in the conformation of importin that leads to release of the cargo protein. The importin-Ran-GTP complex then passes through the nuclear pore back into the cytosol. Here, the Ran-binding protein binds to Ran and releases importin, and a GTPase-activating protein stimulates conversion of Ran-GTP to Ran-GDP.

Figure 18-38: Targeting of nuclear proteins. Importins bind a nuclear protein through its nuclear localization sequence (NLS) and transport the protein through a nuclear pore, with the help of Ran-GDP. In the nucleus, Ran exchanges its GDP for GTP, facilitating release of the nuclear protein. The importins and Ran-GTP then shuttle back to the cytoplasm through a nuclear pore.

Some proteins contain a nuclear export signal (NES), often built around four hydrophobic residues, that targets the protein for export from the nucleus to the cytoplasm through the nuclear pore complex. An NES has the opposite effect of an NLS and is recognized by proteins called exportins. The most common spacing of the hydrophobic amino acids of an NES is L-X-X-X-L-X-X-L-X-L, where L is a hydrophobic amino acid (typically leucine, hence the L) and X is any other amino acid. The spacing probably reflects the structure of this signal that is required for binding to exportin complexes.
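
The L-X-X-X-L-X-X-L-X-L spacing maps directly onto a regular expression. In the sketch below, the set of residues accepted at the "L" positions is an assumption (the text says only that they are hydrophobic, typically leucine), X is allowed to be any residue, and the test sequence is made up. A lookahead is used so that overlapping candidates are all reported.

```python
import re

# "L" positions accept any residue from this assumed hydrophobic set; "X" is any residue.
NES_PATTERN = re.compile(r"(?=([LIVFM].{3}[LIVFM].{2}[LIVFM].[LIVFM]))")


def find_nes_candidates(sequence):
    """Return (1-based start, matched segment) for each L-X(3)-L-X(2)-L-X-L match."""
    return [
        (match.start() + 1, match.group(1))
        for match in NES_PATTERN.finditer(sequence.upper())
    ]


if __name__ == "__main__":
    # Made-up sequence containing one stretch with the required spacing.
    for start, segment in find_nes_candidates("MSLQKKLEELELDE"):
        print(start, segment)
```

Matching the spacing identifies candidates only; whether a segment functions as an export signal depends on its accessibility and on binding to exportins.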

Bacteria Also Use Signal Sequences for Protein Targeting

Bacteria can target proteins to their inner or outer membranes, to the periplasmic space, or to the extracellular medium. They use signal sequences at the N-terminus of the proteins, much like those on eukaryotic proteins targeted to the ER, mitochondria, and chloroplasts.

Most proteins exported from E. coli make use of the pathway shown in Figure 18-39. Following translation, the N-terminal signal sequence may impede folding of a protein to be exported. A soluble chaperone protein, SecB, binds to the signal sequence or to other features of the protein’s incompletely folded structure. The bound protein is then delivered to SecA, a protein associated with the inner surface of the plasma membrane. SecA acts as both a receptor and a translocating ATPase. Released from SecB and bound to SecA, the protein is delivered to a translocation complex in the membrane, made up of SecY, SecE, and SecG, and is translocated stepwise through the membrane at the SecYEG complex, about 20 amino acid residues at a time. Each step is facilitated by the hydrolysis of ATP, catalyzed by SecA. Although most exported bacterial proteins use this pathway, some follow an alternative route that uses signal recognition and receptor proteins homologous to components of the eukaryotic SRP and SRP receptor.
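
The stepwise model implies a simple estimate of how many SecA-driven steps a protein needs to cross the membrane. The sketch below assumes a fixed step of 20 residues, the approximate figure given above; the function name is hypothetical, and the calculation makes no claim about how many ATP molecules are consumed per step.

```python
import math

RESIDUES_PER_STEP = 20  # approximate step size quoted in the text


def translocation_steps(protein_length, residues_per_step=RESIDUES_PER_STEP):
    """Estimate the number of translocation steps for a protein of the given length."""
    if protein_length < 0:
        raise ValueError("protein_length must be non-negative")
    # A final partial segment still needs its own step, hence the ceiling.
    return math.ceil(protein_length / residues_per_step)


if __name__ == "__main__":
    for length in (100, 300, 301):
        print(length, translocation_steps(length))
```

For a 300-residue protein this gives 15 steps; one extra residue adds a sixteenth.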

Figure 18-39: A model for protein export in bacteria. A partially folded protein with a signal sequence is bound by SecB and then transferred to SecA and SecYEG, the latter a component of the bacterial plasma membrane. SecYEG pushes the protein through the membrane stepwise, and the protein folds on the other side of the membrane, in the periplasmic space.

SECTION 18.6 SUMMARY

  • Polypeptides fold into their active, three-dimensional forms during or immediately after synthesis, often with the help of ATP-dependent chaperone proteins. Many proteins are further processed by posttranslational modification reactions that add functional groups, such as phosphates or sugars.

  • During or immediately following synthesis, many proteins are directed to specific cellular locations. One targeting mechanism involves a peptide signal sequence, generally at the N-terminus of a newly synthesized protein.

  • In eukaryotic cells, one class of signal sequences is recognized by the signal recognition particle, which binds the signal sequence as soon as it appears on the ribosome and transfers the entire ribosome and incomplete polypeptide to the ER. The peptides are moved into the ER lumen, where they may be modified and moved to the Golgi complex, then sorted and sent to lysosomes, the plasma membrane, or transport vesicles.

  • Proteins targeted to mitochondria and chloroplasts, and those destined for export in bacterial cells, also make use of an N-terminal signal sequence. Specific enzymes remove the signal once the protein reaches its destination.

  • Nuclear localization signals are not removed, because nuclear proteins must be relocalized to the nucleus each time the cell divides. Protein import requires a nuclear localization signal (NLS), importins, the Ran GTPase, and GTP. Protein export from the nucleus likewise depends on a signal, the nuclear export signal (NES), which is recognized by exportins.

  • Bacterial proteins may be targeted to the plasma membrane by a signal sequence.

UNANSWERED QUESTIONS

Although bacterial protein synthesis is understood in exquisite detail, researchers have yet to determine how eukaryotic translation is initiated and regulated, how translation assists in protein folding, and how ribosomes work together within polysomes.

  1. How is the initiation of eukaryotic translation regulated? Translation initiation is much more complex in eukaryotic cells than in bacterial cells. It is critical to understand how the eukaryotic system works, because much protein synthesis regulation occurs at the level of initiation. Molecular structures of the eukaryotic ribosome, coupled with more detailed biochemical studies, will help reveal the details of this process.

  2. How is translation rate coupled to protein folding? The physics and kinetics of translation clearly affect how proteins fold. For example, many mRNAs include rare codons for which there are few available matching tRNAs, and these serve to stall ribosomes and hence provide time for proteins to fold. Understanding how this works is important for determining how proteins attain their correct structure in cells. Forces produced by ribosomes as they traverse an mRNA may also be important for melting RNA structures that could otherwise impede translation.

  3. How do the crowded conditions inside cells influence translation rates and accuracy? Most of the experiments that have probed translation mechanisms have been performed using purified ribosomes under relatively dilute conditions (∼1 μM). In rapidly growing cells, however, ribosomes may be present at up to 100 times this concentration. Computer simulations that model the process of translation in the presence of various cellular factors may help determine the impact of molecular crowding on the rate of protein synthesis.
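
To give a sense of scale for the crowding question, a molar concentration can be converted to a count of particles per cell. The sketch below assumes a cell volume of 1 femtoliter, a round figure often used for a bacterial cell and not a value taken from this chapter; the function name is hypothetical.

```python
AVOGADRO = 6.02214076e23  # particles per mole (exact SI value)


def molecules_per_cell(molar_concentration, cell_volume_liters):
    """Convert a molar concentration to a particle count for a given cell volume."""
    return molar_concentration * cell_volume_liters * AVOGADRO


if __name__ == "__main__":
    assumed_cell_volume = 1e-15  # liters; assumed round number for a bacterial cell
    dilute = molecules_per_cell(1e-6, assumed_cell_volume)     # typical in vitro condition
    crowded = molecules_per_cell(100e-6, assumed_cell_volume)  # 100 times more concentrated
    print(round(dilute), round(crowded))
```

Under this assumption, a 100-fold increase in concentration corresponds to tens of thousands of ribosomes in a volume that would hold only a few hundred at in vitro concentrations.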

HOW WE KNOW: The Ribosome Is a Ribozyme

Noller, H.F., and J.B. Chaires. 1972. Functional modification of 16S ribosomal RNA by kethoxal. Proc. Natl. Acad. Sci. USA 69:3115–3118.

Noller H.F., V. Hoffarth, and L. Zimniak. 1992. Unusual resistance of peptidyl transferase to protein extraction procedures. Science 256:1416–1419.

A series of experiments conducted in the early 1970s by Harry Noller provided the first evidence that ribosomal RNA, rather than ribosomal protein, is responsible for catalyzing peptide bond formation during protein synthesis. Using chemicals that react with side chains in proteins or nucleotides, Noller discovered that the 30S ribosomal subunit of bacteria could be inactivated by a reagent called kethoxal, which primarily attacks guanosine nucleotides in RNA.

Bacterial ribosomes were purified and separated into their two subunits. After addition of kethoxal to the 30S subunits, these were mixed with unmodified 50S subunits and the ribosomes tested for the ability to stimulate in vitro protein synthesis, using tRNA and a poly(U) mRNA template. In contrast to samples with untreated 30S subunits, the kethoxal-treated sample rapidly lost activity (Figure 1). Analysis of the inactivated ribosomes showed that modification of just six G nucleotides in the rRNA was sufficient to inhibit the peptidyl transferase reaction. Inactivation was slower in the presence of bound tRNA, however, leading to the conclusion that kethoxal interferes with protein synthesis by modifying the binding site for tRNA on the 30S subunit.

FIGURE 1 Chemical modification of rRNA inactivates the 30S ribosomal subunit. The graph shows the percentage of normal activity (synthesis of polypeptide) over time. In the absence of kethoxal (squares), the 30S subunit retained activity; in the presence of kethoxal (circles), activity was impaired over time. A control reaction lacking tRNA and mRNA had virtually no activity (triangles). The chemical structure of a kethoxal-modified G nucleoside is shown on the right.

In 1992, Noller and his colleagues published a study showing that virtually all the r-proteins could be removed from the ribosome with only small effects on protein-synthesizing activity, whereas damage to the rRNA destroyed the ribosome’s catalytic properties. In 2000, the high-resolution crystal structure of the 50S ribosomal subunit revealed that the active site responsible for peptide bond formation is composed entirely of rRNA. Thus, the structural data confirmed what had long been suspected based on biochemical evidence: the ribosome is a ribozyme.

HOW WE KNOW: Ribosomes Check the Accuracy of Codon-Anticodon Pairing, but Not the Identity of the Amino Acid

Chapeville, F., F. Lipmann, G. Von Ehrenstein, B. Weisblum, W.J. Ray Jr., and S. Benzer. 1962. On the role of soluble ribonucleic acid in coding for amino acids. Proc. Natl. Acad. Sci. USA 48:1086–1092.

Zaher, H.S., and R. Green. 2009. Quality control by the ribosome following peptide bond formation. Nature 457:161–166.

Translation relies on aminoacyl-tRNA synthetases to ensure the correct charging of tRNAs, because the ribosome does not discriminate between correctly and incorrectly charged tRNAs during protein synthesis. A classic experiment by Seymour Benzer and his colleagues demonstrated that if a tRNA is aminoacylated with the wrong amino acid, this incorrect amino acid is efficiently incorporated into a protein in response to the codon normally recognized by that tRNA.

This experiment, performed in 1962, used 14C labeling to track how amino acids attached to particular tRNAs were incorporated into polypeptides. For example, correctly charged [14C]Cys-tRNACys was chemically treated with Raney nickel (a nickel-aluminum alloy catalyst) to form a mischarged [14C]Ala-tRNACys (Figure 2a). Poly(UG) mRNA has UGU codons that, with wobble base pairing in the third position, match the ACG anticodon for tRNACys, but has no codons for alanine. Using this mRNA, the researchers found that the ribosome efficiently incorporated [14C]alanine into acid-insoluble polypeptide (Figure 2b). This result was an elegant and straightforward demonstration that the ribosome does not proofread tRNAs for correct aminoacylation.
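
The logic of the template choice can be checked by enumeration. The sketch below lists every codon that can be built from U and G alone, confirms that none encodes alanine, and tests which of them pair with the tRNACys anticodon. Two assumptions are made: the anticodon ACG is read 3'→5' so that its positions align with the codon written 5'→3', and wobble is reduced to tolerating G-U pairs at the third codon position. The function names are hypothetical.

```python
from itertools import product

# Standard genetic code entries for every codon that can be built from U and G alone.
UG_CODON_TABLE = {
    "UUU": "Phe", "UUG": "Leu", "UGU": "Cys", "UGG": "Trp",
    "GUU": "Val", "GUG": "Val", "GGU": "Gly", "GGG": "Gly",
}
ALANINE_CODONS = {"GCU", "GCC", "GCA", "GCG"}

WATSON_CRICK = {("A", "U"), ("U", "A"), ("G", "C"), ("C", "G")}
# Simplified wobble rule (assumption): G-U pairs are tolerated at the third codon position.
WOBBLE = WATSON_CRICK | {("U", "G"), ("G", "U")}


def poly_ug_codons():
    """Return all triplets that can occur in an mRNA made only of U and G."""
    return {"".join(bases) for bases in product("UG", repeat=3)}


def pairs_with(codon, anticodon_3to5):
    """Codon is 5'->3'; anticodon is written 3'->5' so that positions align."""
    first_two = all(
        (c, a) in WATSON_CRICK for c, a in zip(codon[:2], anticodon_3to5[:2])
    )
    third = (codon[2], anticodon_3to5[2]) in WOBBLE
    return first_two and third


if __name__ == "__main__":
    codons = poly_ug_codons()
    print(sorted(codons & ALANINE_CODONS))                      # alanine codons present
    print([c for c in sorted(codons) if pairs_with(c, "ACG")])  # codons read by tRNACys
```

Because no U/G-only codon specifies alanine, any labeled alanine found in the polypeptide must have been delivered by the mischarged tRNACys.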

FIGURE 2 The ribosome does not proofread aminoacylated tRNAs. (a) The 14C-labeled Cys residue of Cys-tRNACys is converted to an Ala residue by Raney nickel, which removes the sulfur from cysteine. (b) Polypeptide synthesis with and without poly(UG) mRNA, using correctly charged [14C]Cys-tRNACys, or correctly charged [14C]Ala-tRNAAla (as control), or mischarged [14C]Ala-tRNACys. The results in the presence of UGU codons (dark purple columns on the right) show incorporation of labeled Cys with Cys-tRNACys (left); negligible incorporation of labeled Ala with Ala-tRNAAla (middle); and incorporation of labeled Ala with Ala-tRNACys (right).

For many years, protein synthesis was thought to rely on the combined accuracy of tRNA aminoacylation and aminoacyl-tRNA selection by the ribosome in cooperation with the GTPase elongation factor EF-Tu (see Section 18.4). These two processes operate before peptide bond formation to ensure that only correctly charged and correctly matched tRNAs enter the ribosomal A site.

More recently, an additional mechanism occurring after peptidyl transfer was found to contribute to the accuracy of protein synthesis. Using a well-defined in vitro bacterial translation system, Rachel Green and Hani Zaher showed that incorporation of an amino acid from a mismatched aminoacyl-tRNA into the elongating polypeptide leads to a general loss of specificity in the ribosomal A site. The resulting propagation of errors leads to early termination of protein synthesis, avoiding the production of a complete protein containing incorrect amino acids.
