Chapter Introduction

CHAPTER 5

Fundamental Molecular Genetic Mechanisms

Colored transmission electron micrograph of one ribosomal RNA transcription unit from a Xenopus oocyte. Transcription proceeds from left to right, with nascent ribosomal ribonucleoprotein complexes (rRNPs) growing in length as each successive RNA polymerase I molecule moves along the DNA template at the center. In this preparation, each rRNP is oriented either above or below the central strand of DNA being transcribed, so that the overall shape is similar to a feather. In the nucleolus of a living cell, the nascent rRNPs extend in all directions, like a bottlebrush.
[Professor Oscar L. Miller/Science Photo Library.]

OUTLINE

5.1 Structure of Nucleic Acids

5.2 Transcription of Protein-Coding Genes and Formation of Functional mRNA

5.3 The Decoding of mRNA by tRNAs

5.4 Stepwise Synthesis of Proteins on Ribosomes

5.5 DNA Replication

5.6 DNA Repair and Recombination

5.7 Viruses: Parasites of the Cellular Genetic System

The extraordinary versatility of proteins as the components of cellular structures, cellular catalysts, and molecular switches and machines was described in Chapter 3. In this chapter, we consider the process by which proteins are made as well as other cellular processes that are critical for the survival of an organism and its descendants. Our focus will be on the vital molecules known as nucleic acids, and how they ultimately are responsible for governing all cellular function. As we saw in Chapter 2, nucleic acids are linear polymers of four types of nucleotides (see Figures 2-13, 2-16, and 2-17). These macromolecules (1) contain in the precise sequence of their nucleotides the information for determining the amino acid sequence, and hence the structure and function, of all the proteins of a cell; (2) are critical functional components of the cellular macromolecular factories that select amino acids and align them in the correct order as a polypeptide chain is being synthesized; (3) catalyze a number of fundamental chemical reactions in cells, including formation of peptide bonds between amino acids during protein synthesis; and (4) regulate the expression of genes.

Deoxyribonucleic acid (DNA) is an informational molecule that contains in the sequence of its nucleotides the information required to build all the proteins and RNAs of an organism, and hence the cells and tissues of that organism. Chemically, it is ideally suited to perform this function. It is extraordinarily stable under most terrestrial conditions, as exemplified by our ability to recover DNA sequence from bones and tissues that are tens of thousands of years old. Because of this, and because of the repair mechanisms that operate in living cells, the long polymers that make up a DNA molecule can be up to 10⁹ nucleotides long. Virtually all the information required for the development of a fertilized human egg into an adult made of trillions of cells with specialized functions can be stored in the sequence of the four types of nucleotides that make up the roughly 3 × 10⁹ base pairs in the human genome. Because of the principles of base pairing discussed in the following sections, this information is readily copied with an error rate of only about 1 in 2.5 × 10⁸ nucleotides per generation. The exact replication of this information in any species ensures its genetic continuity from generation to generation and is critical to the normal development of individuals. DNA fulfills these functions so well that it is the vessel for genetic information in all known forms of life (excluding RNA viruses, which are limited to extremely short genomes because of the relative instability of RNA compared with DNA, as we will see). The discovery that virtually all forms of life use DNA to encode their genetic information and use a nearly identical genetic code implies that all forms of life descended from a common ancestor whose genetic information was stored in nucleic acid sequence. This information is accessed and replicated by specific base pairing between nucleotides.
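
The copying fidelity quoted above can be turned into a rough expected number of sequence changes per genome copy. The short Python sketch below takes the two figures from the text at face value (about 3 × 10⁹ base pairs, and about one error per 2.5 × 10⁸ nucleotides per generation). The variable names are ours, and the assumption that errors arise independently at each position (a Poisson model) is an illustrative simplification, not a claim made in the text.

```python
import math

# Figures quoted in the text (approximate).
GENOME_BASE_PAIRS = 3_000_000_000      # roughly 3 x 10^9 base pairs in the human genome
NUCLEOTIDES_PER_ERROR = 250_000_000    # about 1 error in 2.5 x 10^8 nucleotides per generation

# Expected number of copying errors in one genome copy per generation.
expected_errors = GENOME_BASE_PAIRS / NUCLEOTIDES_PER_ERROR

# ASSUMPTION (ours): errors are independent and rare, so their count is
# approximately Poisson distributed with mean `expected_errors`.
prob_error_free_copy = math.exp(-expected_errors)

print(f"Expected errors per genome copy: {expected_errors:.0f}")
print(f"Chance of a completely error-free copy: {prob_error_free_copy:.1e}")
```

Under these assumptions the expected count is about a dozen changes per genome copy per generation, so a completely error-free copy is very unlikely even though the per-nucleotide error rate is extremely low.
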
The information stored in DNA is arranged in hereditary units, known as genes, that control identifiable traits of an organism. In the process of transcription, the information stored in DNA is copied into ribonucleic acid (RNA), which has three distinct roles in protein synthesis, in addition to its more recently discovered functions in the regulation of chromatin structure, transcription, and protein synthesis, which we will discuss in Chapters 8, 9, and 10.

Portions of the DNA nucleotide sequence are copied into messenger RNA (mRNA) molecules that direct the synthesis of a specific protein. The nucleotide sequence of an mRNA molecule contains information that specifies the correct order of amino acids during the synthesis of a protein. The remarkably accurate, stepwise assembly of amino acids into proteins occurs by translation of mRNA. In this process, the nucleotide sequence of an mRNA molecule is “read” by a second type of RNA called transfer RNA (tRNA) with the aid of a third type of RNA, ribosomal RNA (rRNA), and associated proteins. As the correct amino acids are brought into sequence by tRNAs, they are linked by peptide bonds to make proteins. RNA synthesis is called transcription because the four-base sequence “language” of DNA is precisely copied, or transcribed, into the nucleotide sequence of an RNA molecule. Protein synthesis is referred to as translation because the four-base sequence “language” of DNA and RNA is translated into the twenty–amino acid sequence “language” of proteins.
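
The two "language" conversions described above can be sketched in a few lines of Python. This is a minimal illustration, not a model of the cellular machinery: the function names, the example template strand, and the simplification that translation starts at the first AUG and stops at the first in-frame stop codon are our assumptions. The codon table is the standard genetic code, written with one-letter amino acid abbreviations and `*` for stop codons.

```python
from itertools import product

# Standard genetic code, codons enumerated in U, C, A, G order
# (first base varies slowest). '*' marks a stop codon.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Base pairing between a DNA template base and the RNA base added opposite it.
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Return the mRNA (5'->3') complementary to a DNA template strand
    that is written 3'->5' (hypothetical helper for illustration)."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5)


def translate(mrna: str) -> str:
    """Read codons from the first AUG until a stop codon or the end of
    the mRNA, returning one-letter amino acid codes (simplified)."""
    start = mrna.find("AUG")
    if start == -1:
        return ""
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


template = "TACAAACCGATT"          # made-up DNA template strand, 3'->5'
mrna = transcribe(template)        # "AUGUUUGGCUAA"
print(mrna, "->", translate(mrna)) # AUGUUUGGCUAA -> MFG
```

Transcription here is a one-to-one copy within the same four-letter alphabet (with U in place of T), whereas translation maps three-letter codons onto a different, twenty-letter alphabet, which is why the two steps carry different names.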

Discovery of the structure of DNA in 1953 and the subsequent elucidation of how DNA directs synthesis of RNA, which then directs assembly of proteins (the so-called central dogma), were monumental achievements marking the early days of molecular biology. However, the simplified representation of the central dogma as DNA → RNA → protein does not reflect the role of proteins in the synthesis of nucleic acids. Moreover, as discussed here for bacteria and in later chapters for eukaryotes, proteins are largely responsible for regulating gene expression, the entire process whereby the information encoded in DNA is decoded into proteins in the correct cells at the correct times in development. As a consequence, hemoglobin is expressed only in cells in the bone marrow (erythroid progenitors) destined to develop into circulating red blood cells (erythrocytes), and developing neurons make the proper synapses (connections) with 10¹¹ other developing neurons in the human brain. The fundamental molecular genetic processes of DNA replication, transcription, and translation must be carried out with extraordinary fidelity, speed, and accurate regulation for the normal development of organisms as complex as bacteria, archaea, and eukaryotes (see Figure 1-1). This is achieved by chemical processes that operate with extraordinary accuracy coupled with multiple layers of checkpoint or surveillance mechanisms that test whether critical steps in these processes have occurred correctly before the next step is initiated. The highly regulated expression of genes necessary for the development of a multicellular organism requires integration of information from signals sent by distant cells in the developing organism, as well as from neighboring cells, and an intrinsic developmental program determined by earlier steps in embryogenesis taken by each cell’s progenitors.
All of this regulation is dependent on control sequences in the DNA that function with proteins called transcription factors to coordinate the expression of every gene. The RNA sequences we discuss in Chapters 8, 9, and 10 also serve to regulate chromatin structure, transcription, RNA processing, and translation. Nucleic acids function as the “brains and central nervous system” of the cell, while proteins carry out most of the functions they specify.

In this chapter, we first review the structures and properties of DNA and RNA and explore how the different characteristics of these two types of nucleic acids make them suited for their respective functions in the cell. In the next several sections, we discuss the basic processes summarized in Figure 5-1: transcription of DNA into RNA precursors, processing of these precursors to make functional RNA molecules, translation of mRNAs into proteins, and the replication of DNA. Proteins regulate cell structure and most of the biochemical reactions in cells, so we first consider how the amino acid sequences of proteins, which determine their three-dimensional structures and hence their functions, are encoded in DNA and translated. After outlining the functions of mRNA, tRNA, and rRNA in protein synthesis, we present a detailed description of the components and biochemical steps in translation. Understanding these processes gives us a deep appreciation of the need to copy the nucleotide sequence of DNA precisely. Consequently, we next consider the molecular problems involved in DNA replication and the complex cellular machinery that ensures accurate copying of the genetic material. Along the way, we compare these processes in prokaryotes and eukaryotes. The next section describes how damage to DNA is repaired and how regions of different DNA molecules are exchanged in the process of recombination to generate new combinations of traits in the individual organisms of a species. The final section of the chapter presents basic information about viruses, parasites that exploit the cellular machinery for DNA replication, transcription, and protein synthesis. In addition to being significant pathogens, viruses are important model organisms for studying these cellular mechanisms of macromolecular synthesis and other cellular processes. 
Viruses have relatively simple structures compared with cells, and their small genomes made them tractable for historic early studies of these fundamental cellular processes. Viruses continue to teach important lessons in molecular cell biology today and have been adapted as experimental tools for introducing genes into cells, tools that are currently being tested for their effectiveness in human gene therapy.

FIGURE 5-1 Overview of four basic molecular genetic processes. In this chapter, we cover the three processes that lead to production of proteins (steps 1–3) and the process for replicating DNA (step 4). Because viruses utilize host-cell machinery, they have been important models for studying these processes. During transcription of a protein-coding gene by RNA polymerase (step 1), the four-base DNA code specifying the amino acid sequence of a protein is copied, or transcribed, into a precursor messenger RNA (pre-mRNA) by the polymerization of ribonucleoside triphosphate monomers (rNTPs). Removal of noncoding sequences and other modifications to the pre-mRNA (step 2), collectively known as RNA processing, produce a functional mRNA, which is transported to the cytoplasm. During translation (step 3), the four-base code of the mRNA is decoded into the 20–amino acid language of proteins. Ribosomes, the macromolecular machines that translate the mRNA code, are composed of two subunits assembled in the nucleolus from ribosomal RNAs (rRNAs) and multiple proteins (left). After transport to the cytoplasm, ribosomal subunits associate with an mRNA and carry out protein synthesis with the help of transfer RNAs (tRNAs) and translation factor proteins. During DNA replication (step 4), which occurs only in cells preparing to divide, deoxyribonucleoside triphosphate monomers (dNTPs) are polymerized to yield two identical copies of each chromosomal DNA molecule. Each daughter cell receives one of the identical copies.
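
The last process in the figure, DNA replication, relies on the same base-pairing principle: each parental strand serves as the template for a new partner strand, so one duplex yields two identical duplexes. The Python sketch below illustrates only that logic. The function names and the example sequence are hypothetical, and the enzymatic steps, strand directionality during synthesis, and copying errors are deliberately ignored.

```python
# Watson-Crick pairing between DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand: str) -> str:
    """Return the strand that base-pairs with `strand`. Both the input
    and the result are written 5'->3', so the result is read in reverse."""
    return "".join(COMPLEMENT[base] for base in reversed(strand))


def replicate(duplex: tuple[str, str]) -> tuple[tuple[str, str], tuple[str, str]]:
    """Copy a duplex (top, bottom) by using each parental strand as the
    template for a newly made partner (idealized, error-free)."""
    top, bottom = duplex
    daughter_1 = (top, complementary_strand(top))        # old top + new bottom
    daughter_2 = (complementary_strand(bottom), bottom)  # new top + old bottom
    return daughter_1, daughter_2


top = "ATGGCATTC"                          # made-up sequence, 5'->3'
parent = (top, complementary_strand(top))  # ("ATGGCATTC", "GAATGCCAT")
daughter_1, daughter_2 = replicate(parent)
print(daughter_1 == daughter_2 == parent)  # True
```

Each daughter duplex contains one parental strand and one new strand, yet both carry exactly the sequence of the parent, which is the sense in which each daughter cell receives an identical copy.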