Chapter Introduction

Genomes and Genomics

CHAPTER 14

LEARNING OUTCOMES

After completing this chapter, you will be able to

  • Describe the combinations of strategies typically necessary for obtaining and assembling the complete DNA sequences of organisms.

  • List the functional elements within genomes, and explain how they are identified computationally and experimentally.

  • Compare whole-genome and subgenomic approaches to personalized medicine.

  • Describe how comparative genomics is employed to reveal genetic differences between species.

  • Explain how the availability of genomic sequence enables reverse genetic analysis of gene function.

The human nuclear genome viewed as a set of labeled DNA. The DNA of each chromosome has been labeled with a dye that emits fluorescence at one specific wavelength (producing a specific color).
[Nallasivam Palanisamy, MSc., MPhil., PhD., Associate Professor of Pathology, Michigan Center for Translational Pathology, University of Michigan.]

OUTLINE

14.1 The genomics revolution

14.2 Obtaining the sequence of a genome

14.3 Bioinformatics: meaning from genomic sequence

14.4 The structure of the human genome

14.5 The comparative genomics of humans with other species

14.6 Comparative genomics and human medicine

14.7 Functional genomics and reverse genetics

In the summer of 2009, Dr. Alan Mayer, a pediatrician at Children’s Hospital of Wisconsin in Milwaukee, wrote to a colleague about the heartbreaking and baffling case of a four-year-old patient of his (Figure 14-1). For two years, little Nicholas Volker had endured over 100 trips to the operating room as doctors tried to manage a mysterious disease that was destroying his intestines, leaving him vulnerable to dangerous infections, severely underweight, and often unable to eat.

Neither Mayer nor any other doctors had ever seen a disease like Nicholas’s; they were unable to diagnose it, or to stem its ravages by any medical, surgical, or nutritional treatment. It was difficult to treat a disease that no one could identify. So, Dr. Mayer asked his colleague, Dr. Howard Jacob at the Medical College of Wisconsin, “if there is some way we can get his genome sequenced. There is a good chance Nicholas has a genetic defect, and it is likely to be a new disease. Furthermore, a diagnosis soon could save his life and truly showcase personalized genomic medicine.”1

Dr. Jacob knew that it would be a long shot. Finding a single mutation responsible for a disease would require sifting through thousands of variations in Nicholas’s DNA. One key decision was to narrow the search to just the exon sequences in Nicholas’s DNA. The rationale was that if the causal mutation was a protein-coding change, then it could be identified by sequencing all of the exons, collectively called the exome, which comprises a little over 1 percent of the entire human genome. Still, it would be an expensive search: the sequencing would cost about $75,000 with the technology available at the time. Nevertheless, the money was raised from donors, and Jacob and a team of collaborators undertook the task.

As Jacob expected, they found more than 16,000 candidate variations in Nicholas’s DNA. They narrowed this long list by focusing on those mutations that had not been previously identified in humans and that caused amino acid replacements not found in other species. Eventually, they identified a single base substitution in a gene called the X-linked inhibitor of apoptosis (XIAP) that changed one amino acid at position 203 of the protein. The amino acid normally found at that position is invariant among the mammalian, fish, and even fruit-fly counterparts of XIAP.
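The narrowing strategy just described can be sketched as a simple two-step filter. The following Python fragment is only an illustration under stated assumptions: the `Variant` record, its field names, and the toy entries (including the placeholder gene names `GENE_A` to `GENE_C`) are hypothetical and are not taken from the actual analysis, which worked from more than 16,000 candidates and real sequence databases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variant:
    """One candidate amino acid replacement (hypothetical record layout)."""
    gene: str
    protein_position: int
    previously_reported_in_humans: bool        # already seen in other people?
    replacement_found_in_other_species: bool   # is the new amino acid seen elsewhere?


def prioritize(variants):
    """Keep variants that are novel in humans and whose replacement
    amino acid is not found at that position in other species."""
    return [
        v for v in variants
        if not v.previously_reported_in_humans
        and not v.replacement_found_in_other_species
    ]


# Toy data invented for illustration; only the XIAP entry mirrors the text.
candidates = [
    Variant("GENE_A", 41, True, False),    # known human variant: dropped
    Variant("GENE_B", 310, False, True),   # replacement tolerated elsewhere: dropped
    Variant("GENE_C", 12, True, True),     # fails both tests: dropped
    Variant("XIAP", 203, False, False),    # novel and at an invariant position: kept
]

shortlist = prioritize(candidates)
for v in shortlist:
    print(v.gene, v.protein_position)
```

Each test on its own removes only part of the list; it is the combination of novelty in humans and conservation across species that reduces thousands of candidates to a handful worth examining by hand.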

Figure 14-1: Nicholas Volker. DNA sequencing of all the exons of Nicholas Volker’s genome revealed a single mutation responsible for his debilitating, but previously unidentified, disease.
[Gary Porter/MCT/Newscom.]

Fortunately, the identification of Nicholas’s XIAP mutation suggested a therapeutic approach. The XIAP gene was previously known to have a role in the inflammatory response, and mutations in the gene were associated with a very rare but potentially fatal immune disorder (although not Nicholas’s intestinal symptoms). Based on that knowledge, Nicholas’s doctors boosted his immune system with an infusion of umbilical-cord blood from a well-matched donor. Over the next several months, Nicholas’s health improved to the point where he was able to eat steak and other foods. And over the next two years, Nicholas did not require any further intestinal surgeries.

The diagnosis and treatment of Nicholas Volker illustrate the dramatic advances in the technology and impact of genomics, the study of genomes in their entirety. The long-awaited promise that genomics would shape clinical medicine is now very much a reality. The technological and biological progress from what started as a trickle of data in the 1990s has been astounding. In 1995, the 1.8-Mb (1.8-megabase) genome of the bacterium Haemophilus influenzae was the first genome of a free-living organism to be sequenced. In 1996 came the 12-Mb genome of the yeast Saccharomyces cerevisiae; in 1998, the 100-Mb genome of the nematode Caenorhabditis elegans; in 2000, the 180-Mb genome of the fruit fly Drosophila melanogaster; in 2001, the first draft of the 3000-Mb human genome; and, in 2005, the first draft of the genome of our closest living relative, the chimpanzee. These species are just a small sample. By the end of 2013, almost 27,000 bacterial genomes and the genomes of more than 6600 eukaryotic species (including fungi, plants, and animals) had been sequenced.

It is no hyperbole to say that genomics has revolutionized how genetic analysis is performed and has opened avenues of inquiry that were not conceivable just a few years ago. Most of the genetic analyses that we have so far considered employ a forward approach to analyzing genetic and biological processes. That is, the analysis begins by screening for mutants that affect some observable phenotype, and the characterization of these mutants eventually leads to the identification of the gene and the function of DNA, RNA, and protein sequences. In contrast, having the entire DNA sequence of an organism’s genome allows geneticists to work in both directions: forward from phenotype to gene, and in reverse from gene to phenotype. Without exception, genome sequences reveal many genes that were not detected by classical mutational analysis. Using so-called reverse genetics, geneticists can now systematically study the roles of such formerly unidentified genes. Moreover, a lack of prior classical genetic study is no longer an impediment to the genetic investigation of organisms. The frontiers of experimental analysis are growing far beyond the bounds of the very modest number of long-explored model organisms.

Analyses of whole genomes now contribute to every corner of biological research. In human genetics, genomics is providing new ways to locate genes that contribute to many genetic diseases, like Nicholas’s, that had previously eluded investigators. The day is fast approaching when a person’s genome sequence will be a standard part of his or her medical record. The availability of genome sequences for long-studied model organisms and their relatives has dramatically accelerated gene identification, the analysis of gene function, and the characterization of non-coding elements of the genome. New technologies for the global, genome-wide analysis of the physiological role of all gene products are driving the development of the new field called systems biology. From an evolutionary perspective, genomics provides a detailed view of how genomes and organisms have diverged and adapted over geological time.

The DNA sequence of the genome is the starting point for a whole new set of analyses aimed at understanding the structure, function, and evolution of the genome and its components. In this chapter, we will focus on three major aspects of genomic analysis:
