SUMMARY

Genomic analysis takes the approaches of genetic analysis and applies them to the collection of global data sets to fulfill goals such as the mapping and sequencing of whole genomes and the characterization of all transcripts and proteins. Genomic techniques require the rapid processing of large sets of experimental material, which in turn depends on extensive automation.

The key problem in compiling an accurate genome sequence is to take short sequence reads and relate them to one another by sequence identity, building up a consensus sequence of the entire genome. For bacterial and archaeal genomes this is straightforward: overlapping sequences from different reads can be aligned to compile the whole genome, because few or no DNA segments are present in more than one copy in such organisms. The complex genomes of plants and animals, in contrast, are replete with repetitive sequences, which interfere with accurate sequence-contig production. Whole-genome shotgun (WGS) sequencing resolves this difficulty with paired-end reads, which come from the two ends of the same genomic insert and so can link the unique sequences on either side of a repeat.
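The idea of building a contig from overlapping reads can be illustrated with a toy example. The following is a minimal sketch, not a real assembler: it assumes error-free reads from a single strand, no repeats, and no read contained entirely within another. The function names `overlap` and `greedy_assemble` and the `min_len` parameter are hypothetical, chosen only for illustration.

```python
def overlap(a: str, b: str, min_len: int = 3) -> int:
    """Length of the longest suffix of `a` that equals a prefix of `b`.

    Returns 0 if no overlap of at least `min_len` bases exists.
    """
    max_len = min(len(a), len(b))
    for k in range(max_len, min_len - 1, -1):
        if a.endswith(b[:k]):
            return k
    return 0


def greedy_assemble(reads, min_len: int = 3):
    """Repeatedly merge the pair of sequences with the longest overlap.

    Toy assumption: reads are error-free, same strand, and repeat-free.
    Returns the list of contigs left when no overlaps remain.
    """
    contigs = list(dict.fromkeys(reads))  # drop duplicate reads, keep order
    while True:
        best_len, best_i, best_j = 0, None, None
        for i, a in enumerate(contigs):
            for j, b in enumerate(contigs):
                if i != j:
                    k = overlap(a, b, min_len)
                    if k > best_len:
                        best_len, best_i, best_j = k, i, j
        if best_len == 0:
            return contigs
        merged = contigs[best_i] + contigs[best_j][best_len:]
        contigs = [c for idx, c in enumerate(contigs)
                   if idx not in (best_i, best_j)] + [merged]


reads = ["ATGCGTAC", "GTACGTTA", "GTTAGC"]
print(greedy_assemble(reads))  # ['ATGCGTACGTTAGC']
```

If the same stretch of sequence occurred at two places in the genome, reads ending in that stretch would overlap equally well with reads from either location, and a purely overlap-based merge like this one could join the wrong neighbors. That ambiguity is what pairing information from the two ends of an insert helps to resolve.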

A genomic sequence map provides the raw, encrypted text of the genome. The job of bioinformatics is to interpret this encrypted information. For the analysis of gene products, computational techniques are used to identify ORFs and noncoding RNAs, and then to integrate these results with available experimental evidence for transcript structures (cDNA sequences), protein similarities, and knowledge of characteristic sequence motifs.
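As an illustration of the computational identification of ORFs, the sketch below scans all six reading frames of a DNA string for stretches that begin with ATG and end at the first in-frame stop codon. It is a deliberately simplified example: it ignores introns, alternative start codons, and the statistical and experimental evidence that real annotation pipelines combine. The names `find_orfs`, `reverse_complement`, and `min_codons` are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_orfs(seq: str, min_codons: int = 2):
    """Find simple ATG...stop ORFs in all six reading frames.

    Returns tuples (strand, start, end, orf_sequence). For the "-" strand,
    start and end are coordinates on the reverse-complemented sequence.
    `min_codons` counts codons before the stop, including the ATG.
    """
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i  # open an ORF at the first in-frame ATG
                elif start is not None and codon in STOP_CODONS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and keep scanning
    return orfs


print(find_orfs("CCATGAAATTTGGGTAACC"))
# [('+', 2, 17, 'ATGAAATTTGGGTAA')]
```

Candidate ORFs found this way are only hypotheses. In practice they would be weighed against cDNA evidence, similarity to known proteins, and characteristic sequence motifs, as described above.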

One of the most powerful means to advance the analysis and annotation of a genome is comparison with the genomes of related species. Conservation of sequences among species is a reliable guide to identifying functional sequences in the complex genomes of many animals and plants. Comparative genomics can also reveal how genomes have changed in the course of evolution and how these changes may relate to differences in physiology, anatomy, or behavior among species. Comparisons among individual human genomes are accelerating the discovery of rare disease mutations. In bacterial genomics, comparisons of pathogenic and nonpathogenic strains have revealed many differences in gene content that contribute to pathogenicity.
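One simple way to picture sequence conservation is to slide a window along two already-aligned sequences and record the fraction of identical positions in each window. The sketch below assumes the alignment has been produced elsewhere and that gaps are written as "-". The function name `window_identity` and the window size are hypothetical choices for illustration, not a standard tool.

```python
def window_identity(a: str, b: str, window: int = 4):
    """Fraction of identical, non-gap positions in each sliding window.

    `a` and `b` must be two rows of the same alignment (equal length).
    Returns one value per window start position.
    """
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    matches = [x == y and x != "-" for x, y in zip(a, b)]
    return [sum(matches[i:i + window]) / window
            for i in range(len(a) - window + 1)]


species_1 = "ACGTACGTAC"
species_2 = "ACGTTTTTAC"
print(window_identity(species_1, species_2))
# [1.0, 0.75, 0.5, 0.25, 0.25, 0.5, 0.75]
```

Windows with high identity across species are candidates for functional sequence, whereas low-identity windows suggest regions under weaker constraint. Real comparative analyses use many species and explicit models of neutral substitution rather than raw identity.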

Functional genomics attempts to understand the working of the genome as a whole system. Two key elements are the transcriptome, the set of all transcripts produced, and the interactome, the set of interacting gene products and other molecules that together enable a cell to be produced and to function. The function of individual genes and gene products for which classical mutations are not available can be tested through reverse genetics—by targeted mutation or phenocopying.