5.1 The Proteome Is the Functional Representation of the Genome

Every year, the genomes of more organisms are sequenced, revealing their exact DNA base sequences and the number of genes they encode. For example, researchers concluded that the roundworm Caenorhabditis elegans has a genome of 97 million bases and about 19,000 protein-encoding genes, whereas the genome of the fruit fly Drosophila melanogaster contains 180 million bases and about 14,000 protein-encoding genes. The completely sequenced human genome contains 3 billion bases and about 23,000 protein-encoding genes. But this genomic knowledge is analogous to a list of parts for a car: it does not explain which parts are present in different components or how the parts work together. A new word, the proteome, has been coined to signify a more complex level of information content: the level of functional information, which encompasses the types, functions, and interactions of proteins that yield a functional unit.

The term proteome is derived from proteins expressed by the genome. The genome provides a list of gene products that could be present, but only a subset of these gene products will actually be expressed in a given biological context. The proteome tells us what proteins are functionally present. Unlike the genome, the proteome is not a fixed characteristic of the cell. Rather, because it represents the functional expression of information, it varies with cell type, developmental stage, and environmental conditions, such as the presence of hormones. Moreover, proteins can be enzymatically modified in a variety of ways. Furthermore, these proteins do not exist in isolation; they often interact with one another to form complexes with specific functional properties.
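
The distinction drawn above, a fixed genome versus a context-dependent proteome, can be pictured as a set and its subsets. The following Python sketch is purely illustrative: the gene names, the context labels, and the expression table are hypothetical placeholders, not experimental data. It is meant only to show that one genome can give rise to different proteomes.

```python
# Hypothetical genome: the fixed list of gene products that *could* be present.
GENOME = frozenset({"geneA", "geneB", "geneC", "geneD", "geneE"})

# Hypothetical expression table: which gene products are actually present
# in each biological context (cell type plus an environmental condition).
EXPRESSED = {
    ("liver", "no hormone"): {"geneA", "geneB"},
    ("liver", "hormone"): {"geneA", "geneB", "geneD"},
    ("muscle", "no hormone"): {"geneA", "geneC", "geneE"},
}


def proteome(cell_type, condition):
    """Return the set of proteins present in one biological context."""
    present = EXPRESSED[(cell_type, condition)]
    # Every expressed protein must be encoded somewhere in the genome.
    assert present <= GENOME
    return frozenset(present)


if __name__ == "__main__":
    # Same genome, different proteome in each context.
    for context in sorted(EXPRESSED):
        print(context, sorted(proteome(*context)))
```

The sketch deliberately leaves out enzymatic modification and the assembly of proteins into complexes, both of which add further layers of variation that a simple set of names cannot capture.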

An understanding of the proteome is acquired by isolating, characterizing, and cataloging proteins. In some, but not all, cases, this process begins by separating a particular protein from all other biomolecules in the cell.