The classic method for determining the amino acid sequence of a protein is Edman degradation. In this procedure, the free amino group of the N-
Before about 1985, biologists commonly used Edman degradation to determine protein sequences. Now, however, complete protein sequences are usually determined by analysis of genome and messenger RNA sequences. The complete genomes of many organisms have already been sequenced, and the database of genome sequences from humans and numerous model organisms is expanding rapidly. As discussed in Chapter 6, the sequences of proteins can be deduced from DNA sequences that are predicted to encode proteins.
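To make the idea of deducing a protein sequence from DNA concrete, the following is a minimal sketch in Python. It assumes the standard genetic code, that the input is the coding strand of an intron-free sequence (such as a cDNA), and that translation starts at the first ATG. The function name `translate_orf` and the example sequence are hypothetical and chosen only for illustration. Real gene prediction is considerably more involved.

```python
# Build the standard genetic code: codons enumerated in T, C, A, G order,
# paired with one-letter amino acid codes ("*" marks a stop codon).
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODONS = [a + b + c for a in BASES for b in BASES for c in BASES]
CODON_TABLE = dict(zip(CODONS, AMINO_ACIDS))


def translate_orf(dna: str) -> str:
    """Translate from the first ATG to the first in-frame stop codon.

    Assumes `dna` is the coding strand, written 5' to 3'.
    Codons containing characters other than A, C, G, T become "X".
    """
    dna = dna.upper()
    start = dna.find("ATG")
    if start == -1:
        return ""  # no start codon found
    protein = []
    # Step through complete codons only; a trailing partial codon is ignored.
    for i in range(start, len(dna) - 2, 3):
        residue = CODON_TABLE.get(dna[i:i + 3], "X")
        if residue == "*":
            break  # stop codon ends the open reading frame
        protein.append(residue)
    return "".join(protein)


if __name__ == "__main__":
    # Hypothetical sequence: two untranslated bases, then ATG GCC AAA TGA.
    print(translate_orf("ccATGGCCAAATGA"))
```

In this sketch the reading frame is fixed by the first ATG. A fuller treatment would scan all six reading frames and account for introns in genomic DNA.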
A powerful approach for determining the primary structure of an isolated protein combines MS and the use of sequence databases. First, the peptide mass fingerprint of the protein is obtained by MS. A peptide mass fingerprint is the list of the molecular weights of peptides that are generated from the protein by digestion with a specific protease, such as trypsin. The molecular weights of the parent protein and its proteolytic fragments are then used to search genome databases for any similar-