4.3 Protein Evolution and the Origin of New Proteins

The amino acid sequences of more than a million proteins are known, and the three-dimensional structure has been determined for more than 10,000 of them. While few of the sequences and structures are identical, many are sufficiently similar that the proteins can be grouped into about 25,000 protein families. A protein family is a group of proteins that are structurally and functionally related as a result of shared evolutionary history.

Why are there not more types of proteins? The number of possible sequences is unimaginably large. For example, for a polypeptide of only 62 amino acids, there are 20⁶² possible sequences (because each of the 62 positions could be occupied by any of the 20 amino acids). The number 20⁶² equals approximately 10⁸⁰; this number is also the estimated total number of electrons, protons, and neutrons in the entire universe! So why are there so few protein families? The most likely answer is that the chance that any random sequence of amino acids would fold into a stable configuration and carry out some useful function in the cell is very close to zero.
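The arithmetic behind this estimate can be checked directly. The short Python sketch below counts the sequences for a 62-residue polypeptide built from 20 amino acids and reports the order of magnitude. The function name `count_sequences` and the constant names are illustrative choices, not taken from the text.

```python
import math

# Values from the text: 20 standard amino acids, a polypeptide of 62 residues.
NUM_AMINO_ACIDS = 20
CHAIN_LENGTH = 62


def count_sequences(length, alphabet_size=NUM_AMINO_ACIDS):
    """Return the number of distinct sequences of the given length.

    Each position can independently hold any of `alphabet_size` residues,
    so the count is alphabet_size ** length.
    """
    return alphabet_size ** length


def order_of_magnitude(length, alphabet_size=NUM_AMINO_ACIDS):
    """Return log10 of the sequence count, i.e. length * log10(alphabet_size)."""
    return length * math.log10(alphabet_size)


if __name__ == "__main__":
    total = count_sequences(CHAIN_LENGTH)
    print(f"Possible sequences: {total:.2e}")
    print(f"log10 of that count: {order_of_magnitude(CHAIN_LENGTH):.2f}")
```

Because 20 = 2 × 10, the exact count is 2⁶² × 10⁶², or about 4.6 × 10⁸⁰, which agrees with the rounded figure of 10⁸⁰ to within an order of magnitude.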