The primary structure of a protein is its amino acid sequence

The precise sequence of amino acids in a polypeptide chain held together by peptide bonds constitutes the primary structure of a protein (see Figure 3.7A). The backbone of the polypeptide chain consists of the repeating sequence —N—C—C— made up of the N atom from the amino group, the α C atom, and the C atom from the carboxyl group in each amino acid.

The single-letter abbreviations for amino acids (see Table 3.2) are used to record the amino acid sequence of a protein. Here, for example, are the first 20 amino acids (out of a total of 124) in the protein ribonuclease from a cow:

KETAAAKFERQHMDSSTSAA
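
As a rough illustration of how the single-letter notation works, the sketch below expands the ribonuclease fragment above into the standard three-letter abbreviations. The names `ONE_TO_THREE`, `RNASE_FRAGMENT`, and `to_three_letter` are hypothetical, chosen only for this example. The mapping is the standard one for the 20 common amino acids and is assumed to agree with Table 3.2.

```python
# Standard one-letter -> three-letter codes for the 20 common amino acids.
# (Assumed to match Table 3.2; the dictionary name is illustrative.)
ONE_TO_THREE = {
    "A": "Ala", "R": "Arg", "N": "Asn", "D": "Asp", "C": "Cys",
    "Q": "Gln", "E": "Glu", "G": "Gly", "H": "His", "I": "Ile",
    "L": "Leu", "K": "Lys", "M": "Met", "F": "Phe", "P": "Pro",
    "S": "Ser", "T": "Thr", "W": "Trp", "Y": "Tyr", "V": "Val",
}

# First 20 residues of bovine ribonuclease, as given in the text.
RNASE_FRAGMENT = "KETAAAKFERQHMDSSTSAA"


def to_three_letter(sequence: str) -> str:
    """Expand a one-letter sequence into hyphen-joined three-letter codes.

    Raises KeyError if the sequence contains a letter outside the 20
    standard codes.
    """
    return "-".join(ONE_TO_THREE[residue] for residue in sequence)


if __name__ == "__main__":
    print(len(RNASE_FRAGMENT), "residues")
    print(to_three_letter(RNASE_FRAGMENT))
```

Read left to right, the expanded string begins Lys-Glu-Thr, which shows how much more compact the single-letter form is for recording a primary structure.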

The theoretical number of different proteins is enormous. Since there are 20 different amino acids, there could be 20 × 20 = 400 distinct dipeptides (two linked amino acids) and 20 × 20 × 20 = 8,000 different tripeptides (three linked amino acids). Imagine this process of multiplying by 20 extended to a protein made up of 100 amino acids (which would be considered a small protein). There could be 20¹⁰⁰ (that’s approximately 10¹³⁰) such small proteins, each with its own distinctive primary structure. How large is the number 20¹⁰⁰? Physicists tell us there aren’t that many electrons in the entire universe!
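
The counting argument can be checked directly. The following is a minimal sketch; the function name `count_sequences` is hypothetical. The figure of 10^80 is a commonly quoted rough estimate for the number of particles in the observable universe. It is an assumption brought in for comparison and does not come from the text.

```python
import math

# Number of standard amino acids available at each position.
ALPHABET_SIZE = 20

# Assumption (not from the text): a commonly quoted rough estimate
# of the number of particles in the observable universe.
ROUGH_PARTICLE_ESTIMATE = 10 ** 80


def count_sequences(length: int, alphabet_size: int = ALPHABET_SIZE) -> int:
    """Number of distinct sequences of the given length.

    Each position can hold any one of `alphabet_size` residues
    independently, so the count is alphabet_size ** length.
    """
    return alphabet_size ** length


if __name__ == "__main__":
    print("dipeptides: ", count_sequences(2))   # 20 x 20
    print("tripeptides:", count_sequences(3))   # 20 x 20 x 20

    small_protein = count_sequences(100)
    # math.log10 accepts arbitrarily large Python integers.
    print("log10(20**100) =", round(math.log10(small_protein), 2))
    print("exceeds rough particle estimate:",
          small_protein > ROUGH_PARTICLE_ESTIMATE)
```

Because log10(20) is about 1.301, the exponent for a 100-residue chain comes out near 130, which is where the approximation of 10 to the power 130 in the paragraph above comes from.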

The sequence of amino acids in the polypeptide chain determines its final shape. The properties associated with each functional group in the side chains of the amino acids (see Table 3.2) determine how the protein can twist and fold, thus adopting a specific stable structure that distinguishes it from every other protein.