Most new functions arise following gene duplication

Gene duplication is yet another way in which genomes can acquire new functions. When a gene is duplicated, one copy of that gene is potentially freed from having to perform its original function. The initially identical copies of a duplicated gene can have any one of four subsequent fates:

  1. Both copies of the gene may retain their original function (which can result in a change in the amount of gene product that is produced by the organism).

  2. Both copies of the gene may retain the ability to produce the original gene product, but the expression of the genes may diverge in different tissues or at different times in development.

  3. One copy of the gene may be incapacitated by the accumulation of deleterious substitutions and become a nonfunctional pseudogene, or may be eliminated from the genome altogether.

  4. One copy of the gene may retain its original function while the second copy accumulates enough substitutions that it can perform a different function.

How often do *gene duplications arise, and which of these four outcomes is most likely? Investigators have found that rates of gene duplication are high enough for a yeast or Drosophila population to acquire several hundred duplicate genes over the course of a million years. They have also found that most of the duplicated genes in these organisms are very young. Many extra genes are lost from a genome within 10 million years (which is rapid on an evolutionary time scale).


*Connect the concepts: Key Concept 19.1 describes shared developmental mechanisms controlled by specific DNA sequences that have been modified and reshuffled to produce the remarkable diversity of plants, animals, and other organisms we know today. Similarity in the homeobox sequence common to the Hox genes suggests that the Hox genes arose through duplication of an ancestral gene, which then diverged to take on new functions.

Some genes may be duplicated many times, resulting in large numbers of related pseudogenes scattered throughout the genome. In the human genome, the functional copy of the ribosomal protein gene RPL21 is located on chromosome pair 13, but pseudogenes derived from it are found on most of the other chromosome pairs (Figure 23.9). Although not all genes are represented by pseudogenes, there are nearly as many known pseudogenes in the human genome as there are functional protein-coding genes.

Figure 23.9 Some Functional Genes Are Duplicated Many Times as Nonfunctional Pseudogenes. (A) The functional gene that encodes ribosomal protein RPL21 is located on human chromosome 13 (indicated in orange). In addition, there are many nonfunctional pseudogenes of RPL21 in the human genome, produced through repeated duplication events (indicated in blue). (B) Although RPL21 represents a relatively extreme example of pseudogene duplication, there are almost as many known pseudogenes in the human genome as there are functional genes.

Although many extra genes disappear rapidly, some duplication events lead to the evolution of genes with new functions. Several successive rounds of duplication and mutation may result in a gene family: a group of homologous genes with related functions, often arrayed in tandem along a chromosome. An example of this process is provided by the globin gene family (see Figure 17.9). The globins were among the first proteins to be sequenced and compared. Comparisons of their amino acid sequences strongly suggest that the different globins arose via gene duplications. These comparisons also allow us to estimate how long the globins have been evolving separately, because differences among these proteins have accumulated with time.

Hemoglobin, a tetramer (four-subunit molecule) consisting of two α-globin and two β-globin polypeptide chains, carries oxygen in blood. Myoglobin, a monomer, is the primary O₂ storage protein in muscle. Myoglobin’s affinity for O₂ is much higher than that of hemoglobin, but hemoglobin has evolved to be more diversified in its roles. Hemoglobin binds O₂ in the lungs or gills, where the O₂ concentration is relatively high, transports it to deep body tissues, where the O₂ concentration is low, and releases it in those tissues. With its more complex tetrameric structure, hemoglobin is able to carry four molecules of O₂, as well as hydrogen ions and carbon dioxide, in the blood.

To estimate the time of the globin gene duplication that gave rise to the α- and β-globin gene clusters, we can create a gene tree—a phylogenetic tree that describes the evolutionary history of particular genes or gene families, in this case the gene sequences that encode the various globins (Figure 23.10). The rate of molecular evolution of globin genes has been estimated from other studies, using the divergence times of groups of vertebrates that are well documented in the fossil record. These studies indicate an average rate of divergence for globin genes of about 1 nucleotide substitution every 2 million years. By applying this rate to the globin gene tree, we can estimate the divergence time of the two globin gene clusters at about 450 million years ago.
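The arithmetic behind such a molecular clock estimate can be sketched in a few lines of Python. This is a minimal illustration, not the published analysis: it assumes a strictly constant clock and treats the quoted rate (1 substitution every 2 million years) as the rate at which differences accumulate between the two gene copies being compared. The function name and the substitution count are hypothetical; the count was chosen only so that the result matches the 450-million-year figure given above.

```python
def divergence_time_my(substitutions, my_per_substitution=2.0):
    """Estimate time since divergence, in millions of years (My).

    Assumes a constant molecular clock: differences between two gene
    copies accumulate at one substitution per `my_per_substitution` My.
    """
    if substitutions < 0 or my_per_substitution <= 0:
        raise ValueError("count must be non-negative and the rate positive")
    return substitutions * my_per_substitution


# Hypothetical count, chosen only so the arithmetic reproduces the
# 450-million-year estimate quoted in the text; it is not measured data.
hypothetical_substitutions = 225

print(divergence_time_my(hypothetical_substitutions))  # 450.0
```

Real analyses are more involved: they typically correct for multiple substitutions at the same site and allow rates to vary among lineages, so this sketch shows only the core proportionality between accumulated differences and elapsed time.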

Figure 23.10 A Globin Family Gene Tree. A molecular clock analysis suggests that the α-globin (blue) and β-globin (green) gene clusters diverged about 450 million years ago, soon after the origin of the vertebrates.

Question: When did the gene duplication event that gave rise to the delta and beta chains occur?

Answer: Approximately 100 million years ago.

Activity 23.3 Gene Tree Construction

www.life11e.com/ac23.3

Many gene duplications affect only one or a few genes at a time, but entire genomes are duplicated in polyploid organisms (which include many plants). When all the genes are duplicated, there are massive opportunities for new functions to evolve. That is exactly what appears to have happened in the evolution of vertebrates. The genomes of the jawed vertebrates appear to have four diploid sets of many major genes, which led biologists to conclude that two genome-wide duplication events occurred in the ancestor of these species. These duplications have allowed considerable specialization of individual vertebrate genes, many of which are now highly tissue-specific in their expression.