Fig. 24.8 describes Rebecca Cann’s studies using mitochondrial DNA to map relationships among different populations of humans. Answer the questions after the figure to practice interpreting data and understanding experimental design. Some of these questions refer to concepts that are explained in the following two brief data analysis primers from a set of four available on LaunchPad:
You can find these primers by clicking on the button labeled “Resources” in the menu at the upper right on your main LaunchPad page. Within the following questions, click on “Primer Section” to read the relevant section from these primers. Click on “Key Terms” to see pop-up definitions of boldfaced terms.
“Found Adrift!” scream the headlines. Aboard a primitive raft in the South Indian Ocean, an extraordinary new life form is found. It’s definitely humanlike, but where does it fit on the human phylogeny?
We dub it “Mystery” and add its DNA to our data table in Fig. 24.8:
Where should it lie on the phylogeny?
Imagine, in an alternative universe, that our species originated in North America (the Native American population), rather than Africa. Given the same set of samples as in Fig. 24.8, what is the most likely phylogeny?
Here is some mitochondrial DNA sequence data from a study of a representative individual from each of several different human groups and a chimpanzee. Seven hundred base pairs were sequenced in total. Only the nucleotide sites that vary among the groups are represented in the table (for example, positions 1–17 are identical for the chimpanzee and all the human groups).
DNA position | 18 | 38 | 112 | 118 | 156 | 231 | 244 | 274 | 319 | 487 | 493 | 512 | 533 | 619 |
Chimp | T | G | G | G | A | C | G | C | G | C | T | A | G | C |
Group a | A | A | A | G | A | C | T | T | G | T | T | A | T | C |
Group b | A | G | G | T | C | C | T | C | G | T | C | G | T | C |
Group c | A | G | A | G | A | G | T | T | G | T | T | A | T | T |
Group d | A | G | G | G | C | C | T | C | A | T | T | G | T | C |
Which of the phylogenies below best represents the relationships among chimps and Groups a through d, based on the data in the table?
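One informal way to approach a table like this is to count the sites at which each pair of sequences differs. The sketch below does that for the 14 variable sites in the table. The names `sequences` and `count_differences` are illustrative, and a raw difference count is only a rough guide to relatedness, not a full tree-building method.

```python
from itertools import combinations

# Variable sites transcribed from the table above, in column order
# (positions 18, 38, 112, ..., 619), one string per row.
sequences = {
    "Chimp":   "TGGGACGCGCTAGC",
    "Group a": "AAAGACTTGTTATC",
    "Group b": "AGGTCCTCGTCGTC",
    "Group c": "AGAGAGTTGTTATT",
    "Group d": "AGGGCCTCATTGTC",
}


def count_differences(seq1, seq2):
    """Return the number of aligned sites at which two sequences differ."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(base1 != base2 for base1, base2 in zip(seq1, seq2))


# Pairwise difference counts for every pair of rows in the table.
pairwise = {
    (name1, name2): count_differences(sequences[name1], sequences[name2])
    for name1, name2 in combinations(sequences, 2)
}

# List the pairs from most similar to least similar.
for (name1, name2), differences in sorted(pairwise.items(), key=lambda item: item[1]):
    print(f"{name1} vs {name2}: {differences} differing sites")
```

Pairs with few differences are candidates for close relatives, and the chimpanzee row serves as the outgroup for comparison.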
Mitochondrial DNA (mtDNA) is maternally inherited and Y-chromosome DNA is paternally inherited. In a new study, we sample multiple populations around the world for both Y-chromosome DNA and mtDNA and, through DNA sequencing, assess how genetically similar they are to each other using a measure of genetic difference in which 0 signifies genetically identical populations and 1 represents completely genetically distinct populations. We plot our data and generate regression lines as follows:
Regression coefficient: The slope of a straight line (such as a regression line) relating values of y to those of x.
Statistics
Correlation and Regression
Biologists often are also interested in the relation between two different measurements, such as height and weight or number of species on an island versus the size of the island. Such data are often depicted as a scatter plot (Figure 5), in which the magnitude of one variable is plotted along the x-axis and the other along the y-axis, each point representing one paired observation.
Figure 5A shows the sort of data that would correspond to fingerprint ridge count (the number of raised skin ridges lying between two reference points in each fingerprint). While the data show some scatter, the overall trend is evident. There is a very strong association between the average fingerprint ridge count of parents and that of their offspring. The strength of association between two variables can be measured by the correlation coefficient, which theoretically ranges between +1 and –1. A correlation coefficient of +1 means a perfect positive relation (as one variable increases, the other increases proportionally), and a correlation coefficient of –1 implies a perfect negative relation (as one variable increases, the other decreases proportionally). Correlation coefficients of +1 or –1 are rarely observed in real data. In the case of fingerprint ridge count, the correlation coefficient is 0.9, which implies that the average fingerprint ridge count of offspring is almost (but not quite) equal to that of the parents. For a complex trait, this is a remarkably strong correlation.
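The correlation coefficient described above can be computed directly from paired observations. The following is a minimal sketch of the standard Pearson formula. The function name `pearson_r` and the small data sets are made up for illustration and are not taken from Figure 5.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired observations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of products and squares of deviations from the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined when a variable is constant")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises in exact proportion to x (perfect positive relation).
r_positive = pearson_r([1, 2, 3, 4], [2, 4, 6, 8])

# Hypothetical data: y falls in exact proportion as x rises (perfect negative relation).
r_negative = pearson_r([1, 2, 3, 4], [8, 6, 4, 2])

print(r_positive, r_negative)
```

Real data with scatter, such as the ridge counts described above, would give a value strictly between these two extremes.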
Figure 5B represents data that would correspond to adult height. The data exhibit greater scatter than in Figure 5A; however, there is still a fairly strong resemblance between parents and offspring. The correlation coefficient in this case is 0.5. This value means that, on average, the height of the offspring is approximately halfway between the average height of the parents and the average height of the population as a whole.
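The "halfway" reading can be made concrete with a little arithmetic. The sketch below follows the interpretation given in the text and treats the coefficient as the fraction of the parents' deviation from the population average that is expected, on average, in the offspring. The function name and the heights (in centimeters) are hypothetical and chosen only for illustration.

```python
def expected_offspring_value(parent_average, population_average, coefficient):
    """Expected offspring value under the interpretation used in the text.

    The offspring are expected to deviate from the population average by
    `coefficient` times the deviation of the parents' average.
    """
    deviation = parent_average - population_average
    return population_average + coefficient * deviation


# Hypothetical heights in cm: parents average 180, population averages 170.
height_prediction = expected_offspring_value(180.0, 170.0, 0.5)

# The same hypothetical parents with a coefficient of 0.9 instead.
strong_prediction = expected_offspring_value(180.0, 170.0, 0.9)

print(height_prediction, strong_prediction)
```

With a coefficient of 0.5 the prediction lands midway between 170 and 180, whereas a coefficient of 0.9 puts it much closer to the parents' average.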
The illustrations in Figures 5A and 5B also emphasize one limitation of the correlation coefficient. The correlation coefficient measures the strength of a straight-line (linear) relation. A nonlinear relation (one curving upward or downward) between two variables could be quite strong, but the data might still show a weak correlation.
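A small constructed example illustrates this limitation. In the sketch below, y is completely determined by x (y equals x squared), yet the correlation coefficient is zero because the relation is a symmetric curve rather than a straight line. The data are invented for this purpose, and `pearson_r` is an illustrative helper name.

```python
import math


def pearson_r(xs, ys):
    """Return the Pearson correlation coefficient of paired observations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Constructed data: x is symmetric around zero and y depends entirely on x.
curve_x = [-3, -2, -1, 0, 1, 2, 3]
curve_y = [x ** 2 for x in curve_x]

# Despite the exact dependence, the linear correlation vanishes.
r_curve = pearson_r(curve_x, curve_y)
print(r_curve)
```

Plotting the data before computing a correlation coefficient is therefore a sensible first step.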
Each of the straight lines in Figure 5 is a regression line or, more precisely, a regression line of y on x. Each line depicts how, on average, the variable y changes as a function of the variable x across the whole set of data. The slope of the line tells you how many units y changes, on average, for a unit change in x. A slope of +1 implies that a one-unit change in x results in a one-unit change in y, and a slope of 0 implies that the value of x has no effect on the value of y. The slope of a straight line relating values of y to those of x is known as the regression coefficient.
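As a rough illustration, the regression coefficient can be estimated by ordinary least squares, which is one common way of fitting a regression line of y on x. The primer does not say how the lines in Figure 5 were fitted, so the method is an assumption here, and the function name and data are hypothetical.

```python
def regression_line(xs, ys):
    """Fit a least-squares regression line of y on x.

    Returns (slope, intercept); the slope is the regression coefficient.
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("slope is undefined when x does not vary")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data in which y rises by 2 units for each 1-unit rise in x.
slope, intercept = regression_line([1, 2, 3, 4], [3, 5, 7, 9])

# Hypothetical data in which y does not change with x at all.
flat_slope, flat_intercept = regression_line([1, 2, 3, 4], [5, 5, 5, 5])

print(slope, intercept, flat_slope, flat_intercept)
```

The first fit has a slope of 2 (y changes by two units per unit of x), and the second has a slope of 0, matching the case in which x has no effect on y.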