Chapter 1. Working With Data 35.11

Working with Data: HOW DO WE KNOW? Fig. 35.11

Figure 35.11 describes experiments conducted to measure the resting membrane potential of a squid giant axon and changes in the membrane potential when the cell was stimulated to fire an action potential. Answer the questions after the figure to practice interpreting data and understanding experimental design. These questions refer to concepts that are explained in the following three brief data analysis primers from a set of four available on LaunchPad.

  • Experimental Design
  • Data and Data Presentation
  • Statistics

You can find these primers by clicking on “Experiments and Data Analysis” in your LaunchPad menu. Within the following questions, click on “Primer Section” to read the relevant section from these primers. Click on the button labeled “Key Terms” to see pop-up definitions of boldfaced terms.

hypothesis: A tentative explanation for one or more observations that makes predictions that can be tested by experiments or additional observations.

Experimental Design

Experiments provide one way to make sense of the world. There are many different kinds of experiments, some of which begin with observations. Charles Darwin began with all kinds of observations—the relationship between living organisms and fossils, the distribution of organisms on the Earth, species found on islands and nowhere else—and inferred an evolutionary process to explain what he saw. Other experiments begin with data collection. For example, genome studies begin by collecting vast amounts of data—the sequence of nucleotides in all of the DNA of an organism—and then ask questions about the patterns that are found.

Such observations can lead to questions: Why are organisms adapted to their environment? Why are there so many endemic species (organisms found in one place and nowhere else) on islands? Why does the human genome contain vast stretches of DNA that do not code for protein?

Types of Hypotheses

A hypothesis, as we saw in Chapter 1, is a tentative answer to the question, an expectation of what the results might be. This might at first seem counterintuitive. Science, after all, is supposed to be unbiased, so why should you expect any particular result at all? The answer is that a hypothesis helps to organize the experimental setup and the interpretation of the data.

Let’s consider a simple example. We design a new medicine and hypothesize that it can be used to treat headaches. This hypothesis is not just a hunch—it is based on previous observations or experiments. For example, we might observe that the chemical structure of the medicine is similar to other drugs that we already know are used to treat headaches. If we went into the experiment with no expectation at all, it would be unclear what to measure.

A hypothesis is considered tentative because we don’t know what the answer is. The answer has to wait until we conduct the experiment and look at the data. When a hypothesis predicts a specific effect, as in the case of the new medicine, it is typical to also state a null hypothesis, which predicts no effect. Hypotheses are never proven, but it is possible based on statistical analysis to reject a hypothesis. When a null hypothesis is rejected, the original hypothesis gains support.
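As a rough sketch of how statistical analysis can lead us to reject a null hypothesis, the example below runs a permutation test, one of several possible methods and not necessarily the one used in any particular study. The headache durations, the function name `permutation_p_value`, and the number of shuffles are all assumptions made up for illustration. If the null hypothesis of no effect were true, the group labels would be interchangeable, so we shuffle them many times and count how often the shuffled difference is at least as large as the observed one.

```python
import random


def permutation_p_value(group_a, group_b, n_shuffles=10_000, seed=0):
    """Two-sided permutation test on the difference between group means."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)
    at_least_as_extreme = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign measurements to groups at random
        mean_a = sum(pooled[:n_a]) / n_a
        mean_b = sum(pooled[n_a:]) / (len(pooled) - n_a)
        # Small tolerance guards against floating-point rounding.
        if abs(mean_a - mean_b) >= observed - 1e-12:
            at_least_as_extreme += 1
    return at_least_as_extreme / n_shuffles


# Hypothetical headache durations in hours (invented for illustration only).
medicine = [2.1, 1.8, 2.5, 1.9, 2.2, 2.0]
placebo = [3.9, 4.2, 3.6, 4.0, 4.4, 3.8]

p_value = permutation_p_value(medicine, placebo)
print(f"Fraction of shuffles at least as extreme as observed: {p_value:.4f}")
```

A small fraction means the observed difference would be unusual if the medicine had no effect, which is grounds for rejecting the null hypothesis. It does not prove the original hypothesis.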

Sometimes, we formulate several alternative hypotheses to answer a single question. This may be the case when researchers consider different explanations of their data. Let’s say for example that we discover a protein that represses the expression of a gene. Our question might be: How does the protein repress the expression of the gene? In this case, we might come up with several models—the protein might block transcription, it might block translation, or it might interfere with the function of the protein product of the gene. Each of these models is an alternative hypothesis, one or more of which might be correct.

sample: The set of objects or events chosen from a larger population from which observations are taken. Typically, sampling is done randomly to ensure that the sample is representative of the total population of objects or events.

Data and Data Presentation

Collecting Data

We can collect data as part of an experiment. For example, we vary the nutrients added to cells in culture to see which are critical for cell proliferation, and count the number of cells after a specified period in each of the different experimental treatments. Or data collection may be exploratory. For example, if we are interested in what mammal species are present in a remote patch of forest, we can simply record what we see as we walk through the forest. The tools needed for data collection vary correspondingly, ranging from expensive, sophisticated scientific hardware to a notebook and a pencil.

Nowadays computers often collect data automatically, meaning that it is possible to accumulate vast quantities of data. With our ability to sequence DNA cheaply and efficiently, genomic data—long strings of A’s, G’s, C’s, and T’s—is an example of the current explosion of mega-datasets. Satellite imagery, as well, supplies a vast reservoir of data about our planet.

Almost always, data represent a sample. We assume that the cells in our experiment are representative of the appropriate class of cells in general, and we assume that the animals we saw in the forest patch are representative of all the animals present in the forest. With this in mind, we have to be careful in designing our method of collecting data. Imagine, for example, that in determining what mammal species live in our patch of forest, we only visited the forest during daylight hours. Any claim to have assessed the forest for all of its inhabitants then is inaccurate because we have overlooked nocturnal species.

Data sometimes need to be weeded. A freak result in one experiment, for example, might have been caused by contamination and should be removed from the analysis because the result is produced by factors unrelated to what we are investigating. Imagine, for example, that we encounter a domestic cat belonging to a local resident in our forest mammal inventory. Given that we are interested in the native mammals, we should exclude this intruder from our data. Data weeding, however, is a tricky area. It is important to exclude only data that are clearly problematic, rather than simply eliminating the data that seem to contradict our hypothesis!

mean: The arithmetic average of all the measurements (all the measurements added together and the result divided by the number of measurements); the peak of a normal distribution along the x-axis.

Statistics

The Normal Distribution

Figure 1

The first step in statistical analysis of data is usually to prepare some visual representation. In the case of height, this is easily done by grouping nearby heights together and plotting the result as a histogram like that shown in Figure 1. The smooth, bell-shaped curve approximating the histogram in Figure 1A is called the normal distribution. If you measured the height of more and more individuals, then you could make the width of each bar in the histogram narrower and narrower, and the shape of the histogram would gradually get closer and closer to the normal distribution.
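Grouping nearby heights into bars can be sketched in a few lines. The heights and the 2-inch bin width below are hypothetical values chosen only for illustration, and `bin_start` is a helper name introduced here, not something from the primer.

```python
from collections import Counter

# Hypothetical heights in inches (invented for illustration only).
heights = [61.2, 63.5, 64.1, 64.8, 65.0, 65.3, 66.7,
           67.1, 67.4, 68.0, 68.9, 70.2, 71.5, 72.3]

BIN_WIDTH = 2  # inches; narrower bins need more measurements to look smooth


def bin_start(value, width=BIN_WIDTH):
    """Return the lower edge of the bin that contains value."""
    return int(value // width) * width


# Count how many heights fall into each bin.
counts = Counter(bin_start(h) for h in heights)

# Print a crude text histogram, one row per bin.
for start in sorted(counts):
    print(f"{start}-{start + BIN_WIDTH} in: {'#' * counts[start]}")
```

With only 14 measurements the bars are coarse. Measuring more individuals would allow a smaller `BIN_WIDTH` while keeping the outline smooth.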

The normal distribution does not arise by accident but is a consequence of a fundamental principle of statistics which states that when many independent factors act together to determine the magnitude of a trait, the resulting distribution of the trait is normal. Human height is one such trait because it results from the cumulative effect of many different genetic factors as well as environmental effects such as diet and exercise. The cumulative effect of the many independent factors affecting height results in a normal distribution.
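A small simulation can make this principle concrete. In the sketch below, each simulated individual's trait is the sum of 50 independent random contributions. The number of factors, their uniform distribution, and the random seed are all arbitrary assumptions for illustration, not values from the primer. If the resulting distribution is close to normal, roughly 68% of the simulated values should fall within one standard deviation of the mean.

```python
import random
import statistics

rng = random.Random(42)   # fixed seed so the run is repeatable
N_FACTORS = 50            # assumed number of independent factors
N_INDIVIDUALS = 10_000    # assumed sample size


def simulate_trait(n_factors=N_FACTORS):
    """Sum many small, independent contributions to a trait."""
    return sum(rng.uniform(-1.0, 1.0) for _ in range(n_factors))


traits = [simulate_trait() for _ in range(N_INDIVIDUALS)]
mean = statistics.mean(traits)
sd = statistics.stdev(traits)

# Fraction of simulated individuals within one standard deviation of the mean.
within_one_sd = sum(abs(t - mean) <= sd for t in traits) / len(traits)
print(f"mean = {mean:.2f}, sd = {sd:.2f}, within 1 sd = {within_one_sd:.1%}")
```

No single factor is normally distributed here (each is uniform), yet their sum behaves approximately like a normal distribution.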

The normal distribution appears in countless applications in biology. Its shape is completely determined by two quantities. One is the mean, which tells you the location of the peak of the distribution along the x-axis (Figure 2). While we do not know the mean of the population as a whole, we do know the mean of the sample, which is equal to the arithmetic average of all the measurements—the value of all of the measurements added together and divided by the number of measurements.

Figure 2

In symbols, suppose we sample $n$ individuals and let $x_i$ be the value of the $i$th measurement, where $i$ can take on the values 1, 2, ..., $n$. Then the mean of the sample (often symbolized $\bar{x}$) is given by $\bar{x} = \frac{1}{n}\sum x_i$, where the symbol $\sum$ means “sum” and $\sum x_i$ means $x_1 + x_2 + \dots + x_n$.
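The sample mean can be computed directly from this definition. The five measurements below are hypothetical heights in inches, chosen only to keep the arithmetic easy to follow.

```python
# Hypothetical heights in inches (invented for illustration only).
measurements = [64.0, 66.5, 67.0, 68.5, 70.0]

n = len(measurements)        # number of measurements
total = sum(measurements)    # x_1 + x_2 + ... + x_n
sample_mean = total / n      # the sum divided by n

print(f"n = {n}, sum = {total}, mean = {sample_mean}")
```

Here the sum is 336.0 and n is 5, so the sample mean is 67.2 inches.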

For a normal distribution, the mean coincides with another quantity called the median. The median is the value along the x-axis that divides the distribution exactly in two—half the measurements are smaller than the median, and half are larger than the median. The mean of a normal distribution coincides with yet another quantity called the mode. The mode is the value most frequently observed among all the measurements.
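The three quantities can be compared using Python's standard `statistics` module. The sample below is hypothetical and was deliberately chosen to be symmetric around 67 so that the mean, median, and mode agree exactly. In a real sample drawn from a normal distribution they would usually agree only approximately.

```python
import statistics

# Hypothetical heights in inches, deliberately symmetric around 67.
measurements = [64, 66, 67, 67, 67, 68, 70]

mean = statistics.mean(measurements)      # arithmetic average
median = statistics.median(measurements)  # middle value of the sorted data
mode = statistics.mode(measurements)      # most frequently observed value

print(f"mean = {mean}, median = {median}, mode = {mode}")
```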

The second quantity that characterizes a normal distribution is its standard deviation (“s” in Figure 2), which measures the extent to which most of the measurements are clustered near the mean. A smaller standard deviation means a tighter clustering of the measurements around the mean. The true standard deviation of the entire population is unknown, but we can estimate it from the sample as

$$s = \sqrt{\frac{\sum (x_i - \bar{x})^2}{n - 1}}$$

What this complicated-looking formula means is that we calculate the difference between each individual measurement and the mean, square the difference, add these squares across the entire sample, divide by n - 1, and take the square root of the result. The division by n - 1 (rather than n) may seem mysterious; however, it has the intuitive explanation that it prevents anyone from trying to estimate a standard deviation based on a single measurement (because in that case n - 1 = 0).
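The steps just described translate almost line by line into code. This is a minimal sketch with hypothetical measurements. The function name `sample_standard_deviation` is introduced here for illustration, and its result is checked against `statistics.stdev` from Python's standard library, which also divides by n - 1.

```python
import math
import statistics


def sample_standard_deviation(xs):
    """Estimate the standard deviation from a sample, dividing by n - 1."""
    n = len(xs)
    if n < 2:
        # With a single measurement, n - 1 = 0 and no estimate is possible.
        raise ValueError("need at least two measurements")
    mean = sum(xs) / n
    squared_diffs = [(x - mean) ** 2 for x in xs]   # square each difference
    return math.sqrt(sum(squared_diffs) / (n - 1))  # divide, then square root


# Hypothetical heights in inches (invented for illustration only).
measurements = [64.0, 66.0, 68.0, 70.0, 72.0]

s = sample_standard_deviation(measurements)
print(f"s = {s:.4f} (statistics.stdev gives {statistics.stdev(measurements):.4f})")
```

For these values the mean is 68, the squared differences sum to 40, and dividing by n - 1 = 4 gives 10, so s is the square root of 10.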

In a normal distribution, approximately 68% of the observations lie within one standard deviation on either side of the mean (Figure 2, light blue), and approximately 95% of the observations lie within two standard deviations on either side of the mean (Figure 2, light and darker blue together). You may recall political polls of likely voters that refer to the margin of error; this is the term that pollsters use for roughly two times the standard deviation of the poll's estimate (its standard error). It is the margin within which the pollster can state with 95% confidence the true percentage of likely voters favoring each candidate at the time the poll was conducted.
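The 68% and 95% figures can be checked numerically. The sketch below relies on a standard result that the primer does not state: for a normal distribution, the fraction of values within k standard deviations of the mean equals erf(k / √2), where erf is the error function available as `math.erf`. The helper name `fraction_within` is an assumption introduced for illustration.

```python
import math


def fraction_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return math.erf(k / math.sqrt(2))


for k in (1, 2):
    print(f"within {k} standard deviation(s): {fraction_within(k):.4f}")
```

The computed fractions are about 0.683 and 0.954, which round to the 68% and 95% quoted above.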

Figure 3

For reasons rooted in the history of statistics, the standard deviation is often stated in terms of $s^2$ rather than $s$. The square of the standard deviation is called the variance of the distribution. Both the standard deviation and the variance are measures of how closely most data points are clustered around the mean. Not only is the standard deviation more easily interpreted than the variance (Figure 2), but also it is more intuitive in that standard deviation is expressed in the same units as the mean (for example, in the case of height, inches), whereas the variance is expressed in the square of the units (for example, inches²). On the other hand, the variance is the measure of dispersal around the mean that more often arises in statistical theory and the derivation of formulas. Figure 3 shows how increasing variance of a normal distribution corresponds to greater variation of individual values from the mean. Since all of the distributions in Figure 3 are normal, approximately 68% of the values lie within one standard deviation of the mean, and approximately 95% within two standard deviations of the mean.
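The point about units can be illustrated by converting the same hypothetical heights from inches to centimeters. If the standard deviation is in the units of the data, it should scale by the conversion factor (2.54), while the variance, being in squared units, should scale by the factor squared. The data values are invented for illustration.

```python
import statistics

INCH_TO_CM = 2.54

# Hypothetical heights (invented for illustration only).
heights_in = [64.0, 66.0, 68.0, 70.0, 72.0]
heights_cm = [h * INCH_TO_CM for h in heights_in]

sd_in = statistics.stdev(heights_in)        # inches
var_in = statistics.variance(heights_in)    # inches squared
sd_cm = statistics.stdev(heights_cm)        # centimeters
var_cm = statistics.variance(heights_cm)    # centimeters squared

print(f"variance equals sd squared: {var_in:.4f} vs {sd_in ** 2:.4f}")
print(f"sd ratio (cm/in): {sd_cm / sd_in:.4f}")
print(f"variance ratio (cm^2/in^2): {var_cm / var_in:.4f}")
```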

Another measure of how much the numerical values in a sample are scattered is the range. As its name implies, the range is the difference between the largest and the smallest values in the sample. The range is a less widely used measure of scatter than the standard deviation.
