One of the great early puzzles of vision, about which philosophers speculated for centuries, was that of depth perception. We automatically, effortlessly, see the world in three dimensions. Objects occupy and move in space that includes not only a vertical (up-down) and a horizontal (right-left) dimension but also a dimension of depth, or distance from our eyes. Our retinas and the images that are projected on them are two-dimensional. It is relatively easy to understand how our retinas might record the vertical and horizontal dimensions of our visual world, but how do they record the third dimension, that of depth?
How did Helmholtz describe perception as a problem-solving process?
A major step toward answering this question was the publication of a treatise on vision by Hermann von Helmholtz (1867/1962), the same German physiologist who developed the trichromatic theory of color vision. He argued that seeing is an active mental process. The light focused onto our retinas is not the scene we see but is simply a source of hints about the scene. Our brain infers the characteristics and positions of objects from cues in the reflected light, and those inferences are our perceptions. Helmholtz pointed out that the steps in this inferential process can be expressed mathematically, in equations relating information in the reflected light to conclusions about the positions, sizes, and shapes of objects in the visual scene. We are not conscious of these calculations and inferences; our brain works them out quickly and automatically, without our awareness.
Depth perception works best when you use both eyes. You can prove that with a simple demonstration. Pick up two pencils and hold them in front of you, one in each hand, with their points toward each other. Close one eye and move the pencils toward each other to make the points touch at their tips. Chances are, you will miss by a little bit on your first attempt, and your subsequent adjustments will have something of a trial-and-error quality. Now repeat the task with both eyes open. Is it easier? Do you find that now you can see more clearly which adjustments to make to bring the points together and that you no longer need trial and error? You can see depth reasonably well with one eye alone, but you can see it better with both eyes together.
Binocular Cues for Depth
How does binocular disparity serve as a cue for depth?
The most important binocular (two-eye) cue for depth perception is binocular disparity, which refers to the slightly different (disparate) views that the two eyes have of the same object or scene. Because the eyes are a few centimeters apart, they view any given object from slightly different angles. To see how the degree of binocular disparity varies depending on an object’s distance from your eyes, hold your index finger about a foot in front of your eyes with a wall in the background. Look at your finger with just your right eye open, then with just your left eye open, alternately back and forth. As you alternate between the two eyes, your finger appears to jump back and forth with respect to the background wall. That is because each eye views the finger from a different angle. Now move your finger farther away (out to full arm’s length), and notice that your view of it jumps a smaller distance with respect to the wall as you again alternate between right-eye and left-eye views. The farther your finger is from your eyes, the smaller is the difference in the angle between each eye and the finger; the two lines of sight become increasingly parallel.
Thus, the degree of disparity between the two eyes’ views can serve as a cue to judge an object’s distance from the eyes: The less the disparity, the greater the distance. In normal, binocular vision your brain fuses the two eyes’ views to give a perception of depth. Helmholtz (1867/1962) showed mathematically how the difference in two objects’ distance from the eyes can be calculated from differences in the degree of binocular disparity. More recently, researchers have found that neurons in an area of the visual cortex close to the primary visual area respond best to stimuli that are presented to both eyes at slightly disparate locations on the retina (Thomas et al., 2002). These neurons appear to be ideally designed to permit depth perception. For another demonstration of binocular disparity, see Figure 8.26.
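The geometry behind the finger demonstration can be sketched numerically. The following is a minimal illustration, not Helmholtz's own formulation: it assumes an eye separation of 6.5 cm, a background wall 300 cm away, and an object located straight ahead of the viewer. The function names and all numeric values are hypothetical and chosen only for illustration.

```python
import math

# Assumed typical distance between the two eyes; illustrative value only.
EYE_SEPARATION_CM = 6.5


def vergence_angle_deg(distance_cm, eye_separation_cm=EYE_SEPARATION_CM):
    """Angle between the two eyes' lines of sight to a point straight ahead."""
    return math.degrees(2 * math.atan((eye_separation_cm / 2) / distance_cm))


def relative_disparity_deg(object_cm, background_cm):
    """Difference in vergence angle between an object and a farther background."""
    return vergence_angle_deg(object_cm) - vergence_angle_deg(background_cm)


WALL_CM = 300.0  # hypothetical distance to the background wall
finger_near = relative_disparity_deg(30.0, WALL_CM)  # finger about a foot away
finger_far = relative_disparity_deg(60.0, WALL_CM)   # finger at arm's length

print(f"near finger: {finger_near:.1f} deg, far finger: {finger_far:.1f} deg")
```

Under these assumptions the disparity relative to the wall shrinks as the finger moves away, which matches the smaller jump seen in the demonstration.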
Illusions of Depth Created by Binocular Disparity
How do stereoscopes provide an illusion of depth?
The ability to see depth from binocular disparity—an ability called stereopsis—was first demonstrated in the early nineteenth century by Charles Wheatstone (described by Helmholtz, 1867/1962). Wheatstone wondered what would happen if he drew two slightly different pictures of the same object or scene, one as seen by the left eye and one as seen by the right, and then viewed them simultaneously, each with the appropriate eye. To permit such viewing, he invented a device called a stereoscope. The effect was dramatic. When viewed through the stereoscope, the two pictures were fused perceptually into a single image containing depth.
Stereoscopes became a great fad in the late nineteenth century as they enabled people to see scenes such as Buckingham Palace or the Grand Canyon in full depth by placing cards that contained two photographs of the same scene, shot simultaneously from slightly different angles, into their stereoscope. The Viewmaster, a once-popular child’s toy, is a modern example of a stereoscope. Three-dimensional motion pictures and comic books employ the same general principle. In the simplest versions, each frame of the film or comic strip contains an overlapping pair of similar images, each in a different color, one slightly displaced from the other, and the viewer wears colored glasses that allow only one image to enter each eye. You can demonstrate stereopsis without any special viewer by looking at the two patterns in Figure 8.27 in the manner described in the caption.
Monocular Cues for Depth
Although depth perception is most vivid with two eyes, it is by no means absent with one. People who have just one functioning eye can drive cars, shoot basketballs, and reach out and pick objects up without fumbling around.
How does motion parallax serve as a cue for depth, and how is it similar to binocular disparity?
Motion Parallax Among the monocular (one-eye) cues for depth, perhaps the most valuable is motion parallax, which refers to the changed view one has of a scene or object when one’s head moves sideways. To demonstrate motion parallax, hold your finger up in front of your face and view it with one eye as you rock your head back and forth. As your head moves, you gain different views of the finger, and you see it being displaced back and forth with respect to the wall in the background. If you now move your finger farther away from your eye, the same head movement produces a less-changed view. Thus, the degree of change in either eye’s view at one moment compared with the next, as the head moves in space, can serve as a cue for assessing the object’s distance from the eyes: The smaller the change, the greater the distance.
As you can infer from this demonstration, motion parallax is very similar to binocular disparity. In fact, binocular disparity is sometimes called binocular parallax. The word parallax refers to the apparent change in an object or scene that occurs when it is viewed from a new vantage point. In motion parallax, the changed vantage point comes from the movement of the head, and in binocular parallax (or disparity), it comes from the separation of the two eyes.
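The same kind of geometry applies to motion parallax. As a rough sketch, assume the object starts straight ahead and the head shifts sideways by a fixed amount; the change in viewing direction can then be computed for objects at several distances. The 10 cm head shift and the function name are assumptions made for illustration.

```python
import math


def direction_change_deg(head_shift_cm, distance_cm):
    """Change in viewing direction to an object that was straight ahead,
    after the head moves sideways by head_shift_cm."""
    return math.degrees(math.atan(head_shift_cm / distance_cm))


HEAD_SHIFT_CM = 10.0  # hypothetical sideways head movement
for distance_cm in (30.0, 60.0, 300.0):
    change = direction_change_deg(HEAD_SHIFT_CM, distance_cm)
    print(f"object at {distance_cm:.0f} cm: view changes by {change:.1f} deg")
```

The farther the object, the smaller the change produced by the same head movement, just as the vantage-point difference between the two eyes produces less disparity for distant objects.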
What are some cues for depth that exist in pictures as well as in the actual, three-dimensional world?
Pictorial Cues Motion parallax depends on the geometry of true three-dimensionality and cannot be used to depict depth in two-dimensional pictures. All the remaining monocular depth cues, however, can provide a sense of depth in pictures as well as in the real three-dimensional world, and thus they are called pictorial cues for depth. You can identify some of these by examining Figure 8.28 and considering all the reasons why you see some objects in the scene as standing in the foreground and others as more distant. The pictorial cues include the following.
Why does size perception depend on distance perception?
The ability to judge the size of an object is intimately tied to the ability to judge its distance. As Figure 8.30 illustrates, the size of the retinal image of an object is inversely proportional to the object’s distance from the retina. Thus, if an object is moved twice as far away, it produces a retinal image half the height and width of the one it produced before. You don’t see the object as smaller, though, just farther away. The ability to see an object as unchanged in size, despite change in the image size as it moves farther away or closer, is called size constancy. For familiar objects, such as a pencil or a car, previous knowledge of the object’s usual size may contribute to size constancy. But size constancy also occurs for unfamiliar objects if cues for distance are available, and even familiar objects can appear to be drastically altered in size if misleading distance cues are present (for an example, see Figure 8.31).
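The inverse relation between image size and distance, and the way a distance estimate can undo it, can be sketched with similar triangles. This is a simplified model that treats the eye as a pinhole with the retina an assumed 1.7 cm behind it; the function names and numbers are hypothetical.

```python
# Assumed distance from the eye's optics to the retina; illustrative value only.
EYE_DEPTH_CM = 1.7


def retinal_image_height_cm(object_height_cm, distance_cm, eye_depth_cm=EYE_DEPTH_CM):
    """Image height by similar triangles: inversely proportional to distance."""
    return object_height_cm * eye_depth_cm / distance_cm


def inferred_object_height_cm(image_height_cm, judged_distance_cm, eye_depth_cm=EYE_DEPTH_CM):
    """Object height implied by an image height and a judged distance."""
    return image_height_cm * judged_distance_cm / eye_depth_cm


PENCIL_CM = 15.0  # hypothetical object height
for distance_cm in (100.0, 200.0):
    image = retinal_image_height_cm(PENCIL_CM, distance_cm)
    recovered = inferred_object_height_cm(image, distance_cm)
    print(f"at {distance_cm:.0f} cm: image {image:.4f} cm, inferred height {recovered:.1f} cm")
```

Doubling the distance halves the image, yet when the judged distance is accurate the inferred height stays the same, which is the pattern that size constancy describes. If the judged distance is wrong, the inferred height is wrong by the same factor.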
How might the unconscious assessment of depth provide a basis for the Ponzo, Müller-Lyer, and moon illusions?
It is not difficult to produce drawings in which two identical lines or objects appear to be different in size. Two classic examples are the Ponzo illusion (first described by Mario Ponzo in 1913) and the Müller-Lyer illusion (first described by F. C. Müller-Lyer in 1889), both illustrated in Figure 8.32. In each illusion two horizontal bars appear to be different in length; but if you measure them, you will discover that they are in fact identical.
Richard Gregory (1968) offered a depth-processing theory to account for these and various other size illusions. This theory—consistent with everything said so far about the relation between size and distance—maintains that one object in each illusion appears larger than the other because of distance cues that, at some early stage of perceptual processing, lead it to be judged as farther away. If one object is judged to be farther away than the other but the two produce the same-size retinal image, then the object judged as farther away will be judged as larger. This theory applies most readily to the Ponzo illusion, in which the two converging lines provide the depth cue of linear perspective, causing (according to the theory) the upper bar to be judged as farther away, and hence larger, than the lower one. The photograph in Figure 8.33 makes this point clear.
The application of the depth-processing theory to the Müller-Lyer illusion is a bit more subtle. The assumption here is that people register the figures as three-dimensional objects, something like sawhorses viewed from above. The object with wings extending outward (top drawing in Figure 8.34) resembles an upside-down sawhorse, with legs toward the viewer, and the one with inward wings (bottom drawing) resembles a right-side-up sawhorse, with its horizontal bar closer to the observer. If real sawhorses were viewed this way, the horizontal bar of the upside-down one would be farther from the observer than that of the right-side-up one, and if it produced the same-size retinal image, it would in fact be longer.
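The size-distance reasoning at the core of the depth-processing theory can be expressed as a one-line scaling rule. The sketch below is only an illustration of that reasoning, not a model proposed by Gregory: it assumes judged length is proportional to retinal length multiplied by judged distance, and the 20 percent distance difference is a hypothetical value.

```python
def judged_length(retinal_length, judged_distance):
    """Relative judged length: retinal length scaled by judged distance
    (arbitrary units)."""
    return retinal_length * judged_distance


retinal_length = 1.0  # both bars cast the same-size retinal image

# Hypothetical: depth cues lead one bar to be judged 20 percent farther away.
nearer_bar = judged_length(retinal_length, judged_distance=1.0)
farther_bar = judged_length(retinal_length, judged_distance=1.2)

print(f"nearer bar: {nearer_bar:.2f}, farther bar: {farther_bar:.2f}")
```

With identical retinal lengths, any difference in judged distance carries over directly into a difference in judged length, which is the pattern the theory proposes for the upper bar in the Ponzo figure and for the bar of the upside-down sawhorse.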
The Moon Illusion
The moon illusion has provoked debate since ancient Greek and Roman times. You have probably noticed that the moon looks huge when it is near the earth’s horizon, just above the trees or buildings in the distance, but looks much smaller when it is closer to the zenith (directly overhead). This difference is truly an illusion. Objectively, the moon is the same size, and the same distance from us, whether it is at the horizon or the zenith. If you view the horizon moon through a peephole so that you see it in isolation from other objects such as trees and buildings, the illusion disappears and the moon looks no larger than it does at the zenith.
A depth-processing account of this illusion was first proposed by the Greek astronomer Ptolemy in the second century, was revived by Helmholtz (1867/1962) in the nineteenth century, and has been supported in modern times through research conducted by Lloyd Kaufman and his colleagues (Kaufman et al., 2007; Kaufman & Rock, 1962). The account can be summarized as follows: Our visual system did not evolve to judge such huge distances as that from the earth to the moon, so we automatically assess its distance in relation to more familiar earthly objects. Most objects that we see near the earth’s horizon are farther away than objects that we see farther from the horizon (as noted earlier, in the discussion of pictorial cues for depth). For example, birds or clouds seen near the horizon are usually farther away than are those seen higher up, closer to the zenith. Thus, our perceptual system assumes that the moon is farther away at the horizon than at the zenith, even though in reality it is the same distance away from us in either position. As in the case of the Ponzo and Müller-Lyer illusions, when two objects produce the same-size retinal image and are judged to be different distances away, the one that is judged to be farther away is seen as larger than the other.
Even today, the main objection to this explanation of the moon illusion is that people do not consciously see the horizon moon as farther away than the zenith moon (Hershenson, 1989, 2003). When people see the large-appearing horizon moon and are asked whether it seems farther away or closer than usual, they usually say closer. Again, however, as with the Ponzo and Müller-Lyer illusions, Kaufman and Irving Rock (1989) contend that we must distinguish between unconscious and conscious assessments. From their perspective, the sequence of perceptual assessments about the horizon moon might be described as follows:
Not all perceptual psychologists agree with the depth-processing account of the moon illusion, or with that of the Ponzo or Müller-Lyer illusions, but that account seems to be supported by more evidence and logic than any other explanations that have been offered to date (Hershenson, 1989; Kaufman & Kaufman, 2000; Ross & Plug, 2002).
Although perceptual psychologists have made great strides since the days of Ptolemy, we are still a long way from a full account of the calculations that our brains make to infer the sizes, distances, and shapes of all the objects in our field of view.
We see three-dimensionally—that is, with depth—even though the retina is two-dimensional.
Depth-Perception Cues
Size Perception