20-2 Who was Skinner, and how is operant behavior reinforced and shaped?
law of effect Thorndike’s principle that behaviors followed by favorable consequences become more likely, and that behaviors followed by unfavorable consequences become less likely.
B. F. Skinner (1904–1990), a college English major and aspiring writer who later turned to psychology, became modern behaviorism’s most influential and controversial figure.
operant chamber in operant conditioning research, a chamber (also known as a Skinner box) containing a bar or key that an animal can manipulate to obtain a food or water reinforcer; attached devices record the animal’s rate of bar pressing or key pecking.
reinforcement in operant conditioning, any event that strengthens the behavior it follows.
For his pioneering studies, Skinner designed an operant chamber, popularly known as a Skinner box (FIGURE 20.2). The box has a bar (a lever) that an animal presses (or a key [a disc] the animal pecks) to release a reward of food or water. It also has a device that records these responses. This design creates a stage on which rats and other animals act out Skinner’s concept of reinforcement: any event that strengthens (increases the frequency of) a preceding response. What is reinforcing depends on the animal and the conditions. For people, it may be praise, attention, or a paycheck. For hungry and thirsty rats, food and water work well. Skinner’s experiments have done far more than teach us how to pull habits out of a rat. They have explored the precise conditions that foster efficient and enduring learning.
shaping an operant conditioning procedure in which reinforcers guide behavior toward closer and closer approximations of the desired behavior.
Imagine that you wanted to condition a hungry rat to press a bar. Like Skinner, you could tease out this action with shaping, gradually guiding the rat’s actions toward the desired behavior. First, you would watch how the animal naturally behaves, so that you could build on its existing behaviors. You might give the rat a bit of food each time it approaches the bar. Once the rat is approaching regularly, you would give the food only when it moves close to the bar, then closer still. Finally, you would require it to touch the bar to get food. With this method of successive approximations, you reward responses that are ever-closer to the final desired behavior, and you ignore all other responses.
Shaping can also help us understand what nonverbal organisms perceive. Can a dog distinguish red and green? Can a baby hear the difference between lower- and higher-pitched tones? If we can shape them to respond to one stimulus and not to another, we know they can perceive the difference.
Skinner noted that we continually reinforce and shape others’ everyday behaviors, though we may not mean to do so. Isaac’s whining annoys his father, for example, but look how he typically responds:
Isaac: Could you take me to the mall?
Father: (Continues reading paper.)
Isaac: Dad, I need to go to the mall.
Father: Uh, yeah, just a minute.
Isaac: DAAAAD! The mall!
Father: Show me some manners! Okay, where are my keys …
Isaac’s whining is reinforced, because he gets something desirable—a trip to the mall. And his father’s giving in is also reinforced, because it ends something aversive: Isaac’s whining.
Or consider a teacher who sticks gold stars on a wall chart beside the names of children scoring 100 percent on spelling tests. As everyone can then see, some children consistently do perfect work. The others, who may have worked harder than the academic all-stars, get no stars. A teacher applying the principles of shaping would instead reinforce all spellers for gradual improvement (successive approximations toward perfect spelling of words they find challenging).
20-3 How do positive and negative reinforcement differ, and what are the basic types of reinforcers?
positive reinforcement increasing behaviors by presenting positive reinforcers. A positive reinforcer is any stimulus that, when presented after a response, strengthens the response.
negative reinforcement increasing behaviors by stopping or reducing negative stimuli. A negative reinforcer is any stimulus that, when removed after a response, strengthens the response. (Note: Negative reinforcement is not punishment.)
Up to now, we’ve mainly been discussing positive reinforcement, which strengthens a response by presenting a typically pleasurable stimulus after a response. But, as we saw in the whining Isaac story, there are two basic kinds of reinforcement (TABLE 20.1). Negative reinforcement strengthens a response by reducing or removing something negative. Isaac’s whining was positively reinforced, because Isaac got something desirable—a trip to the mall. His father’s giving in was negatively reinforced, because it ended something aversive: Isaac’s whining. Note that negative reinforcement is not punishment; it removes an aversive event, and in doing so it strengthens the behavior that preceded it.
TABLE 20.1 Ways to Increase Behavior
Operant Conditioning Term | Description | Examples |
---|---|---|
Positive reinforcement | Add a desirable stimulus | Pet a dog that comes when you call it; pay the person who paints your house. |
Negative reinforcement | Remove an aversive stimulus | Take painkillers to end pain; fasten seat belt to end loud beeping. |
Sometimes negative and positive reinforcement coincide. Imagine a worried student who, after goofing off and getting a bad exam grade, studies harder for the next exam. This increased effort may be negatively reinforced by reduced anxiety, and positively reinforced by a better grade. We reap the rewards of escaping the aversive stimulus, which increases the chances that we will repeat our behavior. The point to remember: Whether it works by reducing something aversive, or by providing something desirable, reinforcement is any consequence that strengthens behavior.
primary reinforcer an innately reinforcing stimulus, such as one that satisfies a biological need.
conditioned reinforcer a stimulus that gains its reinforcing power through its association with a primary reinforcer; also known as a secondary reinforcer.
PRIMARY AND CONDITIONED REINFORCERS Getting food when hungry or having a painful headache go away is innately satisfying. These primary reinforcers are unlearned. Conditioned reinforcers, also called secondary reinforcers, get their power through learned association with primary reinforcers. If a rat in a Skinner box learns that a light reliably signals a food delivery, the rat will work to turn on the light. The light has become a conditioned reinforcer. Our lives are filled with conditioned reinforcers—money, good grades, a pleasant tone of voice. Each of these is linked, directly or indirectly, with more basic rewards.
IMMEDIATE AND DELAYED REINFORCERS Let’s return to the imaginary shaping experiment in which you were conditioning a rat to press a bar. Before performing this “wanted” behavior, the hungry rat will engage in a sequence of “unwanted” behaviors—scratching, sniffing, and moving around. If you present food immediately after any one of these behaviors, the rat will likely repeat that rewarded behavior. But what if the rat presses the bar while you are distracted and you delay the reinforcer? If the delay lasts longer than about 30 seconds, the rat will not learn to press the bar; by then it will have moved on to other behaviors, and those are what the food will reinforce.
Unlike rats, humans do respond to delayed reinforcers: the paycheck at the end of the week, the good grade at the end of the semester, the trophy at the end of the season. Indeed, to function effectively we must learn to delay gratification. In one of psychology’s most famous studies, some 4-year-olds showed this ability. Offered the choice of a small treat now or a bigger treat if they could wait a few minutes, some children managed to wait. In follow-up studies, those who had delayed gratification tended to show greater self-control and more success as they grew older.
To our detriment, small but immediate pleasures (the enjoyment of watching late-night TV, for example) are sometimes more alluring than big but delayed rewards (feeling rested for a big exam the next day). For many people, the immediate rewards of today’s choices prevail over the larger but delayed consequences of tomorrow.
20-4 How do different reinforcement schedules affect behavior?
reinforcement schedule a pattern that defines how often a desired response will be reinforced.
continuous reinforcement schedule reinforcing the desired response every time it occurs.
In most of our examples, the desired response has been reinforced every time it occurs. But reinforcement schedules vary. With continuous reinforcement, learning occurs rapidly, which makes it the best choice for mastering a behavior. But extinction also occurs rapidly. When reinforcement stops—when we stop delivering food after the rat presses the bar—the behavior soon stops (is extinguished). If a normally dependable candy machine fails to deliver a chocolate bar twice in a row, we stop putting money into it (though a week later we may exhibit spontaneous recovery by trying again).
partial (intermittent) reinforcement schedule reinforcing a response only part of the time; results in slower acquisition of a response but much greater resistance to extinction than does continuous reinforcement.
Real life rarely provides continuous reinforcement. Salespeople do not make a sale with every pitch. But they persist because their efforts are occasionally rewarded. This persistence is typical with partial (intermittent) reinforcement schedules, in which responses are sometimes reinforced, sometimes not. Learning is slower to appear, but resistance to extinction is greater than with continuous reinforcement. Imagine a pigeon that has learned to peck a key to obtain food. If you gradually phase out the food delivery until it occurs only rarely, in no predictable pattern, the pigeon may peck 150,000 times without a reward (Skinner, 1953). Slot machines reward gamblers in much the same way—occasionally and unpredictably. And like pigeons, slot players keep trying, time and time again.
Lesson for parents: Partial reinforcement also works with children. Occasionally giving in to children’s tantrums for the sake of peace and quiet intermittently reinforces the tantrums. This is the very best procedure for making a behavior persist.
Skinner (1961) and his collaborators compared four schedules of partial reinforcement. Some are rigidly fixed, some unpredictably variable.
fixed-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified number of responses.
Fixed-ratio schedules reinforce behavior after a set number of responses. Coffee shops may reward us with a free drink after every 10 purchased. In the laboratory, rats may be reinforced on a fixed ratio of, say, one food pellet for every 30 responses. Once conditioned, animals will pause only briefly after a reinforcer before returning to a high rate of responding (FIGURE 20.3).
variable-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response after an unpredictable number of responses.
Variable-ratio schedules provide reinforcers after a seemingly unpredictable number of responses. This unpredictable reinforcement is what slot-machine players and fly fishers experience, and it is what makes their behavior so hard to extinguish even when they are getting nothing for something. Because reinforcers increase as the number of responses increases, variable-ratio schedules produce high rates of responding.
“The charm of fishing is that it is the pursuit of what is elusive but attainable, a perpetual series of occasions for hope.”
Scottish author John Buchan
(1875–1940)
fixed-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified time has elapsed.
Fixed-interval schedules reinforce the first response after a fixed time period. Animals on this type of schedule tend to respond more frequently as the anticipated time for reward draws near. People check more frequently for the mail as delivery time approaches. A hungry child jiggles the Jell-O more and more often to see whether it has set. The result is a choppy stop-start pattern of responding rather than a steady rate (FIGURE 20.3).
variable-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response at unpredictable time intervals.
Variable-interval schedules reinforce the first response after varying time periods. Like the longed-for message that finally rewards persistent rechecking of email, variable-interval schedules tend to produce slow, steady responding. This makes sense, because there is no way of knowing when the wait will be over (TABLE 20.2).
TABLE 20.2 Schedules of Partial Reinforcement
 | Fixed | Variable |
---|---|---|
Ratio | Every so many: reinforcement after every nth behavior, such as buy 10 coffees, get 1 free, or pay workers per product unit produced | After an unpredictable number: reinforcement after a random number of behaviors, as when playing slot machines or fly fishing |
Interval | Every so often: reinforcement for behavior after a fixed time, such as Tuesday discount prices | Unpredictably often: reinforcement for behavior after a random amount of time, as when checking for a Facebook response |
In general, response rates are higher when reinforcement is linked to the number of responses (a ratio schedule) rather than to time (an interval schedule). But responding is more consistent when reinforcement is unpredictable (a variable schedule) than when it is predictable (a fixed schedule). Animal behaviors differ, yet Skinner (1956) contended that the reinforcement principles of operant conditioning are universal. It matters little, he said, what response, what reinforcer, or what species you use. The effect of a given reinforcement schedule is pretty much the same: “Pigeon, rat, monkey, which is which? It doesn’t matter… . Behavior shows astonishingly similar properties.”
20-5 How does punishment differ from negative reinforcement, and how does punishment affect behavior?
punishment an event that tends to decrease the behavior that it follows.
Reinforcement increases a behavior; punishment does the opposite. A punisher is any consequence that decreases the frequency of a preceding behavior (TABLE 20.3). Swift and sure punishers can powerfully restrain unwanted behavior. The rat that is shocked after touching a forbidden object and the child who is burned by touching a hot stove will learn not to repeat those behaviors.
TABLE 20.3 Ways to Decrease Behavior
Type of Punisher | Description | Examples |
---|---|---|
Positive punishment | Administer an aversive stimulus. | Spray water on a barking dog; give a traffic ticket for speeding. |
Negative punishment | Withdraw a rewarding stimulus. | Take away a misbehaving teen’s driving privileges; revoke a library card for nonpayment of fines. |
Criminal behavior, much of it impulsive, is also influenced more by swift and sure punishers than by the threat of severe sentences (Darley & Alter, 2011). Thus, when Arizona introduced an exceptionally harsh sentence for first-time drunk drivers, the drunk-driving rate changed very little.
How should we interpret the punishment studies in relation to parenting practices? Many psychologists and supporters of nonviolent parenting have noted four major drawbacks of physical punishment (Gershoff, 2002; Marshall, 2002):
1. Punished behavior is suppressed, not forgotten. This temporary state may (negatively) reinforce parents’ punishing behavior. The child swears, the parent swats, the parent hears no more swearing and feels the punishment successfully stopped the behavior. No wonder spanking has been a hit with so many parents—it seems to work, at least in the moment.
2. Punishment teaches discrimination among situations. In operant conditioning, discrimination occurs when an organism learns that certain responses, but not others, will be reinforced. Did the punishment effectively end the child’s swearing? Or did the child simply learn that while it’s not okay to swear around the house, it’s okay elsewhere?
3. Punishment can teach fear. In operant conditioning, generalization occurs when an organism’s response to similar stimuli is also reinforced. A punished child may associate fear not only with the undesirable behavior but also with the person who delivered the punishment or the place it occurred. Thus, children may learn to fear a punishing teacher and try to avoid school, or may become more anxious (Gershoff et al., 2010). For such reasons, most European countries and most U.S. states now ban hitting children in schools and child-care facilities.
4. Physical punishment may increase aggression by modeling violence as a way to cope with problems. Studies find that spanked children are at increased risk for aggression (MacKenzie et al., 2013). We know, for example, that many aggressive delinquents and abusive parents come from abusive families (Straus & Gelles, 1980; Straus et al., 1997).
Some researchers have noted a problem with this logic. Well, yes, they’ve said, physically punished children may be more aggressive, for the same reason that people who have undergone psychotherapy are more likely to suffer depression—because they had preexisting problems that prompted the treatment in the first place. So, does spanking cause misbehavior, or does misbehavior trigger spanking? Correlations don’t hand us an answer.
See LaunchPad's Video: Correlational Studies for a helpful tutorial animation.
If one adjusts for preexisting antisocial behavior, these researchers argue, an occasional single swat or two to a misbehaving 2- to 6-year-old appears no more harmful than other disciplinary tactics, especially when two conditions are met:
The swat is used only as a backup when milder disciplinary tactics fail. (Children’s compliance often increases after a reprimand and a brief “time-out.”)
The swat is combined with a generous dose of reasoning and reinforcing.
Other researchers remain unconvinced. After controlling for prior misbehavior, they report that more frequent spankings of young children predict future aggressiveness (Grogan-Kaylor, 2004).
Parents of delinquent youths are often unaware of how to achieve desirable behaviors without screaming, hitting, or threatening their children with punishment (Patterson et al., 1982). Training programs can help transform dire threats (“You clean up your room this minute or no dinner!”) into positive incentives (“You’re welcome at the dinner table after you get your room cleaned up”). Stop and think about it. Aren’t many threats of punishment just as forceful, and perhaps more effective, when rephrased positively? Thus, “If you don’t get your homework done, there’ll be no car” would better be phrased as … .
In classrooms, too, teachers can give feedback on papers by saying, “No, but try this …” and “Yes, that’s it!” Such responses reduce unwanted behavior while reinforcing more desirable alternatives. Remember: Punishment tells you what not to do; reinforcement tells you what to do. Thus, punishment trains a particular sort of morality—one focused on prohibition (what not to do) rather than on positive obligations.
What punishment often teaches, said Skinner, is how to avoid it. Most psychologists now favor an emphasis on reinforcement: Notice people doing something right and affirm them for it.
Fill in the blanks below with one of the following terms: positive reinforcement (PR), negative reinforcement (NR), positive punishment (PP), and negative punishment (NP). We have provided the first answer (PR) for you.
Type of Stimulus | Give It | Take It Away |
---|---|---|
Desired (for example, a teen’s use of the car): | 1. PR | 2. ______ |
Undesired/aversive (for example, an insult): | 3. ______ | 4. ______ |