Learning

7.2 Operant Conditioning

7-7 What is operant conditioning?

It’s one thing to classically condition a dog to salivate at the sound of a tone, or a child to fear moving cars. To teach an elephant to walk on its hind legs or a child to say please, we turn to operant conditioning.

operant conditioning a type of learning in which behavior is strengthened if followed by a reinforcer or diminished if followed by a punisher.

Classical conditioning and operant conditioning are both forms of associative learning, yet their differences are straightforward:

Classical conditioning forms associations between stimuli (a CS and the US it signals). It also involves respondent behavior, that is, automatic responses to a stimulus (such as salivating in response to meat powder, and later in response to a tone).
In operant conditioning, organisms associate their own actions with consequences. Actions followed by reinforcers increase; those followed by punishers often decrease. Behavior that operates on the environment to produce rewarding or punishing stimuli is called operant behavior.

RETRIEVE IT

Question

With classical conditioning, we learn associations between events we (do/do not) control. With operant conditioning, we learn associations between our behavior and (resulting/random) events.

Skinner’s Experiments

7-8 Who was Skinner, and how is operant behavior reinforced and shaped?

law of effect Thorndike’s principle that behaviors followed by favorable consequences become more likely, and that behaviors followed by unfavorable consequences become less likely.

B. F. Skinner (1904–1990) was a college English major and aspiring writer who, seeking a new direction, studied psychology in graduate school. He went on to become modern behaviorism’s most influential and controversial figure. Skinner’s work elaborated on what psychologist Edward L. Thorndike (1874–1949) called the law of effect: Rewarded behavior tends to recur (FIGURE 7.9). Using Thorndike’s law of effect as a starting point, Skinner developed a behavioral technology that revealed principles of behavior control. By shaping pigeons’ natural walking and pecking behaviors, for example, Skinner was able to teach pigeons such unpigeon-like behaviors as walking in a figure 8, playing Ping-Pong, and keeping a missile on course by pecking at a screen target.

FIGURE 7.9 Cat in a puzzle box Thorndike used a fish reward to entice cats to find their way out of a puzzle box (left) through a series of maneuvers. The cats’ performance tended to improve with successive trials (right), illustrating Thorndike’s law of effect. (Data from Thorndike, 1898.)

operant chamber in operant conditioning research, a chamber (also known as a Skinner box) containing a bar or key that an animal can manipulate to obtain a food or water reinforcer; attached devices record the animal’s rate of bar pressing or key pecking.

reinforcement in operant conditioning, any event that strengthens the behavior it follows.

For his pioneering studies, Skinner designed an operant chamber, popularly known as a Skinner box (FIGURE 7.10). The box has a bar (a lever) that an animal presses (or a key [a disc] the animal pecks) to release a reward of food or water. It also has a device that records these responses. This design creates a stage on which rats and other animals act out Skinner’s concept of reinforcement: any event that strengthens (increases the frequency of) a preceding response. What is reinforcing depends on the animal and the conditions. For people, it may be praise, attention, or a paycheck. For hungry and thirsty rats, food and water work well. Skinner’s experiments have done far more than teach us how to pull habits out of a rat. They have explored the precise conditions that foster efficient and enduring learning.

FIGURE 7.10 A Skinner box Inside the box, the rat presses a bar for a food reward. Outside, measuring devices (not shown here) record the animal’s accumulated responses.

Shaping Behavior

shaping an operant conditioning procedure in which reinforcers guide behavior toward closer and closer approximations of the desired behavior.

Imagine that you wanted to condition a hungry rat to press a bar. Like Skinner, you could tease out this action with shaping, gradually guiding the rat’s actions toward the desired behavior. First, you would watch how the animal naturally behaves, so that you could build on its existing behaviors. You might give the rat a bit of food each time it approaches the bar. Once the rat is approaching regularly, you would give the food only when it moves close to the bar, then closer still. Finally, you would require it to touch the bar to get food. With this method of successive approximations, you reward responses that are ever closer to the final desired behavior, and you ignore all other responses. By making rewards contingent on desired behaviors, researchers and animal trainers gradually shape complex behaviors.
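The logic of successive approximations can be sketched as a small simulation. This is a toy model, not a description of Skinner's procedure or data: the function name `shape`, the idea of summarizing the rat's behavior as a single "habitual distance from the bar," and every number used (starting distance, spread, learning rate, criteria) are assumptions made for illustration. A response is reinforced only when it meets the current criterion, reinforced responses pull the habit toward them, and all other responses are ignored.

```python
import random


def shape(criteria, start_distance=10.0, trials_per_stage=200,
          spread=1.5, learning_rate=0.2, seed=0):
    """Toy model of shaping; all numbers are illustrative assumptions.

    The animal's behavior is summarized by one number: its habitual
    distance from the bar. On each trial it ends up at a noisy distance
    around that habit. A response is reinforced only if it meets the
    current criterion; a reinforced response pulls the habit toward it,
    and every other response is ignored.
    """
    rng = random.Random(seed)
    habit = start_distance
    reinforcers = 0
    for criterion in criteria:              # one stage per approximation
        for _ in range(trials_per_stage):
            distance = max(0.0, rng.gauss(habit, spread))
            if distance <= criterion:       # close enough: deliver food
                habit += learning_rate * (distance - habit)
                reinforcers += 1
    return habit, reinforcers


# Gradually tightening criteria: approach, closer, closer still, touch.
shaped_habit, shaped_food = shape([8.0, 6.0, 4.0, 2.0, 0.5])

# Comparison: demand a bar touch from the first trial (same total trials).
unshaped_habit, unshaped_food = shape([0.5] * 5)

print(f"shaped:   habitual distance {shaped_habit:.2f}, {shaped_food} reinforcers")
print(f"unshaped: habitual distance {unshaped_habit:.2f}, {unshaped_food} reinforcers")
```

In this sketch, the all-or-nothing criterion is almost never met by an animal that starts far from the bar, so there is almost nothing to reinforce. The stepwise criteria give the trainer something to reward at every stage, which is the point of shaping.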

Reinforcers vary with circumstances

What is reinforcing (a heat lamp) to one animal (a cold meerkat) may not be to another (an overheated child). What is reinforcing in one situation (a cold snap at the Taronga Zoo in Sydney) may not be in another (a sweltering summer day).

Shaping can also help us understand what nonverbal organisms perceive. Can a dog distinguish red and green? Can a baby hear the difference between lower- and higher-pitched tones? If we can shape them to respond to one stimulus and not to another, then we know they can perceive the difference. Such experiments have even shown that some animals can form concepts. When experimenters reinforced pigeons for pecking after seeing a human face, but not after seeing other images, the pigeons’ behavior showed that they could recognize human faces (Herrnstein & Loveland, 1964). In this experiment, the human face was a discriminative stimulus. Like a green traffic light, discriminative stimuli signal that a response will be reinforced. After being trained to discriminate among classes of events or objects—flowers, people, cars, chairs—pigeons were usually able to identify the category in which a new pictured object belonged (Bhatt et al., 1988; Wasserman, 1993). They have even been trained to discriminate between the music of Bach and Stravinsky (Porter & Neuringer, 1984).

Skinner noted that we continually reinforce and shape others’ everyday behaviors, though we may not mean to do so. Isaac’s whining annoys his father, for example, but look how he typically responds:

Isaac: Could you take me to the mall?

Father: (Continues reading paper.)

Isaac: Dad, I need to go to the mall.

Father: Uh, yeah, just a minute.

Isaac: DAAAAD! The mall!

Father: Show me some manners! Okay, where are my keys …

Isaac’s whining is reinforced, because he gets something desirable—his dad’s attention. Dad’s response is reinforced because it gets rid of something aversive—Isaac’s whining.

Or consider a teacher who sticks gold stars on a wall chart beside the names of children scoring 100 percent on spelling tests. As everyone can then see, some children consistently do perfect work. The others, who may have worked harder than the academic all-stars, get no rewards. The teacher would be better advised to apply the principles of operant conditioning—to reinforce all spellers for gradual improvements (successive approximations toward perfect spelling of words they find challenging).

Types of Reinforcers

7-9 How do positive and negative reinforcement differ, and what are the basic types of reinforcers?

positive reinforcement increasing behaviors by presenting positive reinforcers. A positive reinforcer is any stimulus that, when presented after a response, strengthens the response.

negative reinforcement increasing behaviors by stopping or reducing negative stimuli. A negative reinforcer is any stimulus that, when removed after a response, strengthens the response. (Note: Negative reinforcement is not punishment.)

Up to now, we’ve mainly been discussing positive reinforcement, which strengthens a response by presenting a typically pleasurable stimulus after a response. But, as we saw in the whining Isaac story, there are two basic kinds of reinforcement (TABLE 7.1). Negative reinforcement strengthens a response by reducing or removing something negative. Isaac’s whining was positively reinforced, because Isaac got something desirable—his father’s attention. His dad’s response to the whining (doing what Isaac wanted) was negatively reinforced, because it ended an aversive event—Isaac’s whining. Similarly, taking aspirin may relieve your headache, and hitting snooze will silence your annoying alarm. These welcome results provide negative reinforcement and increase the odds that you will repeat these behaviors. For those with drug addiction, the negative reinforcement of ending withdrawal pangs can be a compelling reason to resume using (Baker et al., 2004). Note that negative reinforcement is not punishment. (Some friendly advice: Repeat the italicized words in your mind.) Rather, negative reinforcement—psychology’s most misunderstood concept—removes a punishing (aversive) event. Think of negative reinforcement as something that provides relief—from that whining teen, bad headache, or annoying alarm.

TABLE 7.1
Ways to Increase Behavior

Operant Conditioning Term	Description	Examples
Positive reinforcement	Add a desirable stimulus	Pet a dog that comes when you call it; pay the person who paints your house.
Negative reinforcement	Remove an aversive stimulus	Take painkillers to end pain; fasten seat belt to end loud beeping.

Sometimes negative and positive reinforcement coincide. Imagine a worried student who, after goofing off and getting a bad exam grade, studies harder for the next exam. This increased effort may be negatively reinforced by reduced anxiety, and positively reinforced by a better grade. We reap the rewards of escaping the aversive stimulus, which increases the chances that we will repeat our behavior. The point to remember: Whether it works by reducing something aversive, or by providing something desirable, reinforcement is any consequence that strengthens behavior.

RETRIEVE IT

Question

How is operant conditioning at work in this cartoon?

ANSWER: The baby negatively reinforces her parents' behavior when she stops crying once they grant her wish. Her parents positively reinforce her cries by letting her sleep with them.

primary reinforcer an innately reinforcing stimulus, such as one that satisfies a biological need.

conditioned reinforcer a stimulus that gains its reinforcing power through its association with a primary reinforcer; also known as a secondary reinforcer.

PRIMARY AND CONDITIONED REINFORCERS Getting food when hungry or having a painful headache go away is innately satisfying. These primary reinforcers are unlearned. Conditioned reinforcers, also called secondary reinforcers, get their power through learned association with primary reinforcers. If a rat in a Skinner box learns that a light reliably signals a food delivery, the rat will work to turn on the light. The light has become a conditioned reinforcer. Our lives are filled with conditioned reinforcers—money, good grades, a pleasant tone of voice—each of which has been linked with more basic rewards.

IMMEDIATE AND DELAYED REINFORCERS Let’s return to the imaginary shaping experiment in which you were conditioning a rat to press a bar. Before performing this “wanted” behavior, the hungry rat will engage in a sequence of “unwanted” behaviors—scratching, sniffing, and moving around. If you present food immediately after any one of these behaviors, the rat will likely repeat that rewarded behavior. But what if the rat presses the bar while you are distracted, and you delay giving the reinforcer? If the delay lasts longer than about 30 seconds, the rat will not learn to press the bar. It will have moved on to other incidental behaviors, such as scratching, sniffing, and moving, and one of these behaviors will instead get reinforced.

Unlike rats, humans do respond to delayed reinforcers: the paycheck at the end of the week, the good grade at the end of the semester, the trophy at the end of the season. Indeed, to function effectively we must learn to delay gratification. In one of psychology’s most famous studies, some 4-year-olds showed this ability. In choosing a candy or marshmallow, they preferred having a big one tomorrow to munching on a small one right away. Learning to control our impulses in order to achieve more valued rewards is a big step toward maturity (Logue, 1998a, b). No wonder children who delay gratification have tended to become socially competent and high-achieving adults (Mischel, 2014).

To our detriment, small but immediate pleasures (the enjoyment of watching late-night TV, for example) are sometimes more alluring than big but delayed rewards (resting, then feeling alert tomorrow). For many teens, the immediate gratification of risky, unprotected sex in passionate moments prevails over the delayed gratifications of safe sex or saved sex. And for many people, the immediate rewards of today’s gas-guzzling vehicles, air travel, and air conditioning prevail over the bigger future consequences of global climate change, rising seas, and extreme weather.

Reinforcement Schedules

7-10 How do different reinforcement schedules affect behavior?

reinforcement schedule a pattern that defines how often a desired response will be reinforced.

continuous reinforcement schedule reinforcing the desired response every time it occurs.

In most of our examples, the desired response has been reinforced every time it occurs. But reinforcement schedules vary. With continuous reinforcement, learning occurs rapidly, which makes it the best choice for mastering a behavior. But extinction also occurs rapidly. When reinforcement stops—when we stop delivering food after the rat presses the bar—the behavior soon stops. It extinguishes. If a normally dependable candy machine fails to deliver a chocolate bar twice in a row, we stop putting money into it (although a week later we may exhibit spontaneous recovery by trying again).

partial (intermittent) reinforcement schedule reinforcing a response only part of the time; results in slower acquisition of a response but much greater resistance to extinction than does continuous reinforcement.

Real life rarely provides continuous reinforcement. Salespeople do not make a sale with every pitch. But they persist because their efforts are occasionally rewarded. This persistence is typical with partial (intermittent) reinforcement schedules, in which responses are sometimes reinforced, sometimes not. Learning is slower to appear, but resistance to extinction is greater than with continuous reinforcement. Imagine a pigeon that has learned to peck a key to obtain food. If you gradually phase out the food delivery until it occurs only rarely, in no predictable pattern, the pigeon may peck 150,000 times without a reward (Skinner, 1953). Slot machines reward gamblers in much the same way—occasionally and unpredictably. And like pigeons, slot players keep trying, time and time again. With intermittent reinforcement, hope springs eternal.

Lesson for parents: Partial reinforcement also works with children. Occasionally giving in to children’s tantrums for the sake of peace and quiet intermittently reinforces the tantrums. This is the very best procedure for making a behavior persist.

Skinner (1961) and his collaborators compared four schedules of partial reinforcement. Some are rigidly fixed, some unpredictably variable.

fixed-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified number of responses.

Fixed-ratio schedules reinforce behavior after a set number of responses. Coffee shops may reward us with a free drink after every 10 purchased. Once conditioned, rats may be reinforced on a fixed ratio of, say, one food pellet for every 30 responses. Once conditioned, animals will pause only briefly after a reinforcer before returning to a high rate of responding (FIGURE 7.11).

variable-ratio schedule in operant conditioning, a reinforcement schedule that reinforces a response after an unpredictable number of responses.

Variable-ratio schedules provide reinforcers after a seemingly unpredictable number of responses. This unpredictable reinforcement is what slot-machine players and fly fishers experience, and it’s what makes gambling and fly fishing so hard to extinguish even when they don’t produce the desired results. Because reinforcers increase as the number of responses increases, variable-ratio schedules produce high rates of responding.

“The charm of fishing is that it is the pursuit of what is elusive but attainable, a perpetual series of occasions for hope.”

Scottish author John Buchan
(1875–1940)

fixed-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response only after a specified time has elapsed.

Fixed-interval schedules reinforce the first response after a fixed time period. Animals on this type of schedule tend to respond more frequently as the anticipated time for reward draws near. People check more frequently for the mail as delivery time approaches. A hungry child jiggles the Jell-O more often to see if it has set. Pigeons peck keys more rapidly as the time for reinforcement draws nearer (see FIGURE 7.11).

FIGURE 7.11 Intermittent reinforcement schedules Skinner’s (1961) laboratory pigeons produced these response patterns to each of four reinforcement schedules. (Reinforcers are indicated by diagonal marks.) For people, as for pigeons, reinforcement linked to number of responses (a ratio schedule) produces a higher response rate than reinforcement linked to amount of time elapsed (an interval schedule). But the predictability of the reward also matters. An unpredictable (variable) schedule produces more consistent responding than does a predictable (fixed) schedule.

variable-interval schedule in operant conditioning, a reinforcement schedule that reinforces a response at unpredictable time intervals.

Variable-interval schedules reinforce the first response after varying time periods. Like the longed-for message that finally rewards persistence in rechecking e-mail or Facebook, variable-interval schedules tend to produce slow, steady responding. This makes sense, because there is no knowing when the waiting will be over (TABLE 7.2).

TABLE 7.2
Schedules of Partial Reinforcement

	Fixed	Variable
Ratio	Every so many: reinforcement after every nth behavior, such as buy 10 coffees, get 1 free, or pay workers per product unit produced	After an unpredictable number: reinforcement after a random number of behaviors, as when playing slot machines or fly fishing
Interval	Every so often: reinforcement for behavior after a fixed time, such as Tuesday discount prices	Unpredictably often: reinforcement for behavior after a random amount of time, as when checking for a Facebook response
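The four schedules can also be written as simple rules that decide whether a given response pays off. The sketch below is illustrative only: the class names, the `respond(now)` method, the helper `reinforcers_earned`, and the way "unpredictable" is modeled (a uniformly random requirement around an average) are assumptions, not Skinner's apparatus. The value 30 echoes the text's example of one pellet for every 30 responses.

```python
import random


class FixedRatio:
    """Reinforce every n-th response."""
    def __init__(self, n):
        self.n, self.count = n, 0

    def respond(self, now):
        self.count += 1
        if self.count >= self.n:
            self.count = 0
            return True
        return False


class VariableRatio:
    """Reinforce after an unpredictable number of responses (average n)."""
    def __init__(self, n, rng):
        self.n, self.rng, self.count = n, rng, 0
        self.needed = rng.randint(1, 2 * n - 1)

    def respond(self, now):
        self.count += 1
        if self.count >= self.needed:
            self.count = 0
            self.needed = self.rng.randint(1, 2 * self.n - 1)
            return True
        return False


class FixedInterval:
    """Reinforce the first response after a fixed time has elapsed."""
    def __init__(self, seconds):
        self.seconds, self.ready_at = seconds, seconds

    def respond(self, now):
        if now >= self.ready_at:
            self.ready_at = now + self.seconds
            return True
        return False


class VariableInterval:
    """Reinforce the first response after an unpredictable time."""
    def __init__(self, seconds, rng):
        self.seconds, self.rng = seconds, rng
        self.ready_at = rng.uniform(0, 2 * seconds)

    def respond(self, now):
        if now >= self.ready_at:
            self.ready_at = now + self.rng.uniform(0, 2 * self.seconds)
            return True
        return False


def reinforcers_earned(schedule, duration=300, responses_per_second=1):
    """Respond at a steady pace for `duration` seconds; count the payoffs."""
    total_responses = duration * responses_per_second
    return sum(schedule.respond(i / responses_per_second)
               for i in range(1, total_responses + 1))


for name, make in [
    ("fixed-ratio", lambda: FixedRatio(30)),
    ("variable-ratio", lambda: VariableRatio(30, random.Random(0))),
    ("fixed-interval", lambda: FixedInterval(30)),
    ("variable-interval", lambda: VariableInterval(30, random.Random(0))),
]:
    slow = reinforcers_earned(make(), responses_per_second=1)
    fast = reinforcers_earned(make(), responses_per_second=4)
    print(f"{name:18s} slow responder: {slow:3d}   fast responder: {fast:3d}")
```

Under these assumptions, a faster responder earns more reinforcers on the ratio schedules, whereas on the fixed-interval schedule responding four times as fast earns nothing extra. That asymmetry is one way to see why ratio schedules tend to produce higher response rates than interval schedules.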

In general, response rates are higher when reinforcement is linked to the number of responses (a ratio schedule) rather than to time (an interval schedule). But responding is more consistent when reinforcement is unpredictable (a variable schedule) than when it is predictable (a fixed schedule). Animal behaviors differ, yet Skinner (1956) contended that the reinforcement principles of operant conditioning are universal. It matters little, he said, what response, what reinforcer, or what species you use. The effect of a given reinforcement schedule is pretty much the same: “Pigeon, rat, monkey, which is which? It doesn’t matter. … Behavior shows astonishingly similar properties.”

RETRIEVE IT

Question

People who send spam are reinforced by which schedule? Home bakers checking the oven to see if the cookies are done are on which schedule? Airline frequent-flyer programs that offer a free flight after a certain number of miles of travel are using which reinforcement schedule?

ANSWERS: Spammers are reinforced on a variable-ratio schedule (after a varying number of messages). Cookie checkers are reinforced on a fixed-interval schedule. Frequent-flyer programs use a fixed-ratio schedule.

Punishment

7-11 How does punishment differ from negative reinforcement, and how does punishment affect behavior?

punishment an event that tends to decrease the behavior that it follows.

Reinforcement increases a behavior; punishment does the opposite. A punisher is any consequence that decreases the frequency of a preceding behavior (TABLE 7.3). Swift and sure punishers can powerfully restrain unwanted behavior. The rat that is shocked after touching a forbidden object and the child who is burned by touching a hot stove will learn not to repeat those behaviors.

TABLE 7.3
Ways to Decrease Behavior

Type of Punisher	Description	Examples
Positive punishment	Administer an aversive stimulus.	Spray water on a barking dog; give a traffic ticket for speeding.
Negative punishment	Withdraw a rewarding stimulus.	Take away a misbehaving teen’s driving privileges; revoke a library card for nonpayment of fines.
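Tables 7.1 and 7.3 together form a two-by-two grid: a stimulus is either desirable or aversive, and it is either added or removed. As a study aid, that grid can be written as a lookup. The function names and string labels below are hypothetical, chosen only for this sketch; the examples in the comments come from the two tables.

```python
# Two questions identify the term:
#   1. Is the stimulus desirable or aversive?
#   2. Is it added ("positive") or removed ("negative")?
CONSEQUENCES = {
    ("desirable", "add"): "positive reinforcement",     # pet a dog that comes when called
    ("aversive", "remove"): "negative reinforcement",   # fasten seat belt to end beeping
    ("aversive", "add"): "positive punishment",         # traffic ticket for speeding
    ("desirable", "remove"): "negative punishment",     # take away driving privileges
}


def classify_consequence(stimulus, action):
    """Return the operant conditioning term for a consequence.

    stimulus: "desirable" or "aversive"
    action:   "add" or "remove"
    """
    return CONSEQUENCES[(stimulus, action)]


def effect_on_behavior(term):
    """Reinforcement increases the preceding behavior; punishment decreases it."""
    return "increases" if term.endswith("reinforcement") else "decreases"


for (stimulus, action), term in CONSEQUENCES.items():
    print(f"{action:6s} a(n) {stimulus:9s} stimulus -> {term} "
          f"({effect_on_behavior(term)} behavior)")
```

Note that "positive" and "negative" here describe only whether something is added or taken away, not whether the outcome is good or bad, which is why negative reinforcement still increases behavior.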

Criminal behavior, much of it impulsive, is also influenced more by swift and sure punishers than by the threat of severe sentences (Darley & Alter, 2011). Thus, when Arizona introduced an exceptionally harsh sentence for first-time drunk drivers, the drunk-driving rate changed very little. But when Kansas City police started patrolling a high-crime area to increase the swiftness and sureness of punishment, that city’s crime rate dropped dramatically.

How should we interpret the punishment studies in relation to parenting practices? Many psychologists and supporters of nonviolent parenting have noted four major drawbacks of physical punishment (Gershoff, 2002; Marshall, 2002):

Punished behavior is suppressed, not forgotten. This temporary state may (negatively) reinforce parents’ punishing behavior. The child swears, the parent swats, the parent hears no more swearing and feels the punishment successfully stopped the behavior. No wonder spanking has been a hit with so many parents—with 70 percent of American adults agreeing that sometimes children need a “good, hard spanking” (Child Trends, 2013).
Punishment teaches discrimination among situations. In operant conditioning, discrimination occurs when an organism learns that certain responses, but not others, will be reinforced. Did the punishment effectively end the child’s swearing? Or did the child simply learn that while it’s not okay to swear around the house, it’s okay elsewhere?
Punishment can teach fear. In operant conditioning, generalization occurs when an organism’s response to similar stimuli is also reinforced. A punished child may associate fear not only with the undesirable behavior but also with the person who delivered the punishment or the place it occurred. Thus, children may learn to fear a punishing teacher and try to avoid school, or may become more anxious (Gershoff et al., 2010). For such reasons, most European countries and most U.S. states now ban hitting children in schools and child-care institutions (www.endcorporalpunishment.org). As of 2015, 47 countries further outlaw hitting by parents, providing children the same legal protection given to adults.
Physical punishment may increase aggression by modeling violence as a way to cope with problems. Studies find that spanked children are at increased risk for aggression (MacKenzie et al., 2013). We know, for example, that many aggressive delinquents and abusive parents come from abusive families (Straus & Gelles, 1980; Straus et al., 1997).

Some researchers have noted a problem with this logic. Well, yes, they’ve said, physically punished children may be more aggressive, for the same reason that people who have undergone psychotherapy are more likely to suffer depression—because they had preexisting problems that triggered the treatments (Ferguson, 2013; Larzelere, 2000, 2004). Which is the chicken and which is the egg? Correlations don’t hand us an answer.

See LaunchPad's Video: Correlational Studies for a helpful tutorial animation.

If one adjusts for preexisting antisocial behavior, then an occasional single swat or two to misbehaving 2- to 6-year-olds looks more effective (Baumrind et al., 2002; Larzelere & Kuhn, 2005). That is especially so if two other conditions are met:

The swat is used only as a backup when milder disciplinary tactics fail. (Children’s compliance often increases after a reprimand and a “time-out” punishment [Owen et al., 2012].)
The swat is combined with a generous dose of reasoning and reinforcing.

Other researchers remain unconvinced. After controlling for prior misbehavior, they report that more frequent spankings of young children predict future aggressiveness (Grogan-Kaylor, 2004; Taylor et al., 2010).

Parents of delinquent youths are often unaware of how to achieve desirable behaviors without screaming, hitting, or threatening their children with punishment (Patterson et al., 1982). Training programs can help transform dire threats (“You clean up your room this minute or no dinner!”) into positive incentives (“You’re welcome at the dinner table after you get your room cleaned up”). Stop and think about it. Aren’t many threats of punishment just as forceful, and perhaps more effective, when rephrased positively? Thus, “If you don’t get your homework done, there’ll be no car” would better be phrased as … .

In classrooms, too, teachers can give feedback on papers by saying, “No, but try this …” and “Yes, that’s it!” Such responses reduce unwanted behavior while reinforcing more desirable alternatives. Remember: Punishment tells you what not to do; reinforcement tells you what to do. Thus, punishment trains a particular sort of morality, one focused on prohibition (what not to do) rather than positive obligations (Sheikh & Janoff-Bulman, 2013).

What punishment often teaches, said Skinner, is how to avoid it. Most psychologists now favor an emphasis on reinforcement: Notice people doing something right and affirm them for it.

RETRIEVE IT

Fill in the blanks below with one of the following terms: positive reinforcement (PR), negative reinforcement (NR), positive punishment (PP), and negative punishment (NP). We have provided the first answer (PR) for you.

Type of Stimulus	Give It	Take It Away
Desired (for example, a teen’s use of the car):	1. PR	2. ____
Undesired/aversive (for example, an insult):	3. ____	4. ____

Skinner’s Legacy

7-12 Why did Skinner’s ideas provoke controversy, and how might his operant conditioning principles be applied at school, in sports, at work, and at home?

B. F. Skinner stirred a hornet’s nest with his outspoken beliefs. He repeatedly insisted that external influences, not internal thoughts and feelings, shape behavior. And he urged people to use operant principles to influence others’ behavior at school, work, and home. Knowing that behavior is shaped by its results, he argued that we should use rewards to evoke more desirable behavior.

B. F. Skinner “I am sometimes asked, ‘Do you think of yourself as you think of the organisms you study?’ The answer is yes. So far as I know, my behavior at any given moment has been nothing more than the product of my genetic endowment, my personal history, and the current setting” (1983).

Skinner’s critics objected, saying that he dehumanized people by neglecting their personal freedom and by seeking to control their actions. Skinner’s reply: External consequences already haphazardly control people’s behavior. Why not administer those consequences toward human betterment? Wouldn’t reinforcers be more humane than the punishments used in homes, schools, and prisons? And if it is humbling to think that our history has shaped us, doesn’t this very idea also give us hope that we can shape our future?

To review and experience simulations of operant conditioning, visit LaunchPad’s PsychSim 6: Operant Conditioning and also PsychSim 6: Shaping.

Applications of Operant Conditioning

In later chapters, we will see how psychologists apply operant conditioning principles to help people reduce high blood pressure or gain social skills. Reinforcement technologies have also been used in schools, sports, workplaces, and homes, and these principles can support our self-improvement as well (Flora, 2004).

Computer-assisted learning Electronic technologies have helped realize Skinner’s goal of individually paced instruction with immediate feedback.

AT SCHOOL More than 50 years ago, Skinner and others worked toward a day when “machines and textbooks” would shape learning in small steps, by immediately reinforcing correct responses. Such machines and texts, they said, would revolutionize education and free teachers to focus on each student’s special needs. “Good instruction demands two things,” said Skinner (1989). “Students must be told immediately whether what they do is right or wrong and, when right, they must be directed to the step to be taken next.”

Skinner might be pleased to know that many of his ideals for education are now possible. Teachers used to find it difficult to pace material to each student’s rate of learning, and to provide prompt feedback. Online adaptive quizzing, such as the LearningCurve system available with this text, does both. Students move through quizzes at their own pace, according to their own level of understanding. And they get immediate feedback on their efforts, including personalized study plans.

IN SPORTS The key to shaping behavior in athletic performance, as elsewhere, is first reinforcing small successes and then gradually increasing the challenge. Golf students can learn putting by starting with very short putts, and eventually, as they build mastery, stepping back farther and farther. Novice batters can begin with half swings at an oversized ball pitched from 10 feet away, giving them the immediate pleasure of smacking the ball. As the hitters’ confidence builds with their success and they achieve mastery at each level, the pitcher gradually moves back and eventually introduces a standard baseball and pitching distance. Compared with children taught by conventional methods, those trained by this behavioral method have shown faster skill improvement (Simek & O’Brien, 1981, 1988).

AT WORK Knowing that reinforcers influence productivity, many organizations have invited employees to share the risks and rewards of company ownership. Others have focused on reinforcing a job well done. Rewards are most likely to increase productivity if the desired performance is well-defined and achievable. The message for managers? Reward specific, achievable behaviors, not vaguely defined “merit.”

Operant conditioning also reminds us that reinforcement should be immediate. IBM legend Thomas Watson understood. When he observed an achievement, he wrote the employee a check on the spot (Peters & Waterman, 1982). But rewards need not be material, or lavish. An effective manager may simply walk the floor and sincerely affirm people for good work, or write notes of appreciation for a completed project. As Skinner said, “How much richer would the whole world be if the reinforcers in daily life were more effectively contingent on productive work?”

AT HOME Parent-training researchers have pointed out how much parents can learn from operant conditioning practices. By saying, “Get ready for bed” and then caving in to protests or defiance, parents reinforce such whining and arguing (Wierson & Forehand, 1994). Exasperated, they may then yell or gesture menacingly. When the child, now frightened, obeys, that reinforces the parents’ angry behavior. Over time, a destructive parent-child relationship develops.

To disrupt this cycle, parents should remember that basic rule of shaping: Notice people doing something right and affirm them for it. Give children attention and other reinforcers when they are behaving well. Target a specific behavior, reward it, and watch it increase. When children misbehave or are defiant, don’t yell at them or hit them. Simply explain the misbehavior and give them a time-out.

Finally, we can use operant conditioning in our own lives. To reinforce your own desired behaviors (perhaps to improve your study habits) and extinguish the undesired ones (to stop smoking, for example), psychologists suggest taking these steps:

State a realistic goal in measurable terms. You might, for example, aim to boost your study time by an hour a day.
Decide how, when, and where you will work toward your goal. Take time to plan. Those who specify how they will implement goals more often fulfill them (Gollwitzer & Oettingen, 2012).
Monitor how often you engage in your desired behavior. You might log your current study time, noting under what conditions you do and don’t study.
Reinforce the desired behavior. To increase your study time, give yourself a reward (a snack or some activity you enjoy) only after you finish your extra hour of study. Agree with your friends that you will join them for weekend activities only if you have met your realistic weekly studying goal.
Reduce the rewards gradually. As your new behaviors become more habitual, give yourself a mental pat on the back instead of a cookie.

IMMERSIVE LEARNING Conditioning principles may also be applied in clinical settings. Explore some of these applications in LaunchPad’s How Would You Know If People Can Learn to Reduce Anxiety?

RETRIEVE IT

Question

Ethan constantly misbehaves at preschool even though his teacher scolds him repeatedly. Why does Ethan's misbehavior continue, and what can his teacher do to stop it?

ANSWER: If Ethan is seeking attention, the teacher's scolding may be reinforcing rather than punishing. To change Ethan's behavior, his teacher could offer reinforcement (such as praise) each time he behaves well. The teacher might encourage Ethan toward increasingly appropriate behavior through shaping, or by rephrasing rules as rewards instead of punishments. (“You can have a snack if you play nicely with the other children” [reward] rather than “You will not get a snack if you misbehave!” [punishment].)

Contrasting Classical and Operant Conditioning

7-13 How does operant conditioning differ from classical conditioning?

Both classical and operant conditioning are forms of associative learning. Both involve acquisition, extinction, spontaneous recovery, generalization, and discrimination. But these two forms of learning also differ. Through classical (Pavlovian) conditioning, we associate different stimuli we do not control, and we respond automatically (respondent behaviors) (TABLE 7.4). Through operant conditioning, we associate our own behaviors—which act on our environment to produce rewarding or punishing stimuli (operant behaviors)—with their consequences.

TABLE 7.4
Comparison of Classical and Operant Conditioning

	Classical Conditioning	Operant Conditioning
Basic idea	Organism associates events	Organism associates behavior and resulting events
Response	Involuntary, automatic	Voluntary, operates on environment
Acquisition	Associating events; NS is paired with US and becomes CS	Associating response with a consequence (reinforcer or punisher)
Extinction	CR decreases when CS is repeatedly presented alone	Responding decreases when reinforcement stops
Spontaneous recovery	The reappearance, after a rest period, of an extinguished CR	The reappearance, after a rest period, of an extinguished response
Generalization	The tendency to respond to stimuli similar to the CS	Organism’s response to similar stimuli is also reinforced
Discrimination	The learned ability to distinguish between a CS and other stimuli that do not signal a US	Organism learns that certain responses, but not others, will be reinforced

“O! This learning, what a thing it is.”

William Shakespeare,
The Taming of the Shrew, 1597

As we shall next see, our biology and cognitive processes influence both classical and operant conditioning.

RETRIEVE IT

Question

Salivating in response to a tone paired with food is a(n) ________ behavior; pressing a bar to obtain food is a(n) ________ behavior.
