4.1 Clinical Assessment: How and Why Does the Client Behave Abnormally?

Assessment is simply the collecting of relevant information in an effort to reach a conclusion. It goes on in every realm of life. We make assessments when we decide what cereal to buy or which presidential candidate to vote for. College admissions officers, who have to select the “best” of the students applying to their college, depend on academic records, recommendations, achievement test scores, interviews, and application forms to help them decide. Employers, who have to predict which applicants are most likely to be effective workers, collect information from résumés, interviews, references, and perhaps on-the-job observations.

assessment The process of collecting and interpreting relevant information about a client or research participant.

Clinical assessment is used to determine whether, how, and why a person is behaving abnormally and how that person may be helped. It also enables clinicians to evaluate people’s progress after they have been in treatment for a while and decide whether the treatment should be changed. The specific tools that are used to do an assessment depend on the clinician’s theoretical orientation. Psychodynamic clinicians, for example, use methods that assess a client’s personality and probe for unconscious conflicts he or she may be experiencing. This kind of assessment, called a personality assessment, enables them to piece together a clinical picture in accordance with the principles of their model (De Saeger et al., 2014; Tackett et al., 2013). Behavioral and cognitive clinicians are more likely to use assessment methods that reveal specific dysfunctional behaviors and cognitions. The goal of this kind of assessment, called a behavioral assessment, is to produce a functional analysis of the person’s behaviors—an analysis of how the behaviors are learned and reinforced (Siu & Zhou, 2014; O’Brien & Carhart, 2011).

The hundreds of clinical assessment techniques and tools that have been developed fall into three categories: clinical interviews, tests, and observations. To be useful, these tools must be standardized and must have clear reliability and validity.

Characteristics of Assessment Tools

All clinicians must follow the same procedures when they use a particular type of assessment tool. To standardize such a tool is to set up common steps to be followed whenever it is administered. Similarly, clinicians must standardize the way they interpret the results of an assessment tool in order to be able to understand what a particular score means. They may standardize the scores of a test, for example, by first administering it to a group of research participants whose performance will then serve as a common standard, or norm, against which later individual scores can be measured. The group that initially takes the test must be typical of the larger population for whom the test is intended. If an aggressiveness test meant for the public at large were standardized on a group of Marines, for example, the resulting “norm” might turn out to be misleadingly high (Hogan, 2014).

standardization The process in which a test is administered to a large group of people whose performance then serves as a standard or norm against which any individual’s score can be measured.
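As a minimal sketch of how a norm lets a clinician interpret one person's score, the example below assumes the norm is summarized by the standardization group's mean and standard deviation. The scores, the group names, and the helper `standard_score` are invented for illustration and do not come from any published test.

```python
from statistics import mean, stdev

def standard_score(raw_score, norm_scores):
    """Return how many standard deviations raw_score lies from the norm group's mean."""
    norm_mean = mean(norm_scores)
    norm_sd = stdev(norm_scores)  # sample standard deviation of the norm group
    return (raw_score - norm_mean) / norm_sd

# Hypothetical aggressiveness scores from two possible standardization groups.
general_public = [40, 45, 50, 55, 60]
marines = [60, 65, 70, 75, 80]

client_score = 60
z_public = standard_score(client_score, general_public)   # positive: above this norm
z_marines = standard_score(client_score, marines)         # negative: below this norm
```

The same raw score looks above average against the general-public norm and below average against the Marine norm, which is the kind of distortion an unrepresentative standardization group can produce.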

Reliability refers to the consistency of assessment measures. A good assessment tool will always yield similar results in the same situation (Dehn, 2013; Wang et al., 2012). An assessment tool has high test-retest reliability, one kind of reliability, if it yields similar results every time it is given to the same people. If a woman’s responses on a particular test indicate that she is generally a heavy drinker, the test should produce a similar result when she takes it again a week later. To measure test–retest reliability, participants are tested on two occasions and the two scores are correlated (Holden & Bernstein, 2013). The higher the correlation (see Chapter 2), the greater the test’s reliability.

reliability A measure of the consistency of test or research results.
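The correlation step can be sketched as follows, assuming the Pearson correlation coefficient is the statistic used. The two lists of scores are hypothetical, and `pearson_r` is an illustrative helper rather than part of any testing package.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sums of cross-products and squared deviations around each mean.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

# Hypothetical drinking-severity scores for five people, tested one week apart.
first_testing = [12, 30, 25, 8, 19]
second_testing = [14, 29, 24, 9, 20]

test_retest_r = pearson_r(first_testing, second_testing)
```

Because each person's second score stays close to the first, the coefficient here comes out close to 1, which would indicate high test–retest reliability.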

Reliable assessments Former National Basketball Association stars Clyde Drexler, James Worthy, Brent Barry, Dominique Wilkins, and Julius Erving served as judges at the 2011 All-Star slam dunk contest. Holding up their scores after each dunk, they displayed high interrater reliability and showed they still know a great dunk when they see one.

An assessment tool shows high interrater (or interjudge) reliability, another kind of reliability, if different judges independently agree on how to score and interpret it. True–false and multiple-choice tests yield consistent scores no matter who evaluates them, but other tests require that the evaluator make a judgment. Consider a test that requires the person to draw a copy of a picture, which a judge then rates for accuracy. Different judges may give different ratings to the same drawing.
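One way to put a number on agreement between judges is sketched below. It computes simple percent agreement and Cohen's kappa, a common chance-corrected agreement statistic that the text itself does not name. The ratings are hypothetical, and both function names are illustrative.

```python
from collections import Counter

def percent_agreement(ratings_a, ratings_b):
    """Proportion of items on which the two judges gave the same rating."""
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

def cohens_kappa(ratings_a, ratings_b):
    """Agreement corrected for the agreement expected by chance alone."""
    n = len(ratings_a)
    observed = percent_agreement(ratings_a, ratings_b)
    counts_a = Counter(ratings_a)
    counts_b = Counter(ratings_b)
    # Chance agreement: product of each judge's category proportions, summed.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    # Undefined (division by zero) if both judges use one identical category throughout.
    return (observed - expected) / (1 - expected)

# Hypothetical ratings of eight drawings by two independent judges.
judge_1 = ["accurate", "accurate", "poor", "poor", "accurate", "poor", "accurate", "poor"]
judge_2 = ["accurate", "poor", "poor", "poor", "accurate", "poor", "accurate", "accurate"]

agreement = percent_agreement(judge_1, judge_2)
kappa = cohens_kappa(judge_1, judge_2)
```

In this invented example the judges agree on 6 of 8 drawings, but half of that agreement could arise by chance, so kappa is lower than the raw agreement rate.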

Finally, an assessment tool must have validity: it must accurately measure what it is supposed to measure (Dehn, 2013; Wang et al., 2012). Suppose a weight scale reads 12 pounds every time a 10-pound bag of sugar is placed on it. Although the scale is reliable because its readings are consistent, those readings are not valid, or accurate.

validity A measure of the accuracy of a test’s or study’s results.
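The scale example can be expressed in a few lines, as a sketch that treats the spread of repeated readings as a stand-in for consistency and the average error as a stand-in for accuracy. The four readings are hypothetical.

```python
from statistics import mean, pstdev

true_weight = 10.0                     # the 10-pound bag of sugar
readings = [12.0, 12.0, 12.0, 12.0]    # hypothetical repeated readings, in pounds

spread = pstdev(readings)              # no variation: the scale is consistent
bias = mean(readings) - true_weight    # constant 2-pound error: the scale is inaccurate
```

A spread of zero alongside a nonzero bias is the pattern the text describes: a tool can be reliable without being valid.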

A given assessment tool may appear to be valid simply because it makes sense and seems reasonable. However, this sort of validity, called face validity, does not by itself mean that the instrument is trustworthy. A test for depression, for example, might include questions about how often a person cries. Because it makes sense that depressed people would cry, these test questions have face validity. It turns out, however, that many people cry a great deal for reasons other than depression, and some extremely depressed people do not cry at all. Thus an assessment tool should not be used unless it has high predictive validity or concurrent validity (Dehn, 2013; Osman et al., 2011).

Military concerns U.S. Army troops await their turn for psychological assessment at the Soldier Readiness Processing Center at Fort Hood, Texas. Many soldiers have developed significant psychological problems in recent years as a result of their repeated deployments to Iraq and Afghanistan, leading the Army to conduct assessments that might predict which individuals are particularly vulnerable to such reactions.

Predictive validity is a tool’s ability to predict future characteristics or behavior. Let’s say that a test has been developed to identify elementary schoolchildren who are likely to take up cigarette smoking in high school. The test gathers information about the children’s parents—their personal characteristics, smoking habits, and attitudes toward smoking—and on that basis identifies high-risk children. To establish the test’s predictive validity, investigators could administer it to a group of elementary school students, wait until they were in high school, and then check to see which children actually did become smokers.
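The follow-up check described above can be sketched as a comparison of each child's earlier prediction with the later outcome. The data are hypothetical, and the four outcome labels (hits, false alarms, misses, correct rejections) and the overall accuracy figure are one conventional way to summarize such a comparison, not a procedure prescribed by the text.

```python
def predictive_summary(predicted_high_risk, became_smoker):
    """Compare earlier predictions with later outcomes for the same children."""
    pairs = list(zip(predicted_high_risk, became_smoker))
    hits = sum(p and o for p, o in pairs)                        # flagged, later smoked
    false_alarms = sum(p and not o for p, o in pairs)            # flagged, never smoked
    misses = sum(not p and o for p, o in pairs)                  # not flagged, later smoked
    correct_rejections = sum(not p and not o for p, o in pairs)  # not flagged, never smoked
    return {
        "hits": hits,
        "false_alarms": false_alarms,
        "misses": misses,
        "correct_rejections": correct_rejections,
        "accuracy": (hits + correct_rejections) / len(pairs),
    }

# Hypothetical data for eight children: test result in elementary school,
# then whether each child was smoking in high school.
predicted = [True, True, True, False, False, False, False, True]
outcome = [True, True, False, False, False, True, False, True]

summary = predictive_summary(predicted, outcome)
```

The more often the early predictions match the later outcomes, the stronger the evidence for the test's predictive validity.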

Concurrent validity is the degree to which the measures gathered from one tool agree with the measures gathered from other assessment techniques. Participants’ scores on a new test designed to measure anxiety, for example, should correlate highly with their scores on other anxiety tests or with their behavior during clinical interviews.

How reliable and valid are the tests you take in school? What about the tests you see online and in print magazines?

Before any assessment technique can be fully useful, it must meet the requirements of standardization, reliability, and validity. No matter how insightful or clever a technique may be, clinicians cannot profitably use its results if they are uninterpretable, inconsistent, or inaccurate. Unfortunately, more than a few clinical assessment tools fall short, suggesting that at least some clinical assessments, too, miss their mark.

Clinical Interviews

Most of us feel instinctively that the best way to get to know people is to meet with them face to face. Under these circumstances, we can see them react to what we do and say, observe as well as listen as they answer, and generally get a sense of who they are. A clinical interview is just such a face-to-face encounter (Goldfinger & Pomerantz, 2014; Sommers-Flanagan & Sommers-Flanagan, 2013). If during a clinical interview a man looks as happy as can be while describing his sadness over the recent death of his mother, the clinician may suspect that the man actually has conflicting emotions about this loss.

Conducting the Interview

The interview is often the first contact between client and clinician. Clinicians use it to collect detailed information about the person’s problems and feelings, lifestyle and relationships, and other personal history. They may also ask about the person’s expectations of therapy and motives for seeking it. The clinician who worked with Franco began with a face-to-face interview:

BETWEEN THE LINES

Famous Movie Clinicians

Dr. Banks (Side Effects, 2013)

Dr. Patel (Silver Linings Playbook, 2012)

Dr. Logue (The King’s Speech, 2010)

Dr. Cawley (Shutter Island, 2010)

Dr. Steele (Changeling, 2008)

Dr. Rosen (A Beautiful Mind, 2001)

Dr. Crowe (The Sixth Sense, 1999)

Dr. Maguire (Good Will Hunting, 1997)

Dr. Lecter (The Silence of the Lambs, 1991; Hannibal, 2001; and Red Dragon, 2002)

Dr. Marvin (What About Bob?, 1991)

Dr. Sayer (Awakenings, 1990)

Dr. Sobel (Analyze This, 1999; and Analyze That, 2002)

Dr. Berger (Ordinary People, 1980)

Dr. Dysart (Equus, 1977)

Nurse Ratched (One Flew Over the Cuckoo’s Nest, 1975)

Drs. Petersen and Murchison (Spellbound, 1945)

Franco arrived for his appointment in gray sweatpants and a T-shirt. His stubble suggested that he had not shaved, and the many food stains on his shirt indicated he had not washed it for quite some time. Franco spoke without emotion. He slouched into the chair, sending signals that he did not want to be there.

When pressed, he talked about his two-year relationship with Maria, who, at 25, was 13 years younger than he was. Franco had believed that he had met his future wife, but Maria’s domineering mother was unhappy about the age difference and kept telling her daughter that she could find someone better. Franco wanted Maria to stand up to her mother and to move in with him, but this was not easy for her to do. Believing that Maria’s mother had too much influence over her and frustrated that she would not commit to him, he had broken up with Maria during a fight. He soon realized that he had acted impulsively, but Maria refused to take him back.

When asked about his childhood, Franco described his father’s death in a gruesome car crash on his way to pick up 12-year-old Franco from soccer practice. Initially, his father had told Franco that he could not come get him from practice, but Franco “threw a tantrum” and his father agreed to rearrange his schedule. Franco believed himself responsible for his father’s death.

Franco stated that, over the years, his mother had encouraged this feeling of self-blame by complaining that she had been forced to “give up her life” to raise Franco alone. She was always nasty to Franco and nasty to every woman he later dated. She even predicted that Franco would “die alone.”

Franco described being very unhappy throughout his school years. He hated school and felt less smart than the other kids. On occasion, a teacher’s critique—meant as encouragement—left him unable to do his homework for days, and his grades suffered. He truly believed he was stupid. Similarly, later in life, he interpreted his rise to a position as bank manager as due entirely to hard work. “I know I’m not as smart as the others there.”

Franco explained that since the breakup with Maria, he had experienced more unhappiness than ever before. He often spent all night watching television. At the same time, he could barely pay attention to what was happening on the screen. He said that some days he actually forgot to eat. He had no wish to see his friends. At work, the days blurred into one another, distinguished only by a growing number of reprimands from his bank supervisors. He attributed these work problems to his basic lack of ability. His supervisors had simply figured out that he had not been good enough for the job all along.

Beyond gathering basic background data of this kind, clinical interviewers give special attention to those topics they consider most important (Sommers-Flanagan & Sommers-Flanagan, 2013; Segal, June, & Marty, 2010). Psychodynamic interviewers try to learn about the person’s needs and memories of past events and relationships. Behavioral interviewers try to pinpoint information about the stimuli that trigger responses and their consequences. Cognitive interviewers try to discover assumptions and interpretations that influence the person. Humanistic clinicians ask about the person’s self-evaluation, self-concept, and values. Biological clinicians look for signs of biochemical or brain dysfunction. And sociocultural interviewers ask about the family, social, and cultural environments.

Interviews can be either unstructured or structured (Madill, 2012). In an unstructured interview, the clinician asks mostly open-ended questions, perhaps as simple as “Would you tell me about yourself?” The lack of structure allows the interviewer to follow leads and explore relevant topics that could not be anticipated before the interview.

In a structured interview, clinicians ask prepared—mostly specific—questions. Sometimes they use a published interview schedule—a standard set of questions designed for all interviews. Many structured interviews include a mental status exam, a set of questions and observations that systematically evaluate the client’s awareness, orientation with regard to time and place, attention span, memory, judgment and insight, thought content and processes, mood, and appearance (Sommers-Flanagan & Sommers-Flanagan, 2013). A structured format ensures that clinicians will cover the same kinds of important issues in all of their interviews and enables them to compare the responses of different individuals.

mental status exam A set of interview questions and observations designed to reveal the degree and nature of a client’s abnormal functioning.

Although most clinical interviews have both unstructured and structured portions, many clinicians favor one kind over the other. Unstructured interviews typically appeal to psychodynamic and humanistic clinicians, while structured formats are widely used by behavioral and cognitive clinicians, who need to pinpoint behaviors, attitudes, or thinking processes that may underlie abnormal behavior (Segal & Hersen, 2010).

What Are the Limitations of Clinical Interviews?

Although interviews often produce valuable information about people, there are limits to what they can accomplish. One problem is that they sometimes lack validity, or accuracy (Sommers-Flanagan & Sommers-Flanagan, 2013; Chang & Krosnick, 2010). Individuals may intentionally mislead in order to present themselves in a positive light or to avoid discussing embarrassing topics (Gold & Castillo, 2010). Or people may be unable to give an accurate report in their interviews. Individuals who suffer from depression, for example, take a pessimistic view of themselves and may describe themselves as poor workers or inadequate parents when that isn’t the case at all (Feliciano & Gum, 2010).

Interviewers too may make mistakes in judgments that slant the information they gather (Clinton, Fernandez, & Alicea, 2010). They usually rely too heavily on first impressions, for example, and give too much weight to unfavorable information about a client (Wu & Shi, 2005). Interviewer biases, including gender, race, and age biases, may also influence the interviewers’ interpretations of what a client says (Ungar et al., 2006).

Interviews, particularly unstructured ones, may also lack reliability (Sommers-Flanagan & Sommers-Flanagan, 2013; Davis et al., 2010). People respond differently to different interviewers, providing, for example, less information to a cold interviewer than to a warm and supportive one (Quas et al., 2007). Similarly, a clinician’s race, gender, age, and appearance may influence a client’s responses (Davis et al., 2010; Springman, Wherry, & Notaro, 2006).

Because different clinicians can obtain different answers and draw different conclusions even when they ask the same questions of the same person, some researchers believe that interviewing should be discarded as a tool of clinical assessment. As you’ll see, however, the two other kinds of clinical assessment methods also have serious limitations.

Clinical Tests

Clinical tests are devices for gathering information about a few aspects of a person’s psychological functioning, from which broader information about the person can be inferred. On the surface, it may look easy to design an effective test. Every month, magazines and Web sites present new tests that supposedly tell us about our personalities, relationships, sex lives, reactions to stress, or ability to succeed. Such tests might sound convincing, but most of them lack reliability, validity, and standardization. That is, they do not yield consistent, accurate information or say where we stand in comparison with others.

clinical test A device for gathering information about a few aspects of a person’s psychological functioning from which broader information about the person can be inferred.

The art of assessment Clinicians often view works of art as informal projective tests in which artists reveal their conflicts and mental stability. The sometimes bizarre cat portraits of early-twentieth-century artist Louis Wain, for example, have been interpreted as reflections of the psychosis with which he struggled for many years.

More than 500 clinical tests are currently in use throughout the United States. Clinicians use six kinds most often: projective tests, personality inventories, response inventories, psychophysiological tests, neurological and neuropsychological tests, and intelligence tests.

Projective Tests

Projective tests require that clients interpret vague stimuli, such as inkblots or ambiguous pictures, or follow open-ended instructions such as “Draw a person.” Theoretically, when clues and instructions are so general, people will “project” aspects of their personality into the task (Hogan, 2014). Projective tests are used primarily by psychodynamic clinicians to help assess the unconscious drives and conflicts they believe to be at the root of abnormal functioning (McGrath & Carroll, 2012; Baer & Blais, 2010). The most widely used projective tests are the Rorschach test, the Thematic Apperception Test, sentence-completion tests, and drawings.

projective test A test consisting of ambiguous material that people interpret or respond to.

RORSCHACH TEST

In 1911 Hermann Rorschach, a Swiss psychiatrist, experimented with the use of inkblots in his clinical work. He made thousands of blots by dropping ink on paper and then folding the paper in half to create a symmetrical but wholly accidental design, such as the one shown in Figure 4-1. Rorschach found that everyone saw images in these blots. In addition, the images a viewer saw seemed to correspond in important ways with his or her psychological condition. People diagnosed with schizophrenia, for example, tended to see images that differed from those described by people experiencing depression.

Figure 4-1: An inkblot similar to those used in the Rorschach test. In this test, individuals view and react to a total of 10 inkblot images.

Rorschach selected 10 inkblots and published them in 1921 with instructions for their use in assessment (see MindTech below). This set was called the Rorschach Psychodynamic Inkblot Test. Rorschach died just 8 months later, at the age of 37, but his work was continued by others, and his inkblots took their place among the most widely used projective tests of the twentieth century.

Despite its limitations, just about everyone has heard of the Rorschach. Why do you think it is so famous and popular?

Clinicians administer the “Rorschach,” as it is commonly called, by presenting one inkblot card at a time and asking respondents what they see, what the inkblot seems to be, or what it reminds them of. In the early years, Rorschach testers paid special attention to the themes and images that the inkblots brought to mind (Butcher, 2010; Weiner & Greene, 2008). Testers now also pay attention to the style of the responses: Do the clients view the design as a whole or see specific details? Do they focus on the blots or on the white spaces between them?

MindTech

Psychology’s Wiki Leaks?

In 2009, an emergency room physician posted the images of all 10 Rorschach cards, along with common responses to each card, on Wikipedia, the online encyclopedia. The publisher of the test, Hogrefe Publishing, immediately threatened to take Wikipedia to court, saying that the encyclopedia’s willingness to post the images was “unbelievably reckless” (Cohen, 2009). However, no legal actions took place, and to this day, the 10 cards remain on Wikipedia for the entire world to see.

Many psychologists have criticized the Wikipedia posting, arguing that the Rorschach test responses of patients who have previously seen the test on Wikipedia cannot be trusted. In support of their concerns, a recent study found that reading the Wikipedia Rorschach test article did indeed help many individuals perform more positively on the test itself (Schultz & Brabender, 2012). These clinical concerns are consistent with the long-standing positions of the British, Canadian, and American Psychological Associations, who hold that nonprofessional publications of psychological test answers are wrong and potentially harmful to patients (CPA, 2009; BPA, 2007; APA, 1996).

Still other critics point out that the online publication of the Rorschach cards jeopardizes the usefulness of thousands of published studies—studies that have tried to link patients’ Rorschach responses to particular psychological disorders (Cohen, 2009). These studies were conducted on first-time inkblot observers, not on people who had already viewed the cards online.

Why do you think this Rorschach debate has led to an increase in the distribution of psychological tests?

On the other hand, more than a few test skeptics seem very pleased by the online posting, hoping that it will lower the public’s regard for the test and lessen its clinical use (Radford, 2009). In fact, one recent study suggests that the Rorschach-Wikipedia debate has already led to unfavorable opinions of the test among many individuals (Schultz & Loving, 2012).

It appears that this debate is actually leading to an increase—rather than a decrease—in the distribution of psychological tests. Several newspapers reporting on the controversy have themselves published photos of the Rorschach cards (Simple, 2009; White, 2009). And as you will read later in this chapter, intelligence tests, among the most widely used of all psychological tests, are now available—on eBay of all places—to anyone who is willing to pay the price.

THEMATIC APPERCEPTION TEST

The Thematic Apperception Test (TAT) is a pictorial projective test (Aronow, Weiss, & Reznikoff, 2011; Morgan & Murray, 1935). People who take the TAT are commonly shown 30 cards with black-and-white pictures of individuals in vague situations and are asked to make up a dramatic story about each card. They must tell what is happening in the picture, what led up to it, what the characters are feeling and thinking, and what the outcome of the situation will be.

Clinicians who use the TAT believe that people always identify with one of the characters on each card. The stories are thought to reflect the individuals’ own circumstances, needs, and emotions. For example, a female client seems to be revealing her own feelings when telling this story about a TAT picture similar to the image shown in Figure 4-2:

Figure 4-2: A picture similar to one used in the Thematic Apperception Test.

This is a woman who has been quite troubled by memories of a mother she was resentful toward. She has feelings of sorrow for the way she treated her mother, her memories of her mother plague her. These feelings seem to be increasing as she grows older and sees her children treating her the same way that she treated her mother.

(Aiken, 1985, p. 372)

SENTENCE-COMPLETION TEST

In the sentence-completion test, first developed in the 1920s (Payne, 1928), the test-taker completes a series of unfinished sentences, such as “I wish …” or “My father….” The test is considered a good springboard for discussion and a quick and easy way to pinpoint topics to explore.

DRAWINGS

On the assumption that a drawing tells us something about its creator, clinicians often ask clients to draw human figures and talk about them (McGrath & Carroll, 2012). Evaluations of these drawings are based on the details and shape of the drawing, the solidity of the pencil line, the location of the drawing on the paper, the size of the figures, the features of the figures, the use of background, and comments made by the respondent during the drawing task. In the Draw-a-Person (DAP) test, the most popular of the drawing tests, individuals are first told to draw “a person” and then are instructed to draw a person of the other sex.

Drawing test Drawing tests are commonly used to assess the functioning of children. A popular one is the Kinetic Family Drawing test, in which children draw their household members performing some activity (“kinetic” means “active”).

WHAT ARE THE MERITS OF PROJECTIVE TESTS?

Until the 1950s, projective tests were the most commonly used method for assessing personality. In recent years, however, clinicians and researchers have relied on them largely to gain “supplementary” insights (Hogan, 2014; McGrath & Carroll, 2012). One reason for this shift is that practitioners who follow the newer models have less use for the tests than psychodynamic clinicians do. Even more important, the tests have not consistently shown much reliability or validity (Hogan, 2014; Wood et al., 2002).

In reliability studies, different clinicians have tended to score the same person’s projective test quite differently. Similarly, in validity studies, when clinicians try to describe a client’s personality and feelings on the basis of responses to projective tests, their conclusions often fail to match the self-report of the client, the view of the psychotherapist, or the picture gathered from an extensive case history (Bornstein, 2007).

Another validity problem is that projective tests are sometimes biased against minority ethnic groups (Costantino, Dana, & Malgady, 2007) (see Table 4-1). For example, people are supposed to identify with the characters in the TAT when they make up stories about them, yet no members of minority groups are in the TAT pictures. In response to this problem, some clinicians have developed other TAT-like tests with African American or Hispanic figures (Costantino et al., 2007, 1992).

Table 4-1: Multicultural Hot Spots in Assessment and Diagnosis

| Cultural Hot Spot | Effect on Assessment or Diagnosis |
|---|---|
| Immigrant Client | Dominant-Culture Assessor |
| Homeland culture may differ from current country’s dominant culture | May misread culture-bound reactions as pathology |
| May have left homeland to escape war or oppression | May overlook client’s vulnerability to posttraumatic stress |
| May have weak support systems in this country | May overlook client’s heightened vulnerability to stressors |
| Lifestyle (wealth and occupation) in this country may fall below lifestyle in homeland | May overlook client’s sense of loss and frustration |
| May refuse or be unable to learn dominant language | May misunderstand client’s assessment responses, or may overlook or misdiagnose client’s symptoms |
| Ethnic-Minority Client | Dominant-Culture Assessor |
| May reject or distrust members of dominant culture, including assessor | May experience little rapport with client, or may misinterpret client’s distrust as pathology |
| May be uncomfortable with dominant culture’s values (e.g., assertiveness, confrontation) and so find it difficult to apply clinician’s recommendations | May view client as unmotivated |
| May manifest stress in culture-bound ways (e.g., somatic symptoms such as stomachaches) | May misinterpret symptom patterns |
| May hold cultural beliefs that seem strange to dominant culture (e.g., belief in communication with dead) | May misinterpret cultural responses as pathology (e.g., a delusion) |
| May be uncomfortable during assessment | May overlook and feed into client’s discomfort |
| Dominant-Culture Assessor | Ethnic-Minority Client |
| May be unknowledgeable or biased about ethnic-minority culture | Cultural differences may be pathologized, or symptoms may be overlooked |
| May nonverbally convey own discomfort to ethnic-minority client | May become tense and anxious |

Information from: Rose et al., 2011; Bhattacharya et al., 2010; Dana, 2005, 2000; Westermeyer, 2004, 2001, 1993; López & Guarnaccia, 2005, 2000; Kirmayer, 2003, 2002, 2001; Sue & Sue, 2003; Tsai et al., 2001; Thakker & Ward, 1998.

Personality Inventories

An alternative way to collect information about individuals is to ask them to assess themselves. Respondents to a personality inventory answer a wide range of questions about their behavior, beliefs, and feelings. In the typical personality inventory, individuals indicate whether each of a long list of statements applies to them. Clinicians then use the responses to draw conclusions about the person’s personality and psychological functioning (Hogan, 2014; Watson, 2012).

personality inventory A test, designed to measure broad personality characteristics, consisting of statements about behaviors, beliefs, and feelings that people evaluate as either characteristic or uncharacteristic of them.

By far the most widely used personality inventory is the Minnesota Multiphasic Personality Inventory (MMPI) (Butcher, 2011). Two adult versions are available: the original test, published in 1945, and the MMPI-2, a 1989 revision that was itself revised in 2001. There is also a streamlined version of the inventory, the MMPI-2-Restructured Form, which was developed in 2008 with the use of more rigorous statistical techniques than those employed in the MMPI and MMPI-2. Finally, a special version of the test for adolescents, the MMPI-A, is also used widely (Williams & Butcher, 2011).

The MMPI consists of more than 500 self-statements, to be labeled “true,” “false,” or “cannot say.” The statements cover issues ranging from physical concerns to mood, sexual behaviors, and social activities. Altogether the statements make up 10 clinical scales, on each of which an individual can score from 0 to 120. When people score above 70 on a scale, their functioning on that scale is considered deviant. When the 10 scale scores are considered side by side, a pattern called a profile takes shape, indicating the person’s general personality. The 10 scales on the MMPI measure the following:

Hypochondriasis Items showing abnormal concern with bodily functions (“I have chest pains several times a week.”)

Depression Items showing extreme pessimism and hopelessness (“I often feel hopeless about the future.”)

Hysteria Items suggesting that the person may use physical or mental symptoms as a way of unconsciously avoiding conflicts and responsibilities (“My heart frequently pounds so hard I can feel it.”)

Psychopathic deviate Items showing a repeated and gross disregard for social customs and an emotional shallowness (“My activities and interests are often criticized by others.”)

Masculinity-femininity Items that are thought to separate male and female respondents (“I like to arrange flowers.”)

Paranoia Items that show abnormal suspiciousness and delusions of grandeur or persecution (“There are evil people trying to influence my mind.”)

Psychasthenia Items that show obsessions, compulsions, abnormal fears, and guilt and indecisiveness (“I save nearly everything I buy, even after I have no use for it.”)

Schizophrenia Items that show bizarre or unusual thoughts or behavior (“Things around me do not seem real.”)

Hypomania Items that show emotional excitement, overactivity, and flight of ideas (“At times I feel very ‘high’ or very ‘low’ for no apparent reason.”)

Social introversion Items that show shyness, little interest in people, and insecurity (“I am easily embarrassed.”)
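The scoring rule described above (10 clinical scales, each scored from 0 to 120, with scores above 70 considered deviant) can be sketched in a few lines of Python. This is an illustration only: the function name `elevated_scales` and the sample scores are hypothetical, and actual MMPI scoring involves norms and procedures not described in this section.

```python
# Illustrative sketch only: applies the cutoff of 70 stated in the text.
# The function name and the sample profile are hypothetical.
MMPI_SCALES = [
    "Hypochondriasis", "Depression", "Hysteria", "Psychopathic deviate",
    "Masculinity-femininity", "Paranoia", "Psychasthenia", "Schizophrenia",
    "Hypomania", "Social introversion",
]

DEVIANT_CUTOFF = 70  # scores above this value are considered deviant


def elevated_scales(profile):
    """Return the scales, in the order listed above, that score above the cutoff."""
    for name, score in profile.items():
        if name not in MMPI_SCALES:
            raise ValueError(f"unknown scale: {name}")
        if not 0 <= score <= 120:
            raise ValueError(f"score out of range for {name}: {score}")
    return [name for name in MMPI_SCALES
            if profile.get(name, 0) > DEVIANT_CUTOFF]


# A made-up profile; scales that are not listed are treated as not elevated.
sample_profile = {"Depression": 82, "Paranoia": 70, "Hypomania": 71}
print(elevated_scales(sample_profile))
```

Note that a score of exactly 70 is not flagged, because the text specifies scores above 70.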

The MMPI-2, the newer version of the MMPI, contains 567 items—many identical to those in the original, some rewritten to reflect current language (“upset stomach,” for instance, replaces “acid stomach”), and others that are new. Before being adopted, the MMPI-2 was tested on a more diverse group of people than was the original MMPI. Thus scores on the revised test are thought to be more accurate indicators of personality and abnormal functioning (Butcher, 2011, 2010).

The MMPI and other personality inventories have several advantages over projective tests (Hogan, 2014; Ben-Porath, 2012; Watson, 2012). Because they are computerized or paper-and-pencil tests, they do not take much time to administer, and they are objectively scored. Most of them are standardized, so one person’s scores can be compared with those of many others. Moreover, they often display greater test-retest reliability than projective tests (Zubeidat et al., 2011). For example, people who take the MMPI a second time after a period of less than 2 weeks receive approximately the same scores (Graham, 2014, 2006).
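One common way to put a number on test-retest reliability is to correlate the scores from the first and second testings. The sketch below assumes a Pearson correlation is used for this purpose. The function name `pearson_r` and the two lists of scores are invented for illustration and are not data from the studies cited above.

```python
from math import sqrt


def pearson_r(first, second):
    """Pearson correlation between two equally long lists of scores."""
    if len(first) != len(second) or len(first) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(first)
    mean_a = sum(first) / n
    mean_b = sum(second) / n
    covariation = sum((a - mean_a) * (b - mean_b) for a, b in zip(first, second))
    spread_a = sum((a - mean_a) ** 2 for a in first)
    spread_b = sum((b - mean_b) ** 2 for b in second)
    if spread_a == 0 or spread_b == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return covariation / sqrt(spread_a * spread_b)


# Hypothetical scores for five people tested twice, less than 2 weeks apart.
first_testing = [55, 62, 71, 48, 80]
second_testing = [57, 60, 73, 50, 78]
print(round(pearson_r(first_testing, second_testing), 2))
```

A value close to 1 would indicate that people kept roughly the same relative standing across the two testings, which is what high test-retest reliability means.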

Personality inventories also appear to have more validity, or accuracy, than projective tests (Butcher, 2011, 2010; Lanyon, 2007). However, they can hardly be considered highly valid. When clinicians have used these tests alone, they have not regularly been able to judge a respondent’s personality accurately (Braxton et al., 2007). One problem is that the personality traits that the tests seek to measure cannot be examined directly. How can we fully know a person’s character, emotions, and needs from self-reports alone?

Another problem is that despite the use of more diverse standardization groups by the MMPI-2 designers, this and other personality tests continue to have certain cultural limitations. Responses that indicate a psychological disorder in one culture may be normal responses in another (Butcher, 2010; Dana, 2005, 2000). In Puerto Rico, for example, where it is common to practice spiritualism, it would be normal to answer “true” to the MMPI item “Evil spirits possess me at times.” In other populations, that response could indicate psychopathology (Rogler, Malgady, & Rodriguez, 1989).

Despite such limits in validity, personality inventories continue to be popular. Research indicates that they can help clinicians learn about people’s personal styles and disorders as long as they are used in combination with interviews or other assessment tools.

BETWEEN THE LINES

A New Employment Screening Tool

More than 40 percent of companies use social networking sites to help screen job candidates. Why? To see whether candidates present themselves professionally (65%), are good fits for the company’s culture (51%), are qualified (45%), and/or are well rounded (35%).

(CareerBuilder, 2012)

Response Inventories

Like personality inventories, response inventories ask people to provide detailed information about themselves, but these tests focus on one specific area of functioning (Watson, 2012; Blais & Baer, 2010). For example, one such test may measure affect (emotion), another social skills, and still another cognitive processes. Clinicians can use the inventories to determine the role such factors play in a person’s disorder.

response inventories Tests designed to measure a person’s responses in one specific area of functioning, such as affect, social skills, or cognitive processes.

Affective inventories measure the severity of such emotions as anxiety, depression, and anger (Osman et al., 2008). In one of the most widely used affective inventories, the Beck Depression Inventory—an excerpt of which is shown in Table 4-2—people rate their level of sadness and its effect on their functioning (Wang & Gorenstein, 2013). For social skills inventories, used particularly by behavioral and family-social clinicians, respondents indicate how they would react in a variety of social situations (Vaz et al., 2013; Norton et al., 2010). Cognitive inventories reveal a person’s typical thoughts and assumptions and can help uncover counterproductive patterns of thinking (Takei et al., 2011; Glass & Merluzzi, 2000). They are, not surprisingly, often used by cognitive therapists and researchers.

Table 4-2 Sample Items from the Beck Depression Inventory

| Items | Rating | Inventory |
|---|---|---|
| Suicidal ideas | 0 | I don’t have any thoughts of killing myself. |
| | 1 | I have thoughts of killing myself but I would not carry them out. |
| | 2 | I would like to kill myself. |
| | 3 | I would kill myself if I had the chance. |
| Work inhibition | 0 | I can work about as well as before. |
| | 1 | It takes extra effort to get started at doing something. |
| | 2 | I have to push myself very hard to do anything. |
| | 3 | I can’t do any work at all. |
| Loss of libido | 0 | I have not noticed any recent change in my interest in sex. |
| | 1 | I am less interested in sex than I used to be. |
| | 2 | I am much less interested in sex now. |
| | 3 | I have lost interest in sex completely. |
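Each sample item in Table 4-2 is rated from 0 to 3. A minimal sketch of how such ratings might be combined follows, assuming (as is common for inventories of this kind) that the ratings are simply added together. The function name `total_score` and the example ratings are hypothetical, and only the three sample items from the table are used.

```python
VALID_RATINGS = (0, 1, 2, 3)  # the rating options shown for each item in Table 4-2


def total_score(ratings):
    """Add up the 0-3 ratings a respondent chose, one rating per item."""
    for item, rating in ratings.items():
        if rating not in VALID_RATINGS:
            raise ValueError(f"rating for {item!r} must be 0, 1, 2, or 3")
    return sum(ratings.values())


# Hypothetical respondent, using only the three sample items from the table.
sample_ratings = {"Suicidal ideas": 0, "Work inhibition": 2, "Loss of libido": 1}
print(total_score(sample_ratings))  # prints 3
```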

Both the number of response inventories and the number of clinicians who use them have increased steadily in the past 30 years (Black, 2005). At the same time, however, these inventories have major limitations. With the notable exceptions of the Beck Depression Inventory and a few others, many of the tests have not been subjected to careful standardization, reliability, and validity procedures (Blais & Baer, 2010; Weis & Smenner, 2007). Often they are created as a need arises, without being tested for accuracy and consistency.

Blink of the eye Before entering combat duty, this Marine takes an eyeblink test—a psychophysiological test in which sensors are attached to the eyelid and other parts of the face. The test tries to detect physical indicators of tension and anxiety and to predict which Marines might be particularly susceptible to posttraumatic stress disorder.

Psychophysiological Tests

Clinicians may also use psychophysiological tests, which measure physiological responses as possible indicators of psychological problems (Daly et al., 2014; Rodriguez-Ruiz et al., 2012). This practice began three decades ago, after several studies suggested that states of anxiety are regularly accompanied by physiological changes, particularly increases in heart rate, body temperature, blood pressure, skin reactions (galvanic skin response), and muscle contractions. The measuring of physiological changes has since played a key role in the assessment of certain psychological disorders.

psychophysiological test A test that measures physical responses (such as heart rate and muscle tension) as possible indicators of psychological problems.

One psychophysiological test is the polygraph, popularly known as a lie detector (Rosky, 2013; Boucsein, 2012; Meijer & Verschuere, 2010). Electrodes attached to various parts of a person’s body detect changes in breathing, perspiration, and heart rate while the person answers questions. The clinician observes these functions while the person answers “yes” to control questions—questions whose answers are known to be yes, such as “Are both your parents alive?” Then the clinician observes the same physiological functions while the person answers test questions, such as “Did you commit this robbery?” If breathing, perspiration, and heart rate suddenly increase, the person is suspected of lying.

Why might an innocent person “fail” a lie detector test? How might a guilty person manage to “pass” the test?

Like other kinds of clinical tests, psychophysiological tests have their drawbacks (Rusconi & Mitchener-Nissen, 2013). Many require expensive equipment that must be carefully tuned and maintained. In addition, psychophysiological measurements can be inaccurate and unreliable (see PsychWatch below). The laboratory equipment itself—elaborate and sometimes frightening—may arouse a participant’s nervous system and thus change his or her physical responses. Physiological responses may also change when they are measured repeatedly in a single session. Galvanic skin responses, for example, often decrease during repeated testing.

The EEG Electrodes pasted to the scalp help measure the brain waves of this baby boy.

Neurological and Neuropsychological Tests

Some problems in personality or behavior are caused primarily by damage to the brain or by changes in brain activity. Head injuries, brain tumors, brain malfunctions, alcoholism, infections, and other disorders can all cause such impairment. If a psychological dysfunction is to be treated effectively, it is important to know whether its primary cause is a physical abnormality in the brain.

A number of techniques may help pinpoint brain abnormalities. Some procedures, such as brain surgery, biopsy, and X ray, have been used for many years. More recently, scientists have developed a number of neurological tests, which are designed to measure brain structure and activity directly. One neurological test is the electroencephalogram (EEG), which records brain waves, the electrical activity that takes place within the brain as a result of neurons firing. In an EEG, electrodes placed on the scalp send brain-wave impulses to a machine that records them.

neurological test A test that directly measures brain structure or activity.

PsychWatch

The Truth, the Whole Truth, and Nothing but the Truth

All the rage A student learns to administer polygraph exams at the Latin American Polygraph Institute in Bogota, Colombia. Despite evidence that these tests are often invalid, they are widely used by businesses in Colombia, where deception by employees has become a major problem.

In movies, criminals being grilled by the police reveal their guilt by sweating, shaking, cursing, or twitching. When they are hooked up to a polygraph (a lie detector), the needles bounce all over the paper. This image has been with us since World War I, when some clinicians developed the theory that people who are telling lies display systemic changes in their breathing, perspiration, and heart rate (Marston, 1917).

The danger of relying on polygraph tests is that, according to researchers, they do not work as well as we would like (Rosky, 2013; Rusconi & Mitchener-Nissen, 2013; Meijer & Verschuere, 2010). The public did not pay much attention to this inconvenient fact until the mid-1980s, when the American Psychological Association officially reported that polygraphs were often inaccurate and the U.S. Congress voted to restrict their use in criminal prosecution and employment screening (Krapohl, 2002). Research indicates that 8 out of 100 truths, on average, are called lies in polygraph testing (Grubin, 2010; Raskin & Honts, 2002; MacLaren, 2001). Imagine, then, how many innocent people might be convicted of crimes if polygraph findings were taken as valid evidence in criminal trials.
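The arithmetic behind that concern can be made concrete. The sketch below takes the reported average of 8 truths in 100 being called lies and applies it to a group of truthful people. It assumes, for simplicity, that each person is judged on a single truthful answer. The function name and the group sizes are hypothetical.

```python
TRUTHS_CALLED_LIES_PER_100 = 8  # average figure reported in the text


def expected_false_accusations(truthful_people):
    """Expected number of truthful people whose answer is wrongly called a lie."""
    if truthful_people < 0:
        raise ValueError("number of people cannot be negative")
    return truthful_people * TRUTHS_CALLED_LIES_PER_100 / 100


# Among 1,000 truthful people, about 80 would be expected to "fail."
print(expected_false_accusations(1000))  # prints 80.0
```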

Given such findings, polygraphs are less trusted and less popular today than they once were. For example, few courts now admit results from such tests as evidence of criminal guilt (Grubin, 2010; Daniels, 2002). Polygraph testing has by no means disappeared, however. The FBI uses it extensively, parole boards and probation offices routinely use it to help decide whether to release convicted offenders, and in public-sector hiring (such as for police officers), the use of polygraph screening may actually be on the increase (Meijer & Verschuere, 2010; Kokish et al., 2005).

Other neurological tests actually take “pictures” of brain structure or brain activity. These tests, called neuroimaging, or brain scanning, techniques, include computerized axial tomography (CAT scan or CT scan), in which X rays of the brain’s structure are taken at different angles and combined; positron emission tomography (PET scan), a computer-produced motion picture of chemical activity throughout the brain; and magnetic resonance imaging (MRI), a procedure that uses the magnetic property of certain hydrogen atoms in the brain to create a detailed picture of the brain’s structure.

neuroimaging techniques Neurological tests that provide images of brain structure or activity, such as CT scans, PET scans, and MRIs. Also called brain scans.

A more recent version of the MRI, functional magnetic resonance imaging (fMRI), converts MRI pictures of brain structures into detailed pictures of neuron activity. In this procedure, an MRI scanner detects rapid changes in the flow or volume of blood in areas across the brain while an individual is experiencing emotions or performing specific cognitive tasks. By interpreting these blood changes as indications of neuron activity at sites throughout the brain, a computer then generates images of the brain areas that are active during the individual’s emotional experiences or cognitive behaviors, thus offering a picture of the functioning brain. Partly because fMRI-produced images of brain functioning are so much clearer than PET scan images, the fMRI has generated enormous enthusiasm among brain researchers since it was first developed in 1990.

Traditional scanning The most widely used neuroimaging techniques in clinical practice—the MRI (lower left), CAT, and PET—take pictures of the living brain. Here, an MRI scan (above left) reveals a large tumor, colored in orange; a CAT scan (above center) reveals a mass of blood within the brain; and a PET scan (above right) shows which areas of the brain are active (those colored in red, orange, and yellow) when an individual is being stimulated. Images clockwise from top left: Pallava Bagla/Corbis; Lester V. Bergman/Corbis; Roger Ressmeyer/Corbis; Glowimages/Corbis.

Though widely used, these techniques are sometimes unable to detect subtle brain abnormalities. Clinicians have therefore developed less direct but sometimes more revealing neuropsychological tests that measure cognitive, perceptual, and motor performances on certain tasks; clinicians interpret abnormal performances as an indicator of underlying brain problems (Hogan, 2014; Summers & Saunders, 2012). Brain damage is especially likely to affect visual perception, memory, and visual-motor coordination, so neuropsychological tests focus particularly on these areas. The famous Bender Visual-Motor Gestalt Test, for example, consists of nine cards, each displaying a simple geometrical design. Patients look at the designs one at a time and copy each one on a piece of paper. Later they try to redraw the designs from memory. Notable errors in accuracy by individuals older than 12 are thought to reflect organic brain impairment. Clinicians often use a battery, or series, of neuropsychological tests, each targeting a specific skill area (Flanagan, Ortiz, & Alfonso, 2013; Reitan & Wolfson, 2005, 1996).

neuropsychological test A test that detects brain impairment by measuring a person’s cognitive, perceptual, and motor performances.

Intelligence Tests

An early definition of intelligence described it as “the capacity to judge well, to reason well, and to comprehend well” (Binet & Simon, 1916, p. 192). Because intelligence is an inferred quality rather than a specific physical process, it can be measured only indirectly. In 1905, French psychologist Alfred Binet and his associate Théodore Simon produced an intelligence test consisting of a series of tasks requiring people to use various verbal and nonverbal skills. The general score derived from this and later intelligence tests is termed an intelligence quotient (IQ), so called because initially it represented the ratio of a person’s “mental” age to his or her “chronological” age, multiplied by 100.
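The original ratio definition of IQ can be written out as a short calculation. The sketch below covers only that early formula, and the function name `ratio_iq` and the example ages are hypothetical.

```python
def ratio_iq(mental_age, chronological_age):
    """Original ratio IQ: mental age divided by chronological age, times 100."""
    if chronological_age <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age / chronological_age * 100


# An 8-year-old who performs like a typical 10-year-old:
print(ratio_iq(10, 8))  # prints 125.0
# An 8-year-old who performs like a typical 8-year-old:
print(ratio_iq(8, 8))   # prints 100.0
```

Under this formula, a score of 100 means that mental age and chronological age are equal.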

intelligence test A test designed to measure a person’s intellectual ability.

intelligence quotient (IQ) An overall score derived from intelligence tests.

There are now more than 100 intelligence tests available. As you will see in Chapter 17, intelligence tests play a key role in the diagnosis of intellectual disability (mental retardation), and they can also help clinicians identify other problems (Hogan, 2014; Mishak, 2014; Dehn, 2013).

Intelligence tests are among the most carefully produced of all clinical tests (Bowden et al., 2011; Kellerman & Burry, 2007). Because they have been standardized on large groups of people, clinicians have a good idea how each individual’s score compares with the performance of the population at large. These tests have also shown very high reliability: people who repeat the same IQ test years later receive approximately the same score. Finally, the major IQ tests appear to have fairly high validity: children’s IQ scores often correlate with their performance in school, for example.

How might IQ scores be misused by school officials, parents, or other individuals? Why is society preoccupied with these scores?

Nevertheless, intelligence tests have some key shortcomings. Factors that have nothing to do with intelligence, such as low motivation or high anxiety, can greatly influence test performance (Chaudhry & Ready, 2012) (see MediaSpeak below). In addition, IQ tests may contain cultural biases in their language or tasks that place people of one background at an advantage over those of another (Goldfinger & Pomerantz, 2014; Tanzer, Hof, & Jackson, 2010). Similarly, members of some minority groups may have little experience with this kind of test, or they may be uncomfortable with test examiners of a majority ethnic background. Either way, their performances may suffer.

MediaSpeak

Intelligence Tests Too? eBay and the Public Good

Michelle Roberts, Associated Press

Intelligence tests … are for sale on eBay Inc.’s online auction site, and the test maker is worried they will be misused.

The series of Wechsler intelligence tests, made by San Antonio-based Harcourt Assessment, Inc., are supposed to be sold to and administered by only clinical psychologists and trained professionals.

The Wechsler Adult Intelligence Scale-Revised (WAIS-R) This widely used intelligence test has 11 sub-tests, which cover such areas as factual information, memory, vocabulary, arithmetic, design, and eye-hand coordination.

Given more than a million times a year nationwide, according to Harcourt, the intelligence tests often are among numerous tests ordered by prosecutors and defense attorneys to determine the mental competence of criminal defendants. A low IQ, for example, can be used to argue leniency in sentencing.

Schools use the tests to determine whether to place a student in a special program, whether for gifted or struggling students. Harcourt officials say they fear the tests for sale on eBay will be misused for coaching by lawyers or parents.

When free enterprise principles conflict with psychological well being, how should the matter be resolved?

But eBay has denied their request to restrict the sale of the tests. eBay officials say there is nothing illegal about selling the tests, and it cannot monitor every possible misuse of items sold through its network of 248 million buyers and sellers. [The tests continue to be available on eBay as of 2014.] Five of the tests were listed for sale … for about $175 to $900. The latest edition of the adult test, which retails for $939, was offered on eBay for $249.99.

“In order for it to maintain its integrity, there needs to be limited availability,” said [a] Harcourt spokesman…. “Misinterpreting the results [of questions and tasks on the tests], even without malicious intent, could lead to mistakes in assessing a child’s intelligence….”

IQ Tests for Sale on eBay by Michelle Roberts, The Associated Press, 12/18/2007. Used with permission of The Associated Press Copyright © 2014. All rights reserved.

Clinical Observations

In addition to interviewing and testing people, clinicians may systematically observe their behavior. In one technique, called naturalistic observation, clinicians observe clients in their everyday environments. In another, analog observation, they observe them in an artificial setting, such as a clinical office or laboratory. Finally, in self-monitoring, clients are instructed to observe themselves.

Naturalistic and Analog Observations

Naturalistic clinical observations usually take place in homes, schools, institutions such as hospitals and prisons, or community settings. Most of them focus on parent–child, sibling–sibling, or teacher–child interactions and on fearful, aggressive, or disruptive behavior (Hughes, Bullock, & Coplan, 2013; Lindhiem, Bernard, & Dozier, 2011). Often such observations are made by participant observers (key people in the client’s environment) and reported to the clinician.

When naturalistic observations are not practical, clinicians may resort to analog observations, often aided by special equipment such as a video camera or one-way mirror (Lindhiem et al., 2011; Haynes, 2001). Analog observations often have focused on children interacting with their parents, married couples attempting to settle a disagreement, speech-anxious people giving a speech, and phobic people approaching an object they find frightening.

Although much can be learned from actually witnessing behavior, clinical observations have certain disadvantages. For one thing, they are not always reliable. It is possible for various clinicians who observe the same person to focus on different aspects of behavior, assess the person differently, and arrive at different conclusions (Meersand, 2011). Careful training of observers and the use of observer checklists can help reduce this problem.
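One simple way to check how closely two observers agree is to compare their checklists item by item. The sketch below uses plain percent agreement, which is an illustrative choice and not a method named in the text. The function name and the checklist entries are hypothetical.

```python
def percent_agreement(observer_a, observer_b):
    """Percentage of checklist items on which two observers recorded the same thing."""
    if len(observer_a) != len(observer_b) or not observer_a:
        raise ValueError("both checklists must cover the same, non-empty set of items")
    matches = sum(a == b for a, b in zip(observer_a, observer_b))
    return 100 * matches / len(observer_a)


# Hypothetical checklists: True means the behavior was seen during an interval.
clinician_1 = [True, False, True, True]
clinician_2 = [True, True, True, False]
print(percent_agreement(clinician_1, clinician_2))  # prints 50.0
```

Low agreement on a checklist like this would be one sign that observers need further training or clearer definitions of the behaviors being recorded.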

An ideal observation Using a one-way mirror, a clinical observer is able to view a mother interacting with her child without distracting the duo or influencing their behaviors.

Similarly, observers may make errors that affect the validity, or accuracy, of their observations (Wilson et al., 2010; Aiken & Groth-Marnat, 2006). The observer may suffer from overload and be unable to see or record all of the important behaviors and events. Or the observer may experience observer drift, a steady decline in accuracy as a result of fatigue or of a gradual unintentional change in the standards used when an observation continues for a long period of time. Another possible problem is observer bias—the observer’s judgments may be influenced by information and expectations he or she already has about the person (Hróbjartsson et al., 2014; Pellegrini, 2011).

A client’s reactivity may also limit the validity of clinical observations; that is, his or her behavior may be affected by the very presence of the observer (Mowery et al., 2010; Norton et al., 2010). If schoolchildren are aware that someone special is watching them, for example, they may change their usual classroom behavior, perhaps in the hope of creating a good impression (Lane et al., 2011).

Finally, clinical observations may lack cross-situational validity. A child who behaves aggressively in school is not necessarily aggressive at home or with friends after school. Because behavior is often specific to particular situations, observations in one setting cannot always be applied to other settings (Kagan, 2007).

Self-Monitoring

As you saw earlier, personality and response inventories are tests in which individuals report their own behaviors, feelings, or cognitions. In a related assessment procedure, self-monitoring, people observe themselves and carefully record the frequency of certain behaviors, feelings, or thoughts as they occur over time (Huh et al., 2013; Wright & Truax, 2008). How frequently, for instance, does a drug user have an urge for drugs or a headache sufferer have a headache? Self-monitoring is especially useful in assessing behavior that occurs so infrequently that it is unlikely to be seen during other kinds of observations. It is also useful for behaviors that occur so frequently that any other method of observing them in detail would be impossible, such as smoking, drinking, or other drug use. Finally, self-monitoring may be the only way to observe and measure private thoughts or perceptions.
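A self-monitoring record is essentially a running log that is later tallied. The sketch below assumes the client notes the day and the behavior each time it occurs. The log format, the function name `daily_counts`, and the entries are all hypothetical.

```python
from collections import Counter


def daily_counts(entries, behavior):
    """Count how many times one behavior was recorded on each day."""
    return Counter(day for day, recorded in entries if recorded == behavior)


# Hypothetical log kept by a client: one (day, behavior) pair per occurrence.
log = [
    ("Mon", "cigarette"), ("Mon", "cigarette"),
    ("Tue", "cigarette"), ("Tue", "headache"),
]
print(daily_counts(log, "cigarette"))
```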

BETWEEN THE LINES

In Their Words

“You can observe a lot just by watching.”

Yogi Berra

Like all other clinical assessment procedures, however, self-monitoring has drawbacks (Huh et al., 2013; Baranski, 2011). Here too validity is often a problem. People do not always manage or try to record their observations accurately. Furthermore, when people monitor themselves, they may change their behaviors unintentionally (Huh et al., 2013; Otten, 2004). Smokers, for example, often smoke fewer cigarettes than usual when they are monitoring themselves, and teachers who monitor themselves give more positive and fewer negative comments to their students.