3.1 Clinical Assessment: How and Why Does the Client Behave Abnormally?

assessment The process of collecting and interpreting relevant information about a client or research participant.

Assessment is simply the collecting of relevant information in an effort to reach a conclusion. It goes on in every realm of life. We make assessments when we decide what cereal to buy or which presidential candidate to vote for. College admissions officers, who have to select the “best” of the students applying to their college, depend on academic records, recommendations, achievement test scores, interviews, and application forms to help them decide. Employers, who have to predict which applicants are most likely to be effective workers, collect information from résumés, interviews, references, and perhaps on-the-job observations.

Clinical assessment is used to determine whether, how, and why a person is behaving abnormally and how that person may be helped. It also enables clinicians to evaluate people’s progress after they have been in treatment for a while and decide whether the treatment should be changed. The hundreds of clinical assessment techniques and tools that have been developed fall into three categories: clinical interviews, tests, and observations. To be useful, these tools must be standardized and must have clear reliability and validity.

Characteristics of Assessment Tools

standardization The process in which a test is administered to a large group of people whose performance then serves as a standard or norm against which any individual’s score can be measured.

All clinicians must follow the same procedures when they use a particular type of assessment tool. To standardize such a tool is to set up common steps to be followed whenever it is administered. Similarly, clinicians must standardize the way they interpret the results of an assessment tool in order to be able to understand what a particular score means. They may standardize the scores of a test, for example, by first administering it to a group of research participants whose performance will then serve as a common standard, or norm, against which later individual scores can be measured. The group that initially takes the test must be typical of the larger population for whom the test is intended. If an aggressiveness test meant for the public at large were standardized on a group of Marines, for example, the resulting “norm” might turn out to be misleadingly high (Hogan, 2014).
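
The effect of the norm group can be illustrated with a little arithmetic. The sketch below is only an illustration: the scores are invented, and the helper name `standard_score` is hypothetical rather than part of any published test. It expresses one person's raw score as a distance from the norm group's mean, measured in standard deviations.

```python
from statistics import mean, stdev

def standard_score(raw_score, norm_scores):
    """Distance of a raw score from the norm-group mean, in standard deviations."""
    return (raw_score - mean(norm_scores)) / stdev(norm_scores)

# Hypothetical aggressiveness scores (illustrative numbers only).
general_public = [40, 45, 50, 55, 60]
marines = [60, 65, 70, 75, 80]

client_score = 60
z_public = standard_score(client_score, general_public)
z_marines = standard_score(client_score, marines)

print(round(z_public, 2), round(z_marines, 2))
```

With these made-up numbers, the same raw score of 60 falls above the mean of the general-public norm but below the mean of the Marine norm, which is why the choice of standardization group matters.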

Reliable assessments Former National Basketball Association stars Clyde Drexler, James Worthy, Brent Barry, Dominique Wilkins, and Julius Erving served as judges at the 2011 All-Star slam dunk contest. Holding up their scores after each dunk, they displayed high interrater reliability and showed they still know a great dunk when they see one.

reliability A measure of the consistency of test or research results.

Reliability refers to the consistency of assessment measures. A good assessment tool will always yield similar results in the same situation (Dehn, 2013). An assessment tool has high test–retest reliability, one kind of reliability, if it yields similar results every time it is given to the same people. If a woman’s responses on a particular test indicate that she is generally a heavy drinker, the test should produce a similar result when she takes it again a week later. To measure test–retest reliability, participants are tested on two occasions and the two scores are correlated (Holden & Bernstein, 2013). The higher the correlation (see Chapter 1), the greater the test’s reliability.
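
As a rough illustration of how the two sets of scores might be correlated, the sketch below computes a Pearson correlation coefficient by hand. The scores are invented, and the function name `pearson_r` is simply a label chosen for this example. The text does not say which correlation statistic researchers use, so the choice of Pearson's r is an assumption.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    # Sum of cross-products of deviations from each mean.
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cross / (spread_x * spread_y)

# Hypothetical scores for five participants tested one week apart.
first_testing = [12, 18, 25, 31, 40]
second_testing = [14, 17, 27, 30, 41]

r = pearson_r(first_testing, second_testing)
print(round(r, 2))
```

A value near 1.0, as with these invented scores, would suggest that participants kept roughly the same standing from one testing to the next.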

An assessment tool shows high interrater (or interjudge) reliability, another kind of reliability, if different judges independently agree on how to score and interpret it. True–false and multiple-choice tests yield consistent scores no matter who evaluates them, but other tests require that the evaluator make a judgment. Consider a test that requires the person to draw a copy of a picture, which a judge then rates for accuracy. Different judges may give different ratings to the same drawing.
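
One common way to put a number on interrater agreement, not named in the text above, is Cohen's kappa, which compares the agreement two judges actually reach with the agreement expected by chance alone. The sketch below is a minimal version under that assumption. The ratings are invented, and `cohens_kappa` is a name chosen for this example.

```python
from collections import Counter

def cohens_kappa(ratings_a, ratings_b):
    """Chance-corrected agreement between two judges' categorical ratings.

    Assumes the judges do not both use a single category for every item
    (otherwise expected agreement is 1 and kappa is undefined).
    """
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    counts_a, counts_b = Counter(ratings_a), Counter(ratings_b)
    # Agreement expected if each judge rated at random with the same base rates.
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    return (observed - expected) / (1 - expected)

# Hypothetical accuracy ratings of six drawings by two judges.
judge_a = ["accurate", "accurate", "poor", "poor", "accurate", "poor"]
judge_b = ["accurate", "poor", "poor", "poor", "accurate", "accurate"]

kappa = cohens_kappa(judge_a, judge_b)
print(round(kappa, 2))
```

Here the judges agree on four of six drawings, but because half of that agreement could arise by chance, kappa comes out well below 1.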

validity A measure of the accuracy of a test’s or study’s results.

Finally, an assessment tool must have validity: it must accurately measure what it is supposed to measure (Dehn, 2013). Suppose a weight scale reads 12 pounds every time a 10-pound bag of sugar is placed on it. Although the scale is reliable because its readings are consistent, those readings are not valid, or accurate.
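
The scale example can be restated in a few lines of arithmetic. In this sketch the variable names are illustrative only: the spread of repeated readings stands in for consistency (reliability), and the gap between the average reading and the true weight stands in for accuracy (validity).

```python
# Four repeated readings of the same 10-pound bag of sugar.
readings = [12.0, 12.0, 12.0, 12.0]
true_weight = 10.0

# Consistency: how much the readings vary from one weighing to the next.
spread = max(readings) - min(readings)

# Accuracy: how far the average reading sits from the true weight.
bias = sum(readings) / len(readings) - true_weight

print(spread, bias)
```

A spread of zero alongside a bias of 2 pounds is the pattern described above: perfectly consistent, yet consistently wrong.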

A given assessment tool may appear to be valid simply because it makes sense and seems reasonable. However, this sort of validity, called face validity, does not by itself mean that the instrument is trustworthy. A test for depression, for example, might include questions about how often a person cries. Because it makes sense that depressed people would cry, these test questions have face validity. It turns out, however, that many people cry a great deal for reasons other than depression, and some extremely depressed people do not cry at all. Thus an assessment tool should not be used unless it has high predictive validity or concurrent validity (Dehn, 2013).

Predictive validity is a tool’s ability to predict future characteristics or behavior. Let’s say that a test has been developed to identify elementary schoolchildren who are likely to take up cigarette smoking in high school. The test gathers information about the children’s parents—their personal characteristics, smoking habits, and attitudes toward smoking—and on that basis identifies high-risk children. To establish the test’s predictive validity, investigators could administer it to a group of elementary school students, wait until they were in high school, and then check to see which children actually did become smokers.
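
One way investigators might summarize such a follow-up is to count how often the test's earlier predictions matched what later happened. The sketch below uses invented data for eight children and reports two common summaries, sensitivity and specificity. These terms and the function name are assumptions added for illustration and do not come from the text.

```python
def sensitivity_specificity(predicted, actual):
    """Share of actual smokers flagged, and share of nonsmokers not flagged."""
    pairs = list(zip(predicted, actual))
    true_pos = sum(p and a for p, a in pairs)
    true_neg = sum((not p) and (not a) for p, a in pairs)
    false_pos = sum(p and (not a) for p, a in pairs)
    false_neg = sum((not p) and a for p, a in pairs)
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    return sensitivity, specificity

# Hypothetical data: test result in elementary school, outcome in high school.
flagged_high_risk = [True, True, True, False, False, False, False, True]
became_smoker = [True, True, False, False, False, True, False, True]

sens, spec = sensitivity_specificity(flagged_high_risk, became_smoker)
print(sens, spec)
```

The closer both figures are to 1.0, the better the earlier test results anticipated who did and did not take up smoking.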

Concurrent validity is the degree to which the measures gathered from one tool agree with the measures gathered from other assessment techniques. Participants’ scores on a new test designed to measure anxiety, for example, should correlate highly with their scores on other anxiety tests or with their behavior during clinical interviews.

How reliable and valid are the tests you take in school? What about the tests you see online and in print magazines?

Before any assessment technique can be fully useful, it must meet the requirements of standardization, reliability, and validity. No matter how insightful or clever a technique may be, clinicians cannot profitably use its results if they are uninterpretable, inconsistent, or inaccurate. Unfortunately, more than a few clinical assessment tools fall short, suggesting that at least some clinical assessments, too, miss their mark.

Clinical Interviews

Most of us feel instinctively that the best way to get to know people is to meet with them face to face. Under these circumstances, we can see them react to what we do and say, observe as well as listen as they answer, and generally get a sense of who they are. A clinical interview is just such a face-to-face encounter (Miller, 2015; Goldfinger & Pomerantz, 2014). If during a clinical interview a man looks as happy as can be while describing his sadness over the recent death of his mother, the clinician may suspect that the man actually has conflicting emotions about this loss.

Conducting the Interview The interview is often the first contact between client and clinician. Clinicians use it to collect detailed information about the person’s problems and feelings, lifestyle and relationships, and other personal history. They may also ask about the person’s expectations of therapy and motives for seeking it. The clinician who worked with Franco began with a face-to-face interview:

Franco arrived for his appointment in gray sweatpants and a T-shirt. His stubble suggested that he had not shaved, and the many food stains on his shirt indicated he had not washed it for quite some time. Franco spoke without emotion. He slouched into the chair, sending signals that he did not want to be there.

When pressed, he talked about his two-year relationship with Maria, who, at 25, was 13 years younger than he was. Franco had believed that he had met his future wife, but Maria’s domineering mother was unhappy about the age difference and kept telling her daughter that she could find someone better. Franco wanted Maria to stand up to her mother and to move in with him, but this was not easy for her to do. Believing that Maria’s mother had too much influence over her and frustrated that she would not commit to him, he had broken up with Maria during a fight. He soon realized that he had acted impulsively, but Maria refused to take him back.

BETWEEN THE LINES

The Stigma Continues

33% Americans who would not seek counseling for fear of being labeled “mentally ill”
67% Americans who would not tell their employer that they were seeking mental health treatment
37% Americans who would be reluctant to seek treatment because of confidentiality concerns

(Opinion Research Corporation, 2011, 2004)

When asked about his childhood, Franco described his father’s death in a gruesome car crash on his way to pick up 12-year-old Franco from soccer practice. Initially, his father had told Franco that he could not come get him from practice, but Franco “threw a tantrum” and his father agreed to rearrange his schedule. Franco believed himself responsible for his father’s death.

Franco stated that, over the years, his mother had encouraged this feeling of self-blame by complaining that she had been forced to “give up her life” to raise Franco alone. She was always nasty to Franco and nasty to every woman he later dated. She even predicted that Franco would “die alone.”

Franco described being very unhappy throughout his school years. He hated school and felt less smart than the other kids. On occasion, a teacher’s critique—meant as encouragement—left him unable to do his homework for days, and his grades suffered. He truly believed he was stupid. Similarly, later in life, he interpreted his rise to a position as bank manager as due entirely to hard work. “I know I’m not as smart as the others there.”

Franco explained that since the breakup with Maria, he had experienced more unhappiness than ever before. He often spent all night watching television. At the same time, he could barely pay attention to what was happening on the screen. He said that some days he actually forgot to eat. He had no wish to see his friends. At work, the days blurred into one another, distinguished only by a growing number of reprimands from his bank supervisors. He attributed these work problems to his basic lack of ability. His supervisors had simply figured out that he had not been good enough for the job all along.

BETWEEN THE LINES

A New Employment Screening Tool

More than 40 percent of companies use social networking sites to help screen job candidates. Why? To see whether candidates present themselves professionally (65%), are good fits for the company’s culture (51%), are qualified (45%), and/or are well rounded (35%).

(CareerBuilder, 2012)

Beyond gathering basic background data of this kind, clinical interviewers give special attention to those topics they consider most important (Miller, 2015; Segal, June, & Marty, 2010). Psychodynamic interviewers try to learn about the person’s needs and memories of past events and relationships. Behavioral interviewers try to pinpoint information about the stimuli that trigger responses and their consequences. Cognitive interviewers try to discover assumptions and interpretations that influence the person. Humanistic clinicians ask about the person’s self-evaluation, self-concept, and values. Biological clinicians look for signs of biochemical or brain dysfunction. And sociocultural interviewers ask about the family, social, and cultural environments.

Interviews can be either unstructured or structured. In an unstructured interview, the clinician asks mostly open-ended questions, perhaps as simple as “Would you tell me about yourself?” The lack of structure allows the interviewer to follow leads and explore relevant topics that could not be anticipated before the interview.

mental status exam A set of interview questions and observations designed to reveal the degree and nature of a client’s abnormal functioning.

In a structured interview, clinicians ask prepared—mostly specific—questions. Sometimes they use a published interview schedule—a standard set of questions designed for all interviews. Many structured interviews include a mental status exam, a set of questions and observations that systematically evaluate the client’s awareness, orientation with regard to time and place, attention span, memory, judgment and insight, thought content and processes, mood, and appearance (Sommers-Flanagan & Sommers-Flanagan, 2013). A structured format ensures that clinicians will cover the same kinds of important issues in all of their interviews and enables them to compare the responses of different individuals.

Military concerns U.S. Army troops await their turn for psychological assessment at the Soldier Readiness Processing Center at Fort Hood, Texas. Many soldiers have developed significant psychological problems in recent years as a result of their repeated deployments to Iraq and Afghanistan, leading the Army to conduct assessments that might predict which individuals are particularly vulnerable to such reactions.

Although most clinical interviews have both unstructured and structured portions, many clinicians favor one kind over the other. Unstructured interviews typically appeal to psychodynamic and humanistic clinicians, while structured formats are widely used by behavioral and cognitive clinicians, who need to pinpoint behaviors, attitudes, or thinking processes that may underlie abnormal behavior (Segal & Hersen, 2010).

What Are the Limitations of Clinical Interviews? Although interviews often produce valuable information about people, there are limits to what they can accomplish. One problem is that they sometimes lack validity, or accuracy (Sommers-Flanagan & Sommers-Flanagan, 2013). Individuals may intentionally mislead in order to present themselves in a positive light or to avoid discussing embarrassing topics (Gold & Castillo, 2010). Or people may be unable to give an accurate report in their interviews. Individuals who suffer from depression, for example, take a pessimistic view of themselves and may describe themselves as poor workers or inadequate parents when that isn’t the case at all.

Interviewers too may make mistakes in judgments that slant the information they gather (Clinton, Fernandez, & Alicea, 2010). They usually rely too heavily on first impressions, for example, and give too much weight to unfavorable information about a client (Wu & Shi, 2005). Interviewer biases, including gender, race, and age biases, may also influence the interviewers’ interpretations of what a client says (Ungar et al., 2006).

Interviews, particularly unstructured ones, may also lack reliability (Sommers-Flanagan & Sommers-Flanagan, 2013). People respond differently to different interviewers, providing, for example, less information to a cold interviewer than to a warm and supportive one (Quas et al., 2007). Similarly, a clinician’s race, gender, age, and appearance may influence a client’s responses (Davis et al., 2010; Springman, Wherry, & Notaro, 2006).

Because different clinicians can obtain different answers and draw different conclusions even when they ask the same questions of the same person, some researchers believe that interviewing should be discarded as a tool of clinical assessment. As you’ll see, however, the two other kinds of clinical assessment methods also have serious limitations.

Clinical Tests

clinical test A device for gathering information about a few aspects of a person’s psychological functioning from which broader information about the person can be inferred.

Clinical tests are devices for gathering information about a few aspects of a person’s psychological functioning, from which broader information about the person can be inferred. On the surface, it may look easy to design an effective test. Every month, magazines and Web sites present new tests that supposedly tell us about our personalities, relationships, sex lives, reactions to stress, or ability to succeed. Such tests might sound convincing, but most of them lack reliability, validity, and standardization. That is, they do not yield consistent, accurate information or say where we stand in comparison with others.

More than 500 clinical tests are currently in use throughout the United States. Clinicians use six kinds most often: projective tests, personality inventories, response inventories, psychophysiological tests, neurological and neuropsychological tests, and intelligence tests.

projective test A test consisting of ambiguous material that people interpret or respond to.

Projective Tests Projective tests require that clients interpret vague stimuli, such as inkblots or ambiguous pictures, or follow open-ended instructions such as “Draw a person.” Theoretically, when clues and instructions are so general, people will “project” aspects of their personality into the task (Cherry, 2015; Hogan, 2014). Projective tests are used primarily by psychodynamic clinicians to help assess the unconscious drives and conflicts they believe to be at the root of abnormal functioning (Baer & Blais, 2010). The most widely used projective tests are the Rorschach test, the Thematic Apperception Test, sentence-completion tests, and drawings.

Figure 3.1: An inkblot similar to those used in the Rorschach test. In this test, individuals view and react to a total of 10 inkblot images.

RORSCHACH TEST In 1911 Hermann Rorschach, a Swiss psychiatrist, experimented with the use of inkblots in his clinical work. He made thousands of blots by dropping ink on paper and then folding the paper in half to create a symmetrical but wholly accidental design, such as the one shown in Figure 3.1. Rorschach found that everyone saw images in these blots. In addition, the images a viewer saw seemed to correspond in important ways with his or her psychological condition. People diagnosed with schizophrenia, for example, tended to see images that differed from those described by people experiencing depression.

Despite its limitations, just about everyone has heard of the Rorschach. Why do you think it is so famous and popular?

Rorschach selected 10 inkblots and published them in 1921 with instructions for their use in assessment. This set was called the Rorschach Psychodynamic Inkblot Test. Rorschach died just 8 months later, at the age of 37, but his work was continued by others, and his inkblots took their place among the most widely used projective tests of the twentieth century (see MindTech below).

Clinicians administer the “Rorschach,” as it is commonly called, by presenting one inkblot card at a time and asking respondents what they see, what the inkblot seems to be, or what it reminds them of. In the early years, Rorschach testers paid special attention to the themes and images that the inkblots brought to mind (Butcher, 2010). Testers now also pay attention to the style of the responses: Do the clients view the design as a whole or see specific details? Do they focus on the blots or on the white spaces between them?

Figure 3.2: A picture similar to one used in the Thematic Apperception Test.

THEMATIC APPERCEPTION TEST The Thematic Apperception Test (TAT) is a pictorial projective test (Aronow, Weiss, & Reznikoff, 2011; Morgan & Murray, 1935). People who take the TAT are commonly shown 30 cards with black-and-white pictures of individuals in vague situations and are asked to make up a dramatic story about each card. They must tell what is happening in the picture, what led up to it, what the characters are feeling and thinking, and what the outcome of the situation will be.

Clinicians who use the TAT believe that people always identify with one of the characters on each card. The stories are thought to reflect the individuals’ own circumstances, needs, and emotions. For example, a female client seems to be revealing her own feelings when telling this story about a TAT picture similar to the image shown in Figure 3.2:

This is a woman who has been quite troubled by memories of a mother she was resentful toward. She has feelings of sorrow for the way she treated her mother, her memories of her mother plague her. These feelings seem to be increasing as she grows older and sees her children treating her the same way that she treated her mother.

(Aiken, 1985, p. 372)

SENTENCE-COMPLETION TEST In the sentence-completion test, first developed in the 1920s (Payne, 1928), the test-taker completes a series of unfinished sentences, such as “I wish …” or “My father….” The test is considered a good springboard for discussion and a quick and easy way to pinpoint topics to explore.

MindTech

Psychology’s Wiki Leaks?

In 2009, an emergency room physician posted the images of all 10 Rorschach cards, along with common responses to each card, on Wikipedia, the online encyclopedia. The publisher of the test, Hogrefe Publishing, immediately threatened to take Wikipedia to court, saying that the encyclopedia’s willingness to post the images was “unbelievably reckless” (Cohen, 2009). However, no legal actions took place, and to this day, the 10 cards remain on Wikipedia for the entire world to see.

Many psychologists have criticized the Wikipedia posting, arguing that the Rorschach test responses of patients who have previously seen the test on Wikipedia cannot be trusted. In support of their concerns, one study found that reading the Wikipedia Rorschach test article did indeed help many individuals perform more positively on the test itself (Schultz & Brabender, 2012). These clinical concerns are consistent with the long-standing positions of the British, Canadian, and American Psychological Associations, which hold that nonprofessional publications of psychological test answers are wrong and potentially harmful to patients (CPA, 2009; BPS, 2007; APA, 1996).

Still other critics point out that the online publication of the Rorschach cards jeopardizes the usefulness of thousands of published studies—studies that have tried to link patients’ Rorschach responses to particular psychological disorders (Cohen, 2009). These studies were conducted on first-time inkblot observers, not on people who had already viewed the cards online.

Why do you think this Rorschach debate has led to an increase in the distribution of psychological tests?

On the other hand, more than a few test skeptics seem very pleased by the online posting, hoping that it will lower the public’s regard for the test and lessen its clinical use (Radford, 2009). In fact, one recent study suggests that the Rorschach–Wikipedia debate has already led to unfavorable opinions of the test among many individuals (Schultz & Loving, 2012).

It appears that this debate is actually leading to an increase—rather than a decrease—in the distribution of psychological tests. Several newspapers reporting on the controversy have themselves published photos of the Rorschach cards (Simple, 2009; White, 2009). And as you will read later in this chapter, intelligence tests, among the most widely used of all psychological tests, are now available—on eBay of all places—to anyone who is willing to pay the price.

Drawing test Drawing tests are commonly used to assess the functioning of children. One is the Kinetic Family Drawing test, in which children draw their household members performing some activity (“kinetic” means “active”).

DRAWINGS On the assumption that a drawing tells us something about its creator, clinicians often ask clients to draw human figures and talk about them (McGrath & Carroll, 2012). Evaluations of these drawings are based on the details and shape of the drawing, the solidity of the pencil line, the location of the drawing on the paper, the size of the figures, the features of the figures, the use of background, and comments made by the respondent during the drawing task. In the Draw-a-Person (DAP) test, the most popular of the drawing tests, individuals are first told to draw “a person” and then are instructed to draw a person of the other sex.

WHAT ARE THE MERITS OF PROJECTIVE TESTS? Until the 1950s, projective tests were the most commonly used method for assessing personality. In recent years, however, clinicians and researchers have relied on them largely to gain “supplementary” insights (McGrath & Carroll, 2012). One reason for this shift is that practitioners who follow the newer models have less use for the tests than psychodynamic clinicians do. Even more important, the tests have not consistently shown much reliability or validity (Hogan, 2014).

The art of assessment In the spirit of projective tests, the sometimes bizarre cat portraits of early-twentieth-century artist Louis Wain have been interpreted as reflections of the psychosis with which he struggled for many years.

In reliability studies, different clinicians have tended to score the same person’s projective test quite differently. Similarly, in validity studies, when clinicians try to describe a client’s personality and feelings on the basis of responses to projective tests, their conclusions often fail to match the self-report of the client, the view of the psychotherapist, or the picture gathered from an extensive case history (Cherry, 2015; Bornstein, 2007).

Another validity problem is that projective tests are sometimes biased against minority ethnic groups (Costantino et al., 2007) (see Table 3.1). For example, people are supposed to identify with the characters in the TAT when they make up stories about them, yet no members of minority groups are in the TAT pictures. In response to this problem, some clinicians have developed other TAT-like tests with African American or Hispanic figures (Costantino et al., 2007, 1992).

personality inventory A test, designed to measure broad personality characteristics, consisting of statements about behaviors, beliefs, and feelings that people evaluate as either characteristic or uncharacteristic of them.

Personality Inventories An alternative way to collect information about individuals is to ask them to assess themselves. Respondents to a personality inventory answer a wide range of questions about their behavior, beliefs, and feelings. In the typical personality inventory, individuals indicate whether each of a long list of statements applies to them. Clinicians then use the responses to draw conclusions about the person’s personality and psychological functioning (Hogan, 2014; Watson, 2012).

By far the most widely used personality inventory is the Minnesota Multiphasic Personality Inventory (MMPI) (Butcher, 2011). Two adult versions are available—the original test, published in 1945, and the MMPI-2, a 1989 revision that was itself revised in 2001. There is also a streamlined version of the inventory called the MMPI-2-Restructured Form and a special version of the test for adolescents called the MMPI-A (Williams & Butcher, 2011).

The MMPI consists of more than 500 self-statements, to be labeled “true,” “false,” or “cannot say.” The statements cover issues ranging from physical concerns to mood, sexual behaviors, and social activities. Altogether the statements make up 10 clinical scales, on each of which an individual can score from 0 to 120. When people score above 70 on a scale, their functioning on that scale is considered deviant. When the 10 scale scores are considered side by side, a pattern called a profile takes shape, indicating the person’s general personality. The 10 scales on the MMPI measure the following:

Hypochondriasis Items showing abnormal concern with bodily functions (“I have chest pains several times a week.”)

Depression Items showing extreme pessimism and hopelessness (“I often feel hopeless about the future.”)

Hysteria Items suggesting that the person may use physical or mental symptoms as a way of unconsciously avoiding conflicts and responsibilities (“My heart frequently pounds so hard I can feel it.”)

Psychopathic deviate Items showing a repeated and gross disregard for social customs and an emotional shallowness (“My activities and interests are often criticized by others.”)

Masculinity-femininity Items that are thought to separate male and female respondents (“I like to arrange flowers.”)

Paranoia Items that show abnormal suspiciousness and delusions of grandeur or persecution (“There are evil people trying to influence my mind.”)

Psychasthenia Items that show obsessions, compulsions, abnormal fears, and guilt and indecisiveness (“I save nearly everything I buy, even after I have no use for it.”)

Schizophrenia Items that show bizarre or unusual thoughts or behavior (“Things around me do not seem real.”)

Hypomania Items that show emotional excitement, overactivity, and flight of ideas (“At times I feel very ‘high’ or very ‘low’ for no apparent reason.”)

Social introversion Items that show shyness, little interest in people, and insecurity (“I am easily embarrassed.”)
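
The way a profile is read can be sketched in a few lines. The example below takes only the 0-to-120 score range and the cutoff of 70 from the description above. The client scores are invented, and the function name `elevated_scales` is hypothetical. It is not the MMPI's actual scoring procedure, which is considerably more involved.

```python
# The 10 clinical scales, in the order listed above.
SCALES = [
    "Hypochondriasis", "Depression", "Hysteria", "Psychopathic deviate",
    "Masculinity-femininity", "Paranoia", "Psychasthenia", "Schizophrenia",
    "Hypomania", "Social introversion",
]
CUTOFF = 70  # scores above this are considered deviant, per the text

def elevated_scales(profile):
    """Return the scales on which a score exceeds the cutoff."""
    for name in SCALES:
        if not 0 <= profile[name] <= 120:
            raise ValueError(f"{name} score is outside the 0-120 range")
    return [name for name in SCALES if profile[name] > CUTOFF]

# Hypothetical profile: every scale at 55 except two elevated ones.
client_profile = {name: 55 for name in SCALES}
client_profile["Depression"] = 82
client_profile["Psychasthenia"] = 74

print(elevated_scales(client_profile))
```

Looking at which scales rise above the cutoff together, rather than at any single score, is what gives the profile its interpretive value.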

The MMPI and other personality inventories have several advantages over projective tests (Cherry, 2015; Hogan, 2014). Because they are computerized or paper-and-pencil tests, they do not take much time to administer, and they are objectively scored. Most of them are standardized, so one person’s scores can be compared with those of many others. Moreover, they often display greater test–retest reliability than projective tests. For example, people who take the MMPI a second time after a period of less than 2 weeks receive approximately the same scores (Graham, 2014, 2006).

Personality inventories also appear to have more validity, or accuracy, than projective tests (Cherry, 2015; Butcher, 2011, 2010). However, they can hardly be considered highly valid. When clinicians have used these tests alone, they have not regularly been able to judge a respondent’s personality accurately (Braxton et al., 2007). One problem is that the personality traits that the tests seek to measure cannot be examined directly. How can we fully know a person’s character, emotions, and needs from self-reports alone?

Another problem is that despite the use of more diverse standardization groups by the MMPI-2 designers, this and other personality tests continue to have certain cultural limitations. Responses that indicate a psychological disorder in one culture may be normal responses in another (Butcher, 2010; Dana, 2005, 2000). In Puerto Rico, for example, where it is common to practice spiritualism, it would be normal to answer “true” to the MMPI item “Evil spirits possess me at times.” In other populations, that response could indicate psychopathology (Rogler et al., 1989).

Despite such limits in validity, personality inventories continue to be popular. Research indicates that they can help clinicians learn about people’s personal styles and disorders as long as they are used in combination with interviews or other assessment tools.

response inventories Tests that measure a person’s responses in one specific area of functioning, such as affect, social skills, or cognitive processes.

Response Inventories Like personality inventories, response inventories ask people to provide detailed information about themselves, but these tests focus on one specific area of functioning (Wang & Gorenstein, 2013; Vaz et al., 2013; Watson, 2012). For example, one such test may measure affect (emotion), another social skills, and still another cognitive processes. Clinicians can use the inventories to determine the role such factors play in a person’s disorder.

Affective inventories measure the severity of such emotions as anxiety, depression, and anger. In one of the most widely used affective inventories, the Beck Depression Inventory, people rate their level of sadness and its effect on their functioning. For social skills inventories, used particularly by behavioral and family-social clinicians, respondents indicate how they would react in a variety of social situations. Cognitive inventories reveal a person’s typical thoughts and assumptions and can help uncover counterproductive patterns of thinking. They are, not surprisingly, often used by cognitive therapists and researchers.

image
Blink of the eye Before entering combat duty, this Marine takes an eyeblink test—a psychophysiological test in which sensors are attached to the eyelid and other parts of the face. The test tries to detect physical indicators of tension and anxiety and to predict which Marines might be particularly susceptible to posttraumatic stress disorder.

Both the number of response inventories and the number of clinicians who use them have increased steadily in the past 30 years. At the same time, however, these inventories have major limitations. With the notable exceptions of the Beck Depression Inventory and a few others, many of the tests have not been subjected to careful standardization, reliability, and validity procedures (Blais & Baer, 2010). Often they are created as a need arises, without being tested for accuracy and consistency.

psychophysiological test A test that measures physical responses (such as heart rate and muscle tension) as possible indicators of psychological problems.

Psychophysiological Tests Clinicians may also use psychophysiological tests, which measure physiological responses as possible indicators of psychological problems (Daly et al., 2014). This practice began three decades ago, after several studies suggested that states of anxiety are regularly accompanied by physiological changes, particularly increases in heart rate, body temperature, blood pressure, skin reactions (galvanic skin response), and muscle contractions. The measuring of physiological changes has since played a key role in the assessment of certain psychological disorders.

Why might an innocent person “fail” a lie detector test? How might a guilty person manage to “pass” the test?

One psychophysiological test is the polygraph, popularly known as a lie detector (Bhutta et al., 2015; Rosky, 2013). Electrodes attached to various parts of a person’s body detect changes in breathing, perspiration, and heart rate while the person answers questions. The clinician observes these functions while the person answers “yes” to control questions—questions whose answers are known to be yes, such as “Are both your parents alive?” Then the clinician observes the same physiological functions while the person answers test questions, such as “Did you commit this robbery?” If breathing, perspiration, and heart rate suddenly increase, the person is suspected of lying.

Like other kinds of clinical tests, psychophysiological tests have their drawbacks (Rusconi & Mitchener-Nissen, 2013). Many require expensive equipment that must be carefully tuned and maintained. In addition, psychophysiological measurements can be inaccurate and unreliable (see PsychWatch below). The laboratory equipment itself—elaborate and sometimes frightening—may arouse a participant’s nervous system and thus change his or her physical responses. Physiological responses may also change when they are measured repeatedly in a single session. Galvanic skin responses, for example, often decrease during repeated testing.

PsychWatch

The Truth, the Whole Truth, and Nothing but the Truth

In movies, criminals being grilled by the police reveal their guilt by sweating, shaking, cursing, or twitching. When they are hooked up to a polygraph (a lie detector), the needles bounce all over the paper. This image has been with us since World War I, when some clinicians developed the theory that people who are telling lies display systematic changes in their breathing, perspiration, and heart rate (Marston, 1917).

image
All the rage A student learns to administer polygraph exams at the Latin American Polygraph Institute in Bogota, Colombia. Despite evidence that these tests are often invalid, they are widely used by businesses in Colombia, where deception by employees has become a major problem.

The danger of relying on polygraph tests is that, according to researchers, they do not work as well as we would like (Rosky, 2015, 2013; Rusconi & Mitchener-Nissen, 2013; Meijer & Verschuere, 2010). The public did not pay much attention to this inconvenient fact until the mid-1980s, when the American Psychological Association officially reported that polygraphs were often inaccurate and the U.S. Congress voted to restrict their use in criminal prosecution and employment screening (Krapohl, 2002). Research indicates that 8 out of 100 truths, on average, are called lies in polygraph testing (Grubin, 2010; Raskin & Honts, 2002; MacLaren, 2001). Imagine, then, how many innocent people might be convicted of crimes if polygraph findings were taken as valid evidence in criminal trials.
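The arithmetic behind that warning can be made concrete. The following is a minimal sketch in Python that assumes only the figure cited above (on average, 8 of every 100 truths are called lies). The function name, the group size of 1,000, and the simplification that each truthful person gives a single answer are hypothetical, chosen for illustration rather than drawn from the research.

```python
def expected_false_accusations(truthful_people: int, lies_called_per_100: int = 8) -> float:
    """Expected number of truthful people whose answer is labeled a lie.

    Assumption (hypothetical simplification): each person gives one
    truthful answer, and the cited average error rate applies to it.
    """
    if truthful_people < 0:
        raise ValueError("truthful_people must be non-negative")
    if not 0 <= lies_called_per_100 <= 100:
        raise ValueError("lies_called_per_100 must be between 0 and 100")
    return truthful_people * lies_called_per_100 / 100


if __name__ == "__main__":
    # Hypothetical group of 1,000 truthful people tested once each.
    print(expected_false_accusations(1000))
```

Under these assumptions, a polygraph given to 1,000 truthful people would be expected to label about 80 of them as liars.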

Given such findings, polygraphs are less trusted and less popular today than they once were. For example, few courts now admit results from such tests as evidence of criminal guilt (Grubin, 2010; Daniels, 2002). Polygraph testing has by no means disappeared, however. The FBI uses it extensively, parole boards and probation offices routinely use it to help decide whether to release convicted offenders, and in public-sector hiring (such as for police officers), the use of polygraph screening may actually be on the increase (Meijer & Verschuere, 2010; Kokish et al., 2005).

image
Traditional scanning The most widely used neuroimaging techniques in clinical practice—the MRI (bottom), CAT, and PET—take pictures of the living brain. Here, an MRI scan (above left) reveals a large tumor, colored in orange; a CAT scan (above center) reveals a mass of blood within the brain; and a PET scan (above right) shows which areas of the brain are active (those colored in red, orange, and yellow) when an individual is being stimulated.

Neurological and Neuropsychological Tests Some problems in personality or behavior are caused primarily by damage to the brain or by changes in brain activity. Head injuries, brain tumors, brain malfunctions, alcoholism, infections, and other disorders can all cause such impairment. If a psychological dysfunction is to be treated effectively, it is important to know whether its primary cause is a physical abnormality in the brain.

neurological test A test that directly measures brain structure or activity.

image
The EEG Electrodes pasted to the scalp help measure the brain waves of this baby boy.

A number of techniques may help pinpoint brain abnormalities. Some procedures, such as brain surgery, biopsy, and X ray, have been used for many years. More recently, scientists have developed a number of neurological tests, which are designed to measure brain structure and activity directly. One neurological test is the electroencephalogram (EEG), which records brain waves, the electrical activity that takes place within the brain as a result of neurons firing. In an EEG, electrodes placed on the scalp send brain-wave impulses to a machine that records them.

neuroimaging techniques Neurological tests that provide images of brain structure or activity, such as CT scans, PET scans, and MRIs. Also called brain scans.

Other neurological tests actually take “pictures” of brain structure or brain activity. These tests, called neuroimaging techniques, or brain scanning, include computerized axial tomography (CAT scan or CT scan), in which X rays of the brain’s structure are taken at different angles and combined; positron emission tomography (PET scan), a computer-produced motion picture of chemical activity throughout the brain; and magnetic resonance imaging (MRI), a procedure that uses the magnetic property of certain hydrogen atoms in the brain to create a detailed picture of the brain’s structure.

One version of the MRI, functional magnetic resonance imaging (fMRI), converts MRI pictures of brain structures into detailed pictures of neuron activity, thus offering a picture of the functioning brain. Partly because fMRI-produced images of brain functioning are so much clearer than PET scan images, the fMRI has produced enormous enthusiasm among brain researchers since it was first developed in 1990.

neuropsychological test A test that detects brain impairment by measuring a person’s cognitive, perceptual, and motor performances.

Though widely used, these techniques are sometimes unable to detect subtle brain abnormalities. Clinicians have therefore developed less direct but sometimes more revealing neuropsychological tests that measure cognitive, perceptual, and motor performances on certain tasks; clinicians interpret abnormal performances as an indicator of underlying brain problems (Hogan, 2014). Brain damage is especially likely to affect visual perception, memory, and visual-motor coordination, so neuropsychological tests focus particularly on these areas. The famous Bender Visual-Motor Gestalt Test, for example, consists of nine cards, each displaying a simple geometrical design. Patients look at the designs one at a time and copy each one on a piece of paper. Later they try to redraw the designs from memory. Notable errors in accuracy by individuals older than 12 are thought to reflect organic brain impairment. Clinicians often use a battery, or series, of neuropsychological tests, each targeting a specific skill area (Flanagan et al., 2013; Reitan & Wolfson, 2005, 1996).

intelligence test A test designed to measure a person’s intellectual ability.

intelligence quotient (IQ) An overall score derived from intelligence tests.

Intelligence Tests An early definition of intelligence described it as “the capacity to judge well, to reason well, and to comprehend well” (Binet & Simon, 1916, p. 192). Because intelligence is an inferred quality rather than a specific physical process, it can be measured only indirectly. In 1905, French psychologist Alfred Binet and his associate Théodore Simon produced an intelligence test consisting of a series of tasks requiring people to use various verbal and nonverbal skills. The general score derived from this and later intelligence tests is termed an intelligence quotient (IQ). There are now more than 100 intelligence tests available. As you will see in Chapter 14, intelligence tests play a key role in the diagnosis of intellectual disability (mental retardation), and they can also help clinicians identify other problems (Hogan, 2014; Mishak, 2014).

How might IQ scores be misused by school officials, parents, or other individuals? Why is society preoccupied with these scores?

Intelligence tests are among the most carefully produced of all clinical tests (Bowden et al., 2011). Because they have been standardized on large groups of people, clinicians have a good idea how each individual’s score compares with the performance of the population at large. These tests have also shown very high reliability: people who repeat the same IQ test years later receive approximately the same score. Finally, the major IQ tests appear to have fairly high validity: children’s IQ scores often correlate with their performance in school, for example.
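One common way to put a number on this kind of consistency is to correlate the scores people receive on two administrations of the same test. The sketch below, in Python, computes a Pearson correlation coefficient by hand. The scores are invented for illustration, and the function and variable names are hypothetical; nothing here reproduces the procedure of any particular IQ test.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation between two equal-length lists of scores."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two lists of equal length with at least 2 scores")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined when one list has no variation")
    return cov / sqrt(var_x * var_y)


# Hypothetical IQ scores for six people tested twice, years apart.
first_testing = [95, 102, 110, 88, 120, 105]
retest = [97, 100, 112, 90, 118, 104]

if __name__ == "__main__":
    print(round(pearson_r(first_testing, retest), 2))
```

A coefficient close to 1 would indicate high test-retest reliability. The same calculation, applied to IQ scores and a measure of school performance, would give a rough index of the kind of validity described above.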

Nevertheless, intelligence tests have some key shortcomings. Factors that have nothing to do with intelligence, such as low motivation or high anxiety, can greatly influence test performance (Chaudhry & Ready, 2012) (see MediaSpeak below). In addition, IQ tests may contain cultural biases in their language or tasks that place people of one background at an advantage over those of another (Goldfinger & Pomerantz, 2014). Similarly, members of some minority groups may have little experience with this kind of test, or they may be uncomfortable with test examiners of a majority ethnic background. Either way, their performances may suffer.

Clinical Observations

In addition to interviewing and testing people, clinicians may systematically observe their behavior. In one technique, called naturalistic observation, clinicians observe clients in their everyday environments. In another, analog observation, they observe them in an artificial setting, such as a clinical office or laboratory. Finally, in self-monitoring, clients are instructed to observe themselves.

MediaSpeak

Intelligence Tests Too? eBay and the Public Good

Michelle Roberts, Associated Press

image
The Wechsler Adult Intelligence Scale-Revised (WAIS-R) This widely used intelligence test has 11 subtests, which cover such areas as factual information, memory, vocabulary, arithmetic, design, and eye–hand coordination.

Intelligence tests … are for sale on eBay Inc.’s online auction site, and the test maker is worried they will be misused.

The series of Wechsler intelligence tests, made by San Antonio-based Harcourt Assessment, Inc., are supposed to be sold to and administered by only clinical psychologists and trained professionals.

Given more than a million times a year nationwide, according to Harcourt, the intelligence tests often are among numerous tests ordered by prosecutors and defense attorneys to determine the mental competence of criminal defendants. A low IQ, for example, can be used to argue leniency in sentencing.

Schools use the tests to determine whether to place a student in a special program, whether for gifted or struggling students. Harcourt officials say they fear the tests for sale on eBay will be misused for coaching by lawyers or parents.

When free enterprise principles conflict with psychological well-being, how should the matter be resolved?

But eBay has denied their request to restrict the sale of the tests. eBay officials say there is nothing illegal about selling the tests, and it cannot monitor every possible misuse of items sold through its network of 248 million buyers and sellers. [The tests continue to be available on eBay as of 2015.] Five of the tests were listed for sale … for about $175 to $900. The latest edition of the adult test, which retails for $939, was offered on eBay for $249.99.

“In order for it to maintain its integrity, there needs to be limited availability,” said [a] Harcourt spokesman…. “Misinterpreting the results [of questions and tasks on the tests], even without malicious intent, could lead to mistakes in assessing a child’s intelligence….”

IQ Tests for Sale on eBay by Michelle Roberts, The Associated Press, 12/18/2007. Used with permission of The Associated Press Copyright © 2014. All rights reserved.

Naturalistic and Analog Observations Naturalistic clinical observations usually take place in homes, schools, institutions such as hospitals and prisons, or community settings. Most of them focus on parent–child, sibling–sibling, or teacher–child interactions and on fearful, aggressive, or disruptive behavior (Hughes et al., 2014; Lindhiem et al., 2011). Often such observations are made by participant observers—key people in the client’s environment—and reported to the clinician.

When naturalistic observations are not practical, clinicians may resort to analog observations, often aided by special equipment such as a video camera or one-way mirror. Analog observations often have focused on children interacting with their parents, married couples attempting to settle a disagreement, speech-anxious people giving a speech, and fearful people approaching an object they find frightening.

Although much can be learned from actually witnessing behavior, clinical observations have certain disadvantages. For one thing, they are not always reliable. It is possible for various clinicians who observe the same person to focus on different aspects of behavior, assess the person differently, and arrive at different conclusions (Meersand, 2011). Careful training of observers and the use of observer checklists can help reduce this problem.
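Agreement between observers can be checked numerically once each observer has coded the same series of observation intervals with a shared checklist. The sketch below, in Python, computes simple percent agreement and Cohen's kappa, a standard statistic that corrects agreement for chance. Kappa is not named in this chapter; it is brought in here only as one plausible way to quantify the problem. The behavior categories and codings are invented, and all names are hypothetical.

```python
from collections import Counter


def percent_agreement(codes_a, codes_b):
    """Proportion of intervals on which two observers gave the same code."""
    if len(codes_a) != len(codes_b) or not codes_a:
        raise ValueError("need two non-empty lists of equal length")
    matches = sum(a == b for a, b in zip(codes_a, codes_b))
    return matches / len(codes_a)


def cohens_kappa(codes_a, codes_b):
    """Chance-corrected agreement: (p_o - p_e) / (1 - p_e)."""
    n = len(codes_a)
    p_o = percent_agreement(codes_a, codes_b)
    counts_a = Counter(codes_a)
    counts_b = Counter(codes_b)
    # Agreement expected by chance, from each observer's category frequencies.
    p_e = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)


# Hypothetical codings of the same eight intervals by two observers.
observer_1 = ["aggressive", "calm", "calm", "aggressive",
              "calm", "calm", "disruptive", "calm"]
observer_2 = ["aggressive", "calm", "disruptive", "aggressive",
              "calm", "calm", "disruptive", "aggressive"]

if __name__ == "__main__":
    print(percent_agreement(observer_1, observer_2))
    print(round(cohens_kappa(observer_1, observer_2), 2))
```

In this invented example the observers agree on 6 of 8 intervals, but kappa is noticeably lower than the raw agreement because some matches would be expected by chance alone. Low values on either measure would suggest that the observers need further training or a clearer checklist.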

Similarly, observers may make errors that affect the validity, or accuracy, of their observations (Wilson et al., 2010). The observer may suffer from overload and be unable to see or record all of the important behaviors and events. Or the observer may experience observer drift, a steady decline in accuracy as a result of fatigue or of a gradual unintentional change in the standards used when an observation continues for a long period of time. Another possible problem is observer bias—the observer’s judgments may be influenced by information and expectations he or she already has about the person (Hróbjartsson et al., 2014).

image
An ideal observation Using a one-way mirror, a clinical observer is able to view a mother interacting with her child without distracting the duo or influencing their behaviors.

A client’s reactivity may also limit the validity of clinical observations; that is, his or her behavior may be affected by the very presence of the observer (Antal et al., 2015). If schoolchildren are aware that someone special is watching them, for example, they may change their usual classroom behavior, perhaps in the hope of creating a good impression (Lane et al., 2011).

Finally, clinical observations may lack cross-situational validity. A child who behaves aggressively in school is not necessarily aggressive at home or with friends after school. Because behavior is often specific to particular situations, observations in one setting cannot always be applied to other settings (Kagan, 2007).

Self-Monitoring As you saw earlier, personality and response inventories are tests in which individuals report their own behaviors, feelings, or cognitions. In a related assessment procedure, self-monitoring, people observe themselves and carefully record the frequency of certain behaviors, feelings, or thoughts as they occur over time (Newcomb & Mustanski, 2014; Huh et al., 2013). How frequently, for instance, does a drug user have an urge for drugs or a headache sufferer have a headache? Self-monitoring is especially useful in assessing behavior that occurs so infrequently that it is unlikely to be seen during other kinds of observations. It is also useful for behaviors that occur so frequently that any other method of observing them in detail would be impossible—for example, smoking, drinking, or other drug use. Finally, self-monitoring may be the only way to observe and measure private thoughts or perceptions.

Like all other clinical assessment procedures, however, self-monitoring has drawbacks (Huh et al., 2013). Here too validity is often a problem. People do not always manage or try to record their observations accurately. Furthermore, when people monitor themselves, they may change their behaviors unintentionally. Smokers, for example, often smoke fewer cigarettes than usual when they are monitoring themselves, and teachers who monitor themselves give more positive and fewer negative comments to their students.

Summing Up

CLINICAL ASSESSMENT Clinical practitioners are interested primarily in gathering individual information about each client. They seek an understanding of the specific nature and origins of a client’s problems through clinical assessment.

BETWEEN THE LINES

In Their Words

“You can observe a lot just by watching.”

Yogi Berra

To be useful, assessment tools must be standardized, reliable, and valid. Most clinical assessment methods fall into three general categories: clinical interviews, tests, and observations. A clinical interview may be either unstructured or structured. Types of clinical tests include projective, personality, response, psychophysiological, neurological, neuropsychological, and intelligence tests. Types of observation include naturalistic observation, analog observation, and self-monitoring.