Problems with the MMPI-2-RF
Problems with using the MMPI–2–RF in forensic evaluations: A clarification to Ellis.
Butcher, James N., Williams, Carolyn L.
Journal of Child Custody: Research, Issues, and Practices, Vol 9(4), Oct, 2012. pp. 217-222.
Abstract:
Toward the end of her article addressing the prevalence of MMPI-2 Scale 4 elevations among child custody litigants, Ellis recommended use of the MMPI-2-RF scales in child custody evaluations. Unfortunately, this recommendation was not supported by a comprehensive review of the literature similar to the one she used to address her primary questions about MMPI-2 Scale 4. This article describes many of the criticisms of the MMPI–2–RF, and its various scales, including the RC Scales and the Fake Bad Scale (recently renamed the Symptom Validity Scale). The MMPI-2-RF and its scales are novel measures whose use may be challenged in forensic settings. Psychologists considering including them as part of a test battery must evaluate their psychometric properties by carefully examining the test manuals, empirical studies, and recent textbooks. Many experts do not consider the MMPI-2-RF to be a viable replacement for the MMPI-2 in critical evaluations such as child custody.
Text from the article:
The MMPI–2–RF has not gained acceptance as a replacement for the MMPI–2. Several recent textbooks challenge the notion that the MMPI–2–RF is the instrument of choice for forensic evaluations (Butcher, 2011; Graham, 2012; Greene, 2011; Nichols, 2011). Consider, for example, the opinion of one of the coauthors of the RC scales, Graham:
In settings where the goal of assessment is a comprehensive understanding of test takers, this author would choose the MMPI–2 because it is his opinion that interpretations based on the MMPI–2 can yield a more in-depth analysis of personality and psychopathology. In forensic settings, the acceptance of a relatively new assessment instrument is predicated on accumulated research supporting the use of the instrument and the extent to which the instrument is accepted by the professional community as a whole. Consequently, the professional who uses the MMPI–2–RF in forensic settings should be prepared to address challenges based on the instrument’s novelty. (pp. 414–415)
Greene (2011) listed three disadvantages of using the MMPI–2–RF instead of the MMPI–2:
First, the absence of the MMPI–2 clinical scales from the MMPI–2–RF makes it impossible to utilize code type interpretation that has been the core of the MMPI/MMPI–2 interpretation for over 50 years. Second, none of the MMPI–2 content and supplementary scales can be scored on the MMPI–2–RF, and so all of this research and clinical usage also is lost. Third, the “MMPI–2” in the MMPI–2–RF is a misnomer because the only relationship to the MMPI–2 is its use of a subset of the MMPI–2 item pool, its normative group, and similar validity scales. The MMPI–2–RF should not be conceptualized as a revised or restructured form of the MMPI–2, but as a new self-report inventory that chose to select its items from the MMPI–2 item pool and to use its normative group. (p. 22)
Nichols (2011) raised many of the same concerns about the MMPI–2–RF as the other MMPI–2 textbook authors (i.e., Butcher, 2011; Graham, 2012; Greene, 2011). He described questions about the adequacy of the theory and methodology used to develop the RC scales. He raised questions about the psychometric characteristics of many of the new MMPI–2–RF scales (e.g., low internal consistency, confusing or misleading empirical correlates that raise concerns about construct validity).
Possible gender bias resulting from the decision to use nongendered norms instead of separate norms for men and women is another problem he described with selecting the MMPI–2–RF instead of the MMPI–2. The research literature shows gender differences in some mental health symptoms as well as in personality traits or characteristics (Mason, Bubany, & Butcher, 2012). Because of gender differences in personality (Cattell, 1946; Hathaway & McKinley, 1940), most personality scales use gender-based norms to provide gender-sensitive measurement (e.g., the MMPI–2, the 16 Personality Factor Questionnaire [16PF], Millon Clinical Multiaxial Inventory–III [MCMI–III], the Neuroticism-Extroversion-Openness Inventory [NEO–PI]).
Child custody evaluators must have an understanding of the underlying research literature about any new psychological test before adopting it in their clinical practices. The MMPI–2–RF has little relationship with its namesake. Forty percent of the MMPI–2 items were dropped from the MMPI–2 to form the MMPI–2–RF booklet. Many of these dropped items address personality problems and mental health symptoms that are important in forensic evaluations such as child custody: 21 items related to antisocial attitudes, 21 items dealing with work functioning, 15 items assessing family problems, and 11 items dealing with negative life events (Butcher, 2011). Thus, the test coverage in the MMPI–2 is not found in the MMPI–2–RF. As Gass (2009) pointed out:
The elimination of over 200 MMPI–2 items that are “working items” has additional implications for information loss and its potentially adverse impact on clinical use of the MMPI–2 … It is clear, however, that if clinicians abandon the original Clinical Scales and body of code-type information, they will sacrifice the most impressive body of empirically based interpretive material ever amassed in the history of personality assessment. (p. 442)
Two thirds of the 50 scales on the MMPI–2–RF are new measures introduced for the first time by the test developers and publisher (Ben-Porath & Tellegen, 2008; Tellegen & Ben-Porath, 2008). The majority of the scales incorporated in the MMPI–2–RF are insufficiently validated to provide the practitioner with confidence in assessment. The well-established MMPI–2 clinical scales were replaced with the controversial RC scales on the MMPI–2–RF. The MMPI–2 and MMPI–2–RF are not psychometrically equivalent – indeed, an individual’s item responses to the two measures can result in very different clinical pictures.
Rouse, Greene, Butcher, Nichols, and Williams (2008) conducted an extensive study of 25 research samples from a diverse range of settings that included MMPI–2 responses from 78,159 individuals (among the eight forensic samples were three child custody samples). This study showed that each RC scale was substantially correlated with an existing MMPI–2 supplementary, content, or PSY–5 scale. Furthermore, the RC scales were generally less reliable than their comparable extant MMPI–2 scales. RC 4 was more closely associated with the Addiction Admission Scale (AAS) than with any of the other antisocial measures on the MMPI–2. In the child custody samples, the correlation of RC 4 with AAS was .79 to .80. In contrast, Tellegen et al. (2003) reported correlations between RC 4 and its parent Scale 4 from .62 to .66.
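One way to read the correlations just reported is to square them, which gives the proportion of variance two scales share. The short Python sketch below does only that arithmetic; it is an illustration added here, not an analysis from Rouse et al. (2008), and the function and variable names are hypothetical.

```python
def shared_variance(r: float) -> float:
    """Proportion of variance shared by two measures correlated at r."""
    return r * r

# Correlation ranges quoted in the text (low end, high end).
reported = {
    "RC 4 with AAS (child custody samples)": (0.79, 0.80),
    "RC 4 with Scale 4 (Tellegen et al., 2003)": (0.62, 0.66),
}

for label, (low, high) in reported.items():
    print(f"{label}: r = {low:.2f} to {high:.2f}, "
          f"shared variance = {shared_variance(low):.2f} to {shared_variance(high):.2f}")
```

On these figures, RC 4 shares roughly 62% to 64% of its variance with AAS but only about 38% to 44% with Scale 4.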
The RC scales of the MMPI–2–RF have been shown to produce markedly different profiles from the MMPI–2 clinical scales, with the RC scales underpredicting psychopathology. Kauffman (2011) found that mean T scores for the RC scales demonstrate lower elevations than have been shown by the MMPI–2 clinical scales in samples of child custody litigants. Saborío and Hass (2012), with a large sample of women who had been sexually assaulted, found that the RC scales did not detect psychopathology as the MMPI–2 clinical scales did. Khouri (2011) found the RC scales did not detect depression among Latino clients. Pizitz and McCullaugh (2011) found in a study of convicted stalkers that the RC scales were insensitive to psychopathology, failing to alert evaluators to problems.
The MMPI–2–RF relies upon the controversial Fake Bad Scale (subsequently renamed the Symptom Validity Scale) as a key measure in self-report assessment. Thus, the suggestion of a “malingering” response style is more likely than if the practitioner relies on the traditional MMPI–2 infrequency measures (see discussions by Butcher, Arbisi, Atlis, & McNulty, 2008; Butcher, Gass, Cumella, Kally, & Williams, 2008; Gass, Williams, Cumella, Butcher & Kally, 2010). Forensic psychologists need to be aware that the FBS has been excluded from testimony in Frye hearings for several court cases because of its potential for bias by labeling people with genuine problems, women especially, as “malingering” or “faking” (for examples, see Davidson v. Strawberry Petroleum et al., 2007; Stith v. State Farm Mutual Insurance, 2008; Vandergracht v. Progressive Express et al., 2007; Williams v. CSX Transportation, Inc., 2007). (emphasis added)
A number of the new scales on the MMPI–2–RF, as acknowledged by Tellegen and Ben-Porath (2008), show very low reliability coefficients for personality measures perhaps, in part, because of their scale length (e.g., 4 to 6 items). For example, the reliability coefficient for the Helplessness or HLP scale (5 items) was only .39 for men and .50 for women in the normative sample; the Behavior-Restricting Fears or BRF scale (9 items) had reliability coefficients of only .44 for men and .49 for women; and the Suicidal/Death Ideation or SUI scale (5 items) had reliability coefficients of only .41 for men and .34 for women (Tellegen & Ben-Porath, 2008).
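The connection between scale length and reliability can be illustrated with the Spearman-Brown prophecy formula, a standard psychometric result that is not part of the original article. The Python sketch below assumes that any added items would be parallel to the existing ones, which is an idealization; the function name and the 20-item target length are hypothetical choices made only for illustration.

```python
def spearman_brown(reliability: float, length_factor: float) -> float:
    """Predicted reliability after changing scale length by length_factor,
    assuming the added items are parallel to the existing ones."""
    return (length_factor * reliability) / (1 + (length_factor - 1) * reliability)

# Reliability coefficients reported in the text for the 5-item HLP scale.
hlp_reliability = {"men": 0.39, "women": 0.50}

for group, r in hlp_reliability.items():
    # Hypothetical lengthening from 5 to 20 items (a factor of 4).
    projected = spearman_brown(r, 20 / 5)
    print(f"HLP ({group}): observed {r:.2f}, projected at 20 items {projected:.2f}")
```

Under that assumption, the HLP coefficients of .39 and .50 would rise to about .72 and .80 if the scale were four times as long, which is consistent with the suggestion that brevity contributes to the low values.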
Ellis (2012) described four subscales for RC 4: JCP (Juvenile Conduct Problems), SUB (Substance Abuse), AGG (Aggression), and ACT (Activation). She suggested that AGG and ACT may be good predictors of violent behavior. An MMPI–2 subscale is a content-based subset of items from a scale. A subscale assists in the interpretation of its parent scale; it is not a stand-alone measure. Neither the RC monograph nor the MMPI–2–RF manual lists any subscales for the RC scales, including RC 4. The scales highlighted in Ellis’s article are among the new scales introduced for the first time by Ben-Porath and Tellegen (2008) and Tellegen and Ben-Porath (2008). These new measures have relatively modest alpha coefficients: JCP (M = .65, F = .56), SUB (M = .62, F = .62), AGG (M = .66, F = .58), and ACT (M = .60, F = .60), and thus are relatively low in reliability for forensic decisions. We would not recommend making any predictions about an individual’s propensity to violence based on the available psychometric information about AGG and ACT.
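To make concrete what alpha coefficients of this size imply for an individual score, the classical standard error of measurement, SEM = SD × sqrt(1 − reliability), can be computed for each scale. The Python sketch below is an added illustration rather than an analysis from the article: it assumes that alpha may stand in for the reliability term and that scores are expressed as T scores with a standard deviation of 10. The function and variable names are hypothetical.

```python
import math

T_SCORE_SD = 10.0  # T scores are scaled to a standard deviation of 10

def standard_error_of_measurement(reliability: float, sd: float = T_SCORE_SD) -> float:
    """Classical test theory SEM: sd * sqrt(1 - reliability)."""
    return sd * math.sqrt(1.0 - reliability)

# Alpha coefficients quoted in the text (M = men, F = women).
alphas = {
    "JCP": {"M": 0.65, "F": 0.56},
    "SUB": {"M": 0.62, "F": 0.62},
    "AGG": {"M": 0.66, "F": 0.58},
    "ACT": {"M": 0.60, "F": 0.60},
}

for scale, by_gender in alphas.items():
    for gender, alpha in by_gender.items():
        sem = standard_error_of_measurement(alpha)
        half_width = 1.96 * sem  # approximate 95% band around an observed score
        print(f"{scale} ({gender}): alpha = {alpha:.2f}, "
              f"SEM = {sem:.1f} T points, 95% band = +/- {half_width:.1f}")
```

Under these assumptions, every one of these scales has an SEM of roughly 6 to 7 T-score points, so an approximate 95% band around an observed score spans more than 11 points in each direction.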
We hope that we have provided enough cautionary information for the wise child custody evaluator to consider. Decisions about which instrument to include in their evaluations have to be based upon a careful examination of the psychometric properties of the scales. At a minimum, psychologists must carefully examine the written test manuals and peer-reviewed literature with a critical mind and not just rely on promises of advancements. Ellis (2012) provided a critical review of the MMPI–2 literature on Scale 4, but fell short with her uncritical acclaim for the MMPI–2–RF. Perhaps our most important take-away message is that the extensive body of research Ellis reviewed to answer her questions about the use of Scale 4 in child custody evaluations is not pertinent to users of the MMPI–2–RF, and there is no equivalent research-based evidence to recommend this new test as a substitute for the MMPI–2.