Two Fallacies Invalidate the DSM-5 Field Trials

The designer of the DSM-5 Field Trials has just written a telling commentary in the American Journal of Psychiatry (AJP). She makes what I consider to be 2 basic errors that reveal the fundamental worthlessness of these Field Trials and their inability to provide any information that will be useful for DSM-5 decision making.

1. The commentary states: “A realistic goal is a kappa between 0.4 and 0.6, while a kappa between 0.2 and 0.4 would be acceptable.” This is incorrect and flies in the face of all traditional standards of what is considered "acceptable" diagnostic agreement among clinicians. Clearly, the commentary is attempting to greatly lower our expectations about the levels of reliability that were achieved in the field trials-- to soften us up to the likely bad news that the DSM-5 proposals are unreliable. Unable to clear the historic bar of reasonable reliability, it appears that DSM-5 is choosing to drastically lower that bar-- what was previously seen as clearly unacceptable is now being accepted.

Kappa is a statistic that measures agreement among raters, corrected for chance agreement. Historically, kappas above 0.8 have been considered good, above 0.6 fair, and under 0.6 poor. Before this AJP commentary, no one had ever felt comfortable endorsing kappas as low as 0.2-0.4. As a comparison, the personality section in DSM-III was widely derided when its kappas were around 0.5. A kappa between 0.2 and 0.4 comes dangerously close to no agreement. "Accepting" such low levels is a blatant fudge factor. Lowering standards in this drastic way cheapens the currency of diagnosis and defeats the whole purpose of providing diagnostic criteria.
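To make "corrected for chance agreement" concrete, here is a minimal sketch of one common version of the statistic, Cohen's kappa for two raters. The choice of Cohen's kappa is an assumption (the field trials may use a different variant), and the 100 patient ratings are hypothetical numbers invented purely for illustration.

```python
from collections import Counter


def cohens_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who each label the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must label the same non-empty set of cases")
    n = len(rater_a)

    # Observed agreement: fraction of cases where the two labels match.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n

    # Chance agreement: probability the raters would match if each
    # assigned labels independently at their own base rates.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[label] * counts_b[label] for label in counts_a) / (n * n)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 100%")
    return (observed - expected) / (1.0 - expected)


# Hypothetical data: 100 patients, each rated by two clinicians.
#   both say "yes": 20, both say "no": 50, disagreements: 15 + 15
clinician_1 = ["yes"] * 20 + ["no"] * 50 + ["yes"] * 15 + ["no"] * 15
clinician_2 = ["yes"] * 20 + ["no"] * 50 + ["no"] * 15 + ["yes"] * 15

kappa = cohens_kappa(clinician_1, clinician_2)
print(f"raw agreement: 70%, kappa: {kappa:.2f}")
```

In this made-up example the two clinicians give the same answer for 70 of 100 patients, yet kappa comes out at only about 0.34, because more than half of that agreement would be expected by chance alone. That is the range the commentary proposes to call "acceptable."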

Why does this matter? Good reliability does not guarantee validity or utility -- human beings often agree very well on things that are dead wrong. But poor reliability is a certain sign of very deep trouble. If mental health clinicians cannot agree on a diagnosis, it is essentially worthless. The low reliability of DSM-5 presaged in the AJP commentary confirms fears that its criteria sets are so ambiguously written and difficult to interpret that they will be a serious obstacle to clinical practice and research. We will be returning to the wild west of idiosyncratic diagnostic practice that was the bane of psychiatry before DSM-III.

2. The commentary also states: “one contentious issue is whether it is important that the prevalence for diagnoses based on proposed criteria for DSM-5 match the prevalence for the corresponding DSM-IV diagnoses” .... “to require that the prevalence remain unchanged is to require that any existing difference between true and DSM-IV prevalence be reproduced in DSM-5. Any effort to improve the sensitivity of DSM-IV criteria will result in higher prevalence rates, and any effort to improve the specificity of DSM-IV criteria will result in lower prevalence rates. Thus, there are no specific expectations about the prevalence of disorders in DSM-5.”

This is also a fudge. For completely unexplained and puzzling reasons, the DSM-5 field trials failed to measure the impact of the proposals on rates of disorder. These quotes in the commentary are an attempt to justify this fatal flaw in design. The contention is that we have no way of knowing what the true rate of a given diagnosis should be-- so why bother to measure the likely impact of the DSM-5 proposals on rates? If rates double under DSM-5, the assumption will be that it is picking up previous false negatives, with no need to worry about the risks of creating an army of new false positives.

This is irresponsible for 2 reasons. First, we are already suffering from serious diagnostic inflation. Rates of psychiatric disorder are already sky high (25% in the general population in any year; 50% lifetime) and we have experienced 3 runaway false epidemics of childhood disorders in the past 15 years. Second, drug company marketing has been so abusive as to warrant enormous fines and so successful as to result in widespread misuse of medication for very questionable indications. Recent CDC data suggest that the severely ill remain very undertreated, but that the mildly ill and those not ill at all have become massively overtreated, especially by primary care physicians.

The DSM-5 proposals will uniformly increase rates, sometimes dramatically. Not to have measured by how much is unfathomable and irresponsible. The new diagnoses suggested for DSM-5 will (mis)label people at the very populous boundary with normality. Mixed anxiety depression and binge eating disorder will likely have astoundingly high rates between 5% and 10%-- that's tens of millions of people now considered "normal" suddenly converted into mentally ill by arbitrary DSM-5 fiat. Psychosis risk and disruptive mood disorder will be extremely common in the young; minor neurocognitive disorder among the elderly. Legions of the recently bereaved will be misdiagnosed as clinically depressed; rates of generalized anxiety and addiction will mushroom; and ADD (which has already almost tripled) will find even more room at the top. The field trial developers seem either unaware of or insensitive to the unacceptable risks involved in creating large numbers of false-positive pseudo-patients.

Indeed, quite contrary to the blithe assertions put forward in the commentary, we should have rigorous expectations about prevalence changes triggered by any DSM revision. Rates should not be wildly different for the same disorder UNLESS there is clear evidence of a serious false negative problem and firm protections against creating a massive false positive problem. And new disorders with high prevalences should not be included without substantial scientific evidence and convincing proof of accuracy, reliability, and safety. We have known since they were first posted that none of the DSM-5 proposals comes remotely close to meeting a minimal standard for accuracy and safety. And now, the AJP commentary seems to be softening us up for the bad news that their reliability is also lousy.
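The trade-off between chasing false negatives and creating false positives can be sketched with simple arithmetic. The relationship used below (diagnosed rate = sensitivity x true rate + (1 - specificity) x (1 - true rate)) is standard; every number plugged into it is a hypothetical assumption chosen for illustration, not a field trial result or an estimate for any actual disorder.

```python
def diagnosed_rate(true_prevalence, sensitivity, specificity):
    """Return (rate diagnosed, share of the diagnosed who are false positives)."""
    true_positives = sensitivity * true_prevalence
    false_positives = (1.0 - specificity) * (1.0 - true_prevalence)
    diagnosed = true_positives + false_positives
    return diagnosed, false_positives / diagnosed


# Hypothetical disorder that truly affects 5% of the population.
TRUE_PREVALENCE = 0.05

# Hypothetical "narrow" criteria: miss some real cases, mislabel few normals.
narrow_rate, narrow_fp_share = diagnosed_rate(TRUE_PREVALENCE, 0.70, 0.98)

# Hypothetical "broad" criteria: catch more real cases, mislabel more normals.
broad_rate, broad_fp_share = diagnosed_rate(TRUE_PREVALENCE, 0.90, 0.90)

print(f"narrow: {narrow_rate:.1%} diagnosed, {narrow_fp_share:.0%} of them false positives")
print(f"broad:  {broad_rate:.1%} diagnosed, {broad_fp_share:.0%} of them false positives")
```

Under these assumed numbers, loosening the criteria raises the diagnosed rate from about 5% to 14%, and most of the people carrying the diagnosis no longer have the disorder. Because the unaffected vastly outnumber the affected, even a modest loss of specificity swamps the gain in sensitivity, which is why prevalence changes need to be measured rather than assumed benign.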

The workers on DSM-5 ignore the often dire implications of drastically raising the prevalence of an existing disorder or adding an untested new disorder with high prevalence-- ie, misguided and potentially harmful treatment, unnecessary stigma, and rising health care costs that also cause a misallocation of very scarce resources. Consider just 2 examples. Do we really want even more antipsychotic medications prescribed for children, the elderly, and returning war veterans when these are already being used so loosely and inappropriately? Isn't the current legal and illegal overuse of stimulant medications already a big enough problem without introducing a drastically lowered set of criteria for diagnosing ADD? Sad to say, DSM-5 has failed to do an adequate risk/benefit analysis on any of its suggestions. Every one of its changes is designed to chase elusive false negatives; none protects the interests of mislabeled false positives.

Given our country's current binge of loose diagnostic and medication practice (particularly by the primary care physicians who do most of the prescribing), DSM-5 should not be in the business of casually raising rates and offering inviting new targets for aggressive drug marketing. Instead, DSM-5 should be working in the opposite direction-- taking steps to increase the precision and specificity of its diagnostic criteria. And the texts describing each disorder should contain a new section warning about the risks of overdiagnosis and ways of avoiding it. It is impossible to say what is the “right” prevalence of any disorder, but it is careless and reckless to so dramatically increase the prevalences of mental disorders without evidence of need or proof of safety.

The DSM-5 field trials have cost APA at least $3 million (perhaps a whole lot more). They started off on the wrong foot by asking the wrong question-- focusing only on reliability and completely ignoring prevalence. The deadlines for starting the trials and for delivering results have been repeatedly postponed because of poor planning, an excessively cumbersome design, and disorganized implementation. The results will be arriving at the very last minute, when decisions should already have been made. And now we get a broad hint that the reliabilities, when they are finally reported, will be disastrously low.

What should be done now as DSM-5 enters its depressing endgame? There really is no rational choice except to drop the many unsupportable DSM-5 proposals and to dramatically improve the imprecise writing that plagues most of the DSM-5 criteria sets.