COUNTERPOINT: Evaluating EMDR in Treating PTSD - continued
by Shawn P. Cahill, Ph.D.
July 2000, Vol. XVII, Issue 7
Before discussing the results of this study, it may be useful to review the criteria by which multiple-baseline studies across participants are evaluated. More is required to make causal inferences than just showing a change on the dependent variable following the phase shift that is replicated in multiple individuals (Barlow and Hersen, 1984). The multiple-baseline design across participants begins with obtaining concurrent baselines on multiple individuals. Once stable baselines are obtained for all participants, the experimental treatment is introduced to one participant while the baseline conditions are maintained for the remaining participant(s). When the target individual's response during the experimental phase has stabilized, the experimental manipulation is then introduced to the next individual, provided that the concurrent baseline responses of the other participant(s) have remained stable. This process is repeated until all participants have received the intervention. The intervention's effectiveness is demonstrated when the dependent variable changes with the phase shift for the treated participant, but not for the untreated participant(s), and this pattern is subsequently replicated across participants.
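As a toy illustration, the core criterion can be sketched in code. The SUD ratings, shift points and the drop threshold below are entirely hypothetical; the point is only the logic: each participant's scores should fall after his or her own (staggered) phase shift, while still-untreated participants remain stable.

```python
def mean(xs):
    return sum(xs) / len(xs)

def drops_at_shift(ratings, shift, threshold=2.0):
    """True if the mean rating after the phase shift falls well below
    the mean rating before it (hypothetical threshold)."""
    return mean(ratings[:shift]) - mean(ratings[shift:]) >= threshold

# Hypothetical SUD series for two participants with staggered shifts:
p1 = [8, 8, 7, 8, 3, 2, 2, 1, 1, 1]   # treatment introduced after session 4
p2 = [7, 8, 8, 7, 8, 7, 7, 2, 1, 1]   # treatment introduced after session 7

# Causal inference requires all three conditions:
supports_inference = (
    drops_at_shift(p1, 4)                 # p1 improves at p1's shift
    and drops_at_shift(p2, 7)             # p2 improves at p2's shift
    and not drops_at_shift(p2[:7], 4)     # p2 stayed stable while p1 was treated
)
```

If the untreated participant's scores had also dropped while only the first participant was being treated, the third condition would fail and the design would not support attributing the change to the intervention.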
A logical prerequisite to meeting these conditions is that, for each person, multiple data points within each phase must be available for visual inspection. Unfortunately, this requirement is not met for the majority of measures in the Montgomery and Ayllon (1994) paper. In fact, it is only met for SUD levels. Casual inspection of the relevant graphs suggests that substantial decreases in SUD ratings occurred during the B (no-eye-movement) phase for only one of the six participants (subject 4, and then only in the first B session). By contrast, substantial decreases in SUD scores occurred during the BC (full EMDR) phase for five of the six participants (all but subject 5). These observations may appear to support the hypothesis that eye movements enhanced fear reduction. A more careful inspection of some of the participant pairs, however, cautions against concluding that eye movements were responsible for the decline in SUD ratings during the BC phase.
In the first pair of participants, subject 1 was shifted from the B to BC phase between session 7 and session 8, with little difference between SUD levels on these two days. Although subject 1's SUD ratings subsequently declined over the course of the BC phase, the decline also continued throughout the follow-up phase. In other words, this subject's SUD levels never became stable in the BC phase. Nevertheless, subject 2 was shifted from the B to BC phase between session 8 and session 9. This shift of subject 2 only one session after shifting subject 1 precludes comparing the decline observed in the BC phase for subject 1 with an ongoing B phase for subject 2. Thus, we cannot confidently attribute the decline in SUD ratings for both individuals from session 9 to session 13 to the eye movements, as there is no concurrent no-eye-movement condition against which to compare the eye-movement condition.
With regard to the third pair of subjects, the interpretive problem here is that SUD ratings for one of the two individuals (subject 5) did not show much decline during the BC phase, while the other one (subject 6) did. This lack of consistency across the two participants raises questions as to whether the decline in subject 6's SUDs can actually be attributed to the eye movements.
Thus, there is no solid evidence in the Montgomery and Ayllon (1994) study that meets the criteria for drawing causal inferences from multiple-baseline designs to support conclusions about the importance of eye movements in EMDR. Further, the unavailability of correspondingly fine-grained data for the BDI scores, weekly symptom reports and psychophysiological measures prevents conclusions about whether the observed changes in symptom measures over time can be attributed to any specific component of the intervention.
A recent study by Cusack and Spates (1999) is the only one to investigate the role of "installation" trials (the cognitive restructuring component of EMDR) for PTSD. Participants were randomly assigned to receive either standard EMDR or a condition in which the installation trials were replaced by additional desensitization trials. Both groups improved during the study and retained their improvements at two-month follow-up. There were no differences between groups on any measure. Thus, as with the eye movements, there is no evidence that the other major non-exposure element of EMDR, its unique form of cognitive restructuring, improves treatment outcome.
I have attempted to illustrate that the primary literature on EMDR does not justify claims about its relative efficacy, efficiency and acceptability in comparison to CBT. Nor is there any strong evidence that EMDR achieves its therapeutic effects through different or additional mechanisms than exposure therapy. If the primary literature does not support such claims, then where do they come from?
Many are based on authors making informal comparisons across studies (Lipke, 1999; Montgomery and Ayllon, 1994; Pitman et al., 1996). Given the often substantial differences across various studies of EMDR and CBT (e.g., different samples, different measures, single versus multiple therapists, differing duration and number of sessions, different control groups, and so on), such comparisons are fraught with difficulties (Cahill and Frueh, 1997) and do not provide an adequate basis for drawing conclusions about comparisons between treatments.
A second basis for such assertions is the use of meta-analysis, a procedure intended to provide a quantitative method for reviewing and synthesizing the results of research studies. One recent comprehensive meta-analysis of treatments for PTSD concluded:
Behaviour therapy and EMDR were the most effective psychological therapies, and both were as effective as SSRIs [selective serotonin reuptake inhibitors]. Effect sizes were large across all PTSD symptom domains for these treatments in relation to controls, and treatments were generally statistically comparable in efficacy (Van Etten and Taylor, 1998).
The authors further suggested that EMDR is more efficient than other treatments, and that EMDR achieves its therapeutic effect through some mechanism other than exposure.
The studies included in their meta-analysis, however, did not include a single study in which EMDR was directly compared with behavior therapy, which, as they defined it, combined studies of PE, SIT and IHT. The Vaughan et al. study (1994) was not included because only 80% of participants met criteria for PTSD (S. Taylor, personal communication, January 1999), and the Devilly and Spence (1999) study had not yet been published. Nor is there a single study in their meta-analysis in which any form of psychotherapy was directly compared with medication.
Conclusions drawn from meta-analysis are heavily dependent on the methods used to identify and select studies, compute the effect sizes, and group the various studies. In order to increase the number of studies included in their meta-analysis and create comparisons across studies that do not exist in the primary literature, Van Etten and Taylor (1998) did not use the standard method for computing effect sizes.
The standard method is to compute a between-group effect size by subtracting the posttreatment mean of the comparison group-of-interest from the corresponding mean of the target treatment group, and then dividing this group difference by the pooled standard deviation (Cohen, 1988). This is done for all comparisons-of-interest in each study to be included in the meta-analysis. The resulting effect sizes from the different studies are then combined according to the types of comparisons of interest.
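The standard computation can be sketched as follows. The group means, standard deviations and sample sizes below are hypothetical, chosen only to illustrate the arithmetic on a symptom-severity measure where lower scores indicate improvement.

```python
import math

def pooled_sd(sd1, n1, sd2, n2):
    """Pooled standard deviation of two independent groups."""
    return math.sqrt(((n1 - 1) * sd1**2 + (n2 - 1) * sd2**2) / (n1 + n2 - 2))

def between_group_d(m_target, m_comparison, sd_target, n_target,
                    sd_comparison, n_comparison):
    """Between-group effect size (Cohen, 1988): the comparison group's
    posttreatment mean subtracted from the target treatment group's mean,
    divided by the pooled standard deviation."""
    return (m_target - m_comparison) / pooled_sd(
        sd_target, n_target, sd_comparison, n_comparison)

# Hypothetical posttreatment PTSD-severity scores (lower = less severe):
d = between_group_d(m_target=18.0, m_comparison=30.0,
                    sd_target=9.0, n_target=20,
                    sd_comparison=11.0, n_comparison=20)
# d is negative here because lower severity favors the target group;
# meta-analyses typically orient signs so that positive values favor treatment.
```

Note that this computation requires both groups to exist within the same study, which is precisely what ties between-group effect sizes to direct comparisons in the primary literature.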
In contrast, Van Etten and Taylor (1998) computed within-group effect sizes. For each group-of-interest in a study, the posttreatment mean was subtracted from the pretreatment mean and divided by the pooled within-group standard deviation. Their rationale was that this allowed inclusion of uncontrolled studies in their meta-analysis, as a control group is not necessary for computing within-group effect sizes, thereby "increasing the number of trials and statistical power to detect differences between treatments." They subsequently categorized the within-group effect sizes in terms of the type of intervention and compared average effect sizes across categories. It is important to understand that there was no overlap in studies between the 13 effect sizes for behavior therapy and the 11 effect sizes for EMDR in the Van Etten and Taylor (1998) meta-analysis.
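The within-group computation differs in using each group's own pretreatment scores as the reference point, so no control group is needed. The numbers below are hypothetical, and pooling the pre- and post-treatment standard deviations by averaging their variances is an assumption about the exact pooling formula used.

```python
import math

def within_group_d(mean_pre, mean_post, sd_pre, sd_post):
    """Within-group effect size: posttreatment mean subtracted from the
    pretreatment mean, divided by a pooled pre/post standard deviation
    (pooling convention assumed: root mean of the two variances)."""
    pooled = math.sqrt((sd_pre**2 + sd_post**2) / 2)
    return (mean_pre - mean_post) / pooled

# Hypothetical pre/post PTSD-severity means for a single treated group:
d = within_group_d(mean_pre=32.0, mean_post=20.0, sd_pre=10.0, sd_post=9.0)
```

Because only one group appears in the formula, effect sizes computed this way can be averaged across studies that never compared the treatments directly, which is exactly the move the following paragraphs question.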
There is a serious concern with using within-group effect sizes to create comparisons across groups of studies that do not exist in the primary literature. It ignores that study populations are necessarily nested within their studies and that, in the absence of direct comparisons between therapy types, the different studies are themselves nested within their type of treatment. This confounds study samples with type of treatment, precluding meaningful conclusions about the comparative efficacy of the different treatments.
Consider the comparison of average effect sizes on total self-reported PTSD severity between behavior therapy and EMDR (1.27 and 1.24, respectively). These mean values tell us that, within each set of studies, the average within-group effect sizes for the different treatments were quite similar. They do not say, however, what the average effect size of EMDR would have been in the populations represented in the behavior therapy studies, nor do they specify what the average effect size of behavior therapy would have been in the populations represented in the EMDR studies. Furthermore, there is no basis for assuming that, just because all of the studies in the meta-analysis utilized full PTSD samples, there would be no differences across the various study samples in such variables as severity, chronicity or motivation for (and responsiveness to) treatment.
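The confound can be made concrete with a toy example. The effect sizes below are entirely hypothetical (not drawn from the actual studies); they show that two treatments studied in non-overlapping samples can produce nearly identical averages without telling us anything about how either treatment would fare in the other's populations.

```python
# Hypothetical within-group effect sizes from non-overlapping study sets:
# Treatment A happened to be studied in more responsive samples,
# Treatment B in less responsive ones.
effects_a = [1.5, 1.3, 1.0]   # treatment A, its own study populations
effects_b = [1.4, 1.2, 1.2]   # treatment B, different study populations

mean_a = sum(effects_a) / len(effects_a)
mean_b = sum(effects_b) / len(effects_b)

# The averages come out essentially equal, yet nothing in these numbers
# says how A would perform in B's populations or vice versa: sample
# characteristics are confounded with type of treatment.
```

Equality of such averages is therefore compatible both with the treatments being equivalent and with one being superior but tested on harder samples.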
The danger of this strategy in Van Etten and Taylor's (1998) meta-analysis may be further illustrated by considering an example in which it yields a conclusion different from the one drawn from the relevant primary literature. The meta-analysis included only one effect size for an EMDR group without eye movements.
They noted, "When all eye movement conditions in the meta-analysis were compared with this one fixed-eye condition, EMDR was more effective. ... However, when the fixed-eye control was compared to the EMDR condition within the same study [i.e., Devilly and Spence, 1996], the fixed-eye condition was comparable to the EMDR condition." (The Devilly and Spence study cited by Van Etten and Taylor was an unpublished manuscript at that time. It was later published as Devilly et al., 1998.)
Furthermore, I have already mentioned numerous dismantling studies not included in Van Etten and Taylor's meta-analysis that concluded that eye movements did not contribute to treatment outcome (Cahill et al., 1999).
In the absence of any convincing evidence from dismantling studies that any of the unique features of EMDR contribute to treatment outcome, the remaining basis for claims that EMDR operates through mechanisms other than (or in addition to) exposure is a logical argument (Shapiro, 1999, 1996; Van Etten and Taylor, 1998). Proponents argue that the amount of exposure in the EMDR protocol is less than in exposure-therapy protocols and is implemented in ways that are less than optimal for exposure therapy (e.g., brief interrupted exposures in EMDR versus long, uninterrupted exposures in PE). Since EMDR achieves the same or better outcome as PE in the same or fewer sessions, exposure alone cannot be the operative mechanism. This argument, however, rests on an unsubstantiated assumption about the relative efficacy and efficiency of EMDR and PE, and its conclusion is correspondingly uncertain.
Questions remain as to the crucial components of effective treatments and their relative merits. Narrative reviews and meta-analyses are useful means of summarizing accumulated knowledge and generating hypotheses. Logical analyses are also helpful in generating new hypotheses and for guiding new studies. None of these methods, however, replace sound empirical research as the primary basis for the growth of scientific knowledge.
Dr. Cahill is an instructor at the Center for the Treatment and Study of Anxiety in the department of psychiatry at the University of Pennsylvania School of Medicine. He gratefully acknowledges Steven Taylor, Ph.D., for providing information regarding studies excluded from the Van Etten and Taylor (1998) meta-analysis.