Assessing relative efficacy of antidepressants
Clinical bottom line
Analysis of large numbers of trials in depression shows that some drugs (sertraline and escitalopram, for example) do better than others, both in reducing depression scores in more people and in being more acceptable, so that more people are able to take the drugs.
These properties probably go together. While this is far from the whole story, there is important information to begin constructing useful care pathways designed to treat more people effectively at lowest cost and least hassle for all.
Reference
A Cipriani et al. Comparative efficacy and acceptability of 12 new-generation antidepressants: a multiple-treatments meta-analysis. Lancet 2009 373: 748-758.
Be honest. There are times when the eyes glaze over and the brain goes into a dreamlike trance when statistics and probabilities are thrown around, especially when devoid of any apparent link to reality. Statistics does that to most people, apart, that is, from some statisticians and a small number of pointy-headed academics.
The rest of us feel a need to be grounded, to have some grasp, however tenuous, of what the numbers mean, and how they affect us or other people. It's why we get cross with media headlines about a doubled risk of some incredibly rare event.
Even so, there are times when something comes along that makes us stop and think, and grapple with the arcane world of statistics and meta-analysis. This meta-analysis on antidepressants is one such, perhaps because it might be something of a watershed, not because of the statistics, but because of the thinking behind it.
What is efficacy?
Let's start with something comparatively easy. What does efficacy mean? Now there are lots of different definitions, but let's keep this simple. To most of us simple folk, there are three questions we want answered, and we don't really care what they are called. Bandolier thinks these three questions worth asking about the "efficacy" of any intervention are:
- Does it work? In most, but not all, cases, this implies doing better with the intervention than with an inactive intervention like placebo. Statistics can be useful here, things like relative risk, and p values.
- How well does it work? After all, it's not much good if something works a very, very, little bit. Ideally we want the intervention to work really well. Here we might want an NNT.
- How well does it work compared with other interventions we have for this condition? Here we might compare NNTs for efficacy outcomes (in league tables on some occasions), but realise quite quickly that there are other issues to consider, like adverse events, and whether patients will accept it, and the cost, and so on. At various times ratios of NNT to NNH have been suggested, but in truth there hasn't seemed to be any approach with general applicability. Here we begin to move from efficacy (does it work) to effectiveness (how well does it work in practice).
That is where the multiple-treatment meta-analysis comes in, and can be helpful. Bandolier thinks there is a way of making the approach easier, and do-able on the back of an envelope, but first, a brief description of what was going on.
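For readers who like to see the arithmetic, the relative risk and NNT mentioned in the three questions above can be sketched in a few lines. This is only an illustration: the function names and the responder counts (60 of 100 on treatment, 40 of 100 on placebo) are hypothetical and do not come from the meta-analysis.

```python
def relative_risk(events_treatment, n_treatment, events_control, n_control):
    """Ratio of the responder rate on treatment to the rate on control."""
    return (events_treatment / n_treatment) / (events_control / n_control)


def nnt(events_treatment, n_treatment, events_control, n_control):
    """Number needed to treat: 1 / (absolute difference in responder rates)."""
    absolute_benefit = events_treatment / n_treatment - events_control / n_control
    if absolute_benefit <= 0:
        # With no benefit over control an NNT has no useful meaning.
        raise ValueError("no absolute benefit over control, NNT is undefined")
    return 1 / absolute_benefit


# Hypothetical counts, chosen only to make the sums easy to follow.
example_rr = relative_risk(60, 100, 40, 100)
example_nnt = nnt(60, 100, 40, 100)
print(f"relative risk {example_rr:.1f}, NNT {example_nnt:.1f}")
```

With these made-up numbers, five people need to be treated for one more to respond than would have done with placebo.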
Background
Drug treatment of depression involves frequent switching to find a drug that works well for that particular patient, because of the usual problems of lack of efficacy or adverse events. Bandolier covered a terrific RCT that looked at just this issue (Bandolier 95-4), in which only 44% of patients started on a drug were on it at the end. The trial showed that having three SSRIs was much better than one. An accompanying editorial made the point that while the three SSRIs were equal on average in clinical trials, they were not equal for every individual patient.
Inevitably, we need more than one antidepressant. The question isn't whether they work or how well they work, but which of them works best, and how might we choose to use them in a sequential treatment cascade to get the best results for patients, both individually, and for the whole population with depression.
The multiple treatments meta-analysis set out to ask whether any of 12 new-generation antidepressants were noticeably better than the others.
Methods
The data set was 117 randomised trials comparing one antidepressant with another; placebo-only controls were not used. Trials lasted 6-12 weeks. Doses of drugs were set as low, medium, or high, depending on pre-set criteria.
Two outcomes were used. The first, an efficacy outcome, was at least 50% reduction in a recognised depression score, or clinical global impression of much or very much improved, at eight weeks. The second, an acceptability outcome, was all-cause withdrawals at eight weeks.
Some sophisticated statistics were then done, first on pairwise analyses and then on all the data, comparing direct and indirect comparisons. Sensitivity analyses were done on doses within the therapeutic range, and on methodological issues.
Results
Results were expressed as the probability of any of the 12 drugs being among the top four for both efficacy and acceptability. Figure 1 shows the cumulative probability for both criteria as a percentage - with higher percentages obviously better.
Figure 1: Probability of being among top four drugs for efficacy (at least 50% reduction in depression score) and acceptability (all-cause withdrawal) at mean of eight weeks of treatment
Some drugs (sertraline, escitalopram) do well on both counts, while others (citalopram, mirtazapine, venlafaxine) do well in one but not the other. Some (paroxetine, reboxetine) have a low probability of being in the top four on either criterion.
Issues of dose and method made no difference to the overall results in sensitivity analyses, and direct and indirect analyses gave different results no more often than would be expected by chance.
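The idea of a "probability of being among the top four" can be pictured with a small simulation. This is a sketch of the final counting step only, not the model used in the paper: the six unnamed drugs, their assumed mean response rates, and the spread of 0.03 are all hypothetical, and the draws here come from a simple normal distribution rather than from any meta-analysis.

```python
import random


def top_k_probability(draws, k=4):
    """Share of draws in which each drug ranks among the k best.

    draws: list of dicts mapping drug name -> sampled effect (higher is better).
    """
    counts = {name: 0 for name in draws[0]}
    for draw in draws:
        ranked = sorted(draw, key=draw.get, reverse=True)
        for name in ranked[:k]:
            counts[name] += 1
    return {name: counts[name] / len(draws) for name in counts}


# Hypothetical response rates for six unnamed drugs (not from the paper).
rng = random.Random(0)
assumed_mean = {"A": 0.60, "B": 0.58, "C": 0.55, "D": 0.54, "E": 0.52, "F": 0.48}
draws = [
    {name: rng.gauss(mean, 0.03) for name, mean in assumed_mean.items()}
    for _ in range(5000)
]

probabilities = top_k_probability(draws, k=4)
for name, p in probabilities.items():
    print(f"{name}: {100 * p:.0f}% chance of being in the top four")
```

Because exactly four drugs make the top four in every draw, the probabilities across all drugs add up to four, which is a handy check on the counting.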
Comment
Terrific stuff. Knowing that several antidepressants perform generally better than others is useful, and we may conclude that those at the top of the ladder might come earlier in any care pathway or treatment strategy, but that doesn't mean that the others are without effect.
Bandolier has tried a slightly different approach using the data from the paper. Figure 2 shows the numbers of patients with the efficacy and acceptability outcomes, the total number, and the percentage with each outcome. These were then simply divided into those with the best (shaded blue) and worst (shaded pink) performance for each outcome, with the others shaded yellow. There is broad agreement with the statistical approach. Drugs doing best with efficacy generally also did well for acceptability, while those doing worse for efficacy generally did worse on acceptability.
Figure 2: Simplified assessment of efficacy and acceptability using simple percentages for efficacy (at least 50% reduction in depression score) and acceptability (all-cause withdrawal) at mean of eight weeks of treatment, together with some information on cost
In addition, simple cost information is provided, based on approximate cost for a month of treatment in the UK, using BNF costs for medium doses. Generally, those drugs doing better on efficacy and acceptability had lower costs.
The implication is again that in creating care pathways it would be better to use the drugs at the top of the table first. Note that two of the 12 drugs (bupropion, milnacipran) do not have a UK licence for depression at the time of writing.
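The back-of-an-envelope approach behind Figure 2 amounts to working out simple percentages and sorting drugs into bands. A minimal sketch follows, with loud caveats: the counts are hypothetical placeholders for six unnamed drugs, and splitting into best third, worst third, and the rest is an assumption, since the cut-offs used for the shading are not stated here.

```python
def tier_by_rank(percentages, higher_is_better=True):
    """Label the best third 'best', the worst third 'worst', the rest 'middle'."""
    ranked = sorted(percentages, key=percentages.get, reverse=higher_is_better)
    third = len(ranked) // 3
    tiers = {}
    for position, name in enumerate(ranked):
        if position < third:
            tiers[name] = "best"
        elif position >= len(ranked) - third:
            tiers[name] = "worst"
        else:
            tiers[name] = "middle"
    return tiers


# Hypothetical counts: (responders, all-cause withdrawals, total patients).
counts = {
    "drug A": (620, 210, 1000),
    "drug B": (580, 240, 1000),
    "drug C": (560, 250, 1000),
    "drug D": (540, 270, 1000),
    "drug E": (520, 290, 1000),
    "drug F": (480, 320, 1000),
}
response_pct = {drug: 100 * r / n for drug, (r, w, n) in counts.items()}
withdrawal_pct = {drug: 100 * w / n for drug, (r, w, n) in counts.items()}

# More responders is better; fewer withdrawals is better.
efficacy_tier = tier_by_rank(response_pct, higher_is_better=True)
acceptability_tier = tier_by_rank(withdrawal_pct, higher_is_better=False)

for drug in counts:
    print(drug, efficacy_tier[drug], acceptability_tier[drug])
```

Adding a monthly cost column to the same table would be a one-line extension, which is rather the point of keeping the method this simple.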
Objections
Not everyone likes the meta-analysis, and MeReC [1] took issue with it on a number of points. It is useful to question them.
- Most studies were done by pharmaceutical companies. That of course is true, and large independent trials are perhaps to be desired. But the fact is that, in the world in which we live, most trials have commercial interests. Rigorous criteria for the design, conduct, monitoring, and reporting of trials have been instituted to prevent commercial and other biases affecting results. Ask the question from another angle: where is the convincing evidence that these trials are wrong? They have been accepted by regulatory agencies like FDA and EMEA as being adequate, on the basis of much greater detail than is presented in published papers.
- Discrepancies existed between indirect and direct comparisons. This is directly answered in the paper, where 6 out of 133 comparisons were different, about the number expected by chance alone. Put the other way, 127 out of 133 direct and indirect comparisons gave the same result.
- Studies were poor quality. The description of treatment allocation was unclear in most trials (105/117 trials), as it is in 90% of trials. There are two Cochrane reviews just published on escitalopram and sertraline [2,3]. These show that all trials were described as both randomised and double blind, the areas most likely associated with bias. As the meta-analysis itself discusses, this is usually an issue of reporting in journals with tight word limits rather than an issue of conduct. The problem with using treatment allocation concealment as the main or only criterion is that "unclear" is usually the best you can get.
- Mean sample size was small. The mean sample size was 110 participants per group (range 9-357). Bandolier is also concerned about small studies, and prefers omitting trials of small size. But one of the reasons we do meta-analyses is to overcome the problem of size. A quick look at Figure 2 shows that for most of the drugs there were impressively large numbers, and in total about 26,000 patients were involved. None of the drugs favoured had fewer than 1,000 patients treated.
- The mean duration was only 6 weeks, and trials were all 6-12 weeks in duration. It is useful to question trial duration when the use of an intervention is longer. Firstly, any shorter trials have been omitted. In the absence of substantial evidence from longer trials, this is the best we have. In what amounted to a real world primary care experiment, only 44% of patients were still taking the treatment to which they had been randomised by nine months. Others either switched to another antidepressant or stopped treatment because of adverse effects or lack of efficacy. There is an argument that 6-12 weeks is the window in which issues of lack of efficacy, adverse events, and switching take place, making it the ideal trial duration on which to base decisions.
- The clinical significance of the dichotomous measure of efficacy is unclear. This is a very old-fashioned argument. Using mean data is hopeless for all sorts of reasons, some of which are rehearsed below. Similar dichotomous outcomes are now becoming widely used in other areas, and are proving very useful. This is no more than a quibble, though that does not mean that better dichotomous outcomes won't be developed.
- No adjustments were made for multiple statistical testing. Another useful point. On the other hand, the simplistic approach outlined in Figure 2 produces much the same result.
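The point about discrepancies between direct and indirect comparisons is easy to check on the back of an envelope. The sketch below assumes a conventional 5% significance level and treats the 133 comparisons as independent; both are simplifying assumptions made here for illustration, not details taken from the paper.

```python
from math import comb


def binomial_pmf(k, n, p):
    """Probability of exactly k 'significant' results in n independent tests."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


n_comparisons = 133
alpha = 0.05  # assumed significance level, not stated in the text above

# Number of discrepancies expected from chance alone.
expected_by_chance = n_comparisons * alpha

# Probability that chance alone throws up six or more discrepancies.
p_six_or_more = 1 - sum(
    binomial_pmf(k, n_comparisons, alpha) for k in range(6)
)

print(f"expected by chance: {expected_by_chance:.2f}")
print(f"probability of 6 or more by chance: {p_six_or_more:.2f}")
```

Under those assumptions chance alone would be expected to produce between six and seven discrepancies, so six observed is nothing to get excited about.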
Further comment
This approach has application way beyond just depression. This has every prospect of being a useful methodological simplification and advance that could be used much more widely. The use of a dichotomous measure of benefit, set at a high level, is in accord with developing thinking in a number of fields. Combining this with a measure of how many people can take the drug hits directly at effectiveness - because there's no benefit at all when people can't take it.
Changes in average scores usually reflect the experience of very few patients. Moreover, it is common to use last observation carried forward, meaning that people who discontinue can still contribute to efficacy measures, even when there can be no efficacy because they have stopped taking the drug.
There is one other comment. The authors suggest that their results make sertraline the base case - raising the base well above placebo. They raise the question whether sertraline should be the new placebo, or at least the common comparator for all future depression trials.
And finally, the main point in all of this is that of getting the bestest for the mostest with the leastest. What the meta-analysis provides is the raw material for the next step, namely creating and testing a care pathway or pathways for depression that provides good results for the largest number of sufferers in the shortest time and at the lowest cost.
Other references:
1. MeReC Monthly No 13. April 2009.
2. Cipriani et al. Escitalopram versus other antidepressive agents for depression. Cochrane Database of Systematic Reviews 2009, Issue 2. Art. No.: CD006532.
3. Cipriani et al. Sertraline versus other antidepressive agents for depression. Cochrane Database of Systematic Reviews 2009, Issue 2. Art. No.: CD006117.