There are many traps and pitfalls to negotiate when assessing evidence, and it is all too easy to be misled by an apparently beautiful study that later turns out to be wrong, or by a meta-analysis with impeccable credentials that seems to be trying to pull the wool over our eyes. Although these are themes often found in the pages of Bandolier, a little reinforcement rarely comes amiss.
Law of initial results
So often early promising results are followed by others that are less impressive. It is almost as if there were a law stating that first results are always spectacular and subsequent ones mediocre: the law of initial results. It now seems that there may be some truth in this.
Three major general medical journals (New England Journal of Medicine, JAMA, and Lancet) were searched for studies with more than 1000 citations published between 1990 and 2003. This is an extraordinarily high number of citations when you consider that most papers are cited once, if at all, and that a paper cited more than a few hundred times is as rare as hens' teeth.
Of the 115 articles identified, 49 were eligible for the study because they were reports of original clinical research (like tamoxifen for breast cancer prevention, or stent versus balloon angioplasty). Sample sizes ranged from as low as nine to as high as 87,000. There were two case series, four cohort studies, and 43 randomised trials. The randomised trials varied greatly in size, though, from 146 to 29,133 subjects (median 1817 subjects; Figure 1). Fourteen of the 43 randomised trials (33%) had fewer than 1000 patients, and 25 (58%) had fewer than 2500 patients.
Figure 1: Size of highly-cited RCTs
Of the 49 studies, seven were contradicted by later research: one case series with nine patients, three cohort studies with 40,000 to 80,000 patients, and three randomised trials with 200, 875, and 2002 patients respectively. So only three of the 43 randomised trials (7%) were contradicted, compared with one of the two case series and three of the four cohort studies.
A further seven studies found effects stronger than subsequent research did. One was a cohort study with 800 patients; the other six were randomised trials, four with fewer than 1000 patients and two with about 1500.
Most of the observational studies had been contradicted, or subsequent research had shown substantially smaller effects, whereas most of the randomised studies had results that were not challenged. Of the nine randomised trials that were challenged, six had fewer than 1000 patients, and none had more than 2002. Of the 23 randomised trials with 2002 patients or fewer, nine were contradicted or challenged; none of the 20 randomised trials with more than 2002 patients were.
Most published research false?
In the past it has been suggested that only 1% of articles in scientific journals are scientifically sound. Bandolier has often examined articles showing how we consumers of the scientific literature can be misled, and how often we are. Another paper from Greece is replete with Greek mathematical symbols and philosophy. It makes a number of important points:
- The smaller the studies conducted in a scientific field, the less likely the research findings are to be true.
- The smaller the effect sizes in a scientific field, the less likely the research findings are to be true.
- The greater the number and the lesser the selection of tested relationships in a scientific field, the less likely the research findings are to be true.
- The greater the flexibility in designs, definitions, outcomes, and analytical modes in a scientific field, the less likely the research findings are to be true.
- The greater the financial and other interests and prejudices in a scientific field, the less likely the research findings are to be true.
- The hotter a scientific field (the more scientific teams involved), the less likely the research findings are to be true.
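The first of these corollaries is easy to demonstrate numerically. As a minimal sketch (not taken from either paper), the Monte Carlo simulation below treats each tested relationship as true with a probability set by the pre-study odds, lets a study detect a true effect with probability equal to its power, and declares false positives at the usual 5% level. Power stands in for study size here, and all the numbers are illustrative assumptions:

```python
import random

def simulated_ppv(pre_study_odds, power, alpha=0.05,
                  n_hypotheses=100_000, seed=1):
    """Fraction of 'statistically significant' findings that are true,
    estimated by simulating many tested relationships."""
    rng = random.Random(seed)
    p_true = pre_study_odds / (1 + pre_study_odds)
    true_pos = false_pos = 0
    for _ in range(n_hypotheses):
        if rng.random() < p_true:        # the relationship really exists
            if rng.random() < power:     # a big enough study detects it
                true_pos += 1
        elif rng.random() < alpha:       # null, but significant by chance
            false_pos += 1
    return true_pos / (true_pos + false_pos)

# Long-shot hypotheses (1:10 pre-study odds) tested with small
# (power 0.2) versus large (power 0.8) studies:
small = simulated_ppv(0.1, power=0.2)   # roughly 0.29
large = simulated_ppv(0.1, power=0.8)   # roughly 0.62
```

With the same long-shot odds, fewer than a third of the underpowered field's positive findings are true, against nearly two thirds for the well-powered field: the first corollary in action.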
Ioannidis then performs a pile of calculations and simulations, and demonstrates the likelihood of our getting at the truth from different typical study types (Table 1). This ranges from odds of 2:1 on (67% likely to be true) for a systematic review of good quality randomised trials, through 1:3 against (25% likely to be true) for a systematic review of small inconclusive randomised trials, down to even lower levels for other architectures.
Table 1: Likelihood of truth of research findings from various typical study architectures
|Study architecture|Ratio of true to not true|Positive predictive value|
|---|---|---|
|Confirmatory meta-analysis of good quality RCTs|2:1 on|67%|
|Adequately powered RCT with little bias and 1:1 pre-study odds|||
|Meta-analysis of small, inconclusive studies|1:3 against|25%|
|Underpowered, but poorly performed phase I/II RCT|||
|Underpowered, but well performed phase I/II RCT|||
|Adequately powered exploratory epidemiological study|||
|Underpowered exploratory epidemiological study|||
|Discovery-orientated exploratory research with massive testing|||
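The odds quoted above convert to probabilities by simple arithmetic, and the probabilities themselves come from Ioannidis's positive predictive value calculation. As a sketch, the snippet below uses the no-bias form of his formula, PPV = (1 − β)R / (R − βR + α), where R is the pre-study odds, 1 − β the power, and α the significance level; adding bias, as the paper does, pushes all the values lower:

```python
def odds_to_probability(odds):
    """Odds of truth to probability: 2:1 on is 2.0, 1:3 against is 1/3."""
    return odds / (1 + odds)

def ppv(R, power, alpha=0.05):
    """Positive predictive value of a claimed finding, ignoring bias:
    PPV = (1 - beta) * R / (R - beta * R + alpha)."""
    beta = 1 - power
    return (1 - beta) * R / (R - beta * R + alpha)

print(odds_to_probability(2.0))    # 2:1 on -> ~0.67
print(odds_to_probability(1 / 3))  # 1:3 against -> 0.25
print(ppv(R=1.0, power=0.8))       # powered RCT, 1:1 odds, no bias -> ~0.94
```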
There is lots more in these fascinating papers, but from here on it all gets more detailed and more complex without necessarily becoming much easier to understand. There is nothing here that contradicts what we already know: if we accept evidence of poor quality, without validity, or with few events or few patients, we are likely, often highly likely, to be misled.
If we concentrate on evidence of high quality, which is valid, and with large numbers, that will hardly ever happen. As Ioannidis also comments, if instead of chasing some ephemeral statistical significance we concentrate our efforts where there is good prior evidence, our chances of getting the true result are better: concentrating on all the evidence. That may be why clinical trials on pharmaceuticals are so often statistically significant, and in the direction of supporting a drug. Yet even in that very special circumstance, where so much treasure is expended, years of work with positive results can come to naught when the big trials are done and do not produce the expected answer.
- JPA Ioannidis. Contradicted and initially stronger effects in highly cited clinical research. JAMA 2005; 294: 218-228.
- R Smith, quoting Professor D Eddy. BMJ 1991; 303: 798-799.
- JPA Ioannidis. Why most published research findings are false. PLoS Medicine 2005; 2: e124. (www.plosmedicine.org)