The Reproducibility Project, an effort to reproduce the findings of 100 articles in three top psychology journals, has recently been in the news. The study, led by Dr. Brian Nosek of the University of Virginia and the Center for Open Science, found that of 100 experimental and correlational studies published in those journals, fewer than half of the findings held up when retested. A shortfall that large points to a significant problem with scientific publishing, and the findings have consequently been widely reported by news outlets.
The Reproducibility Project dealt specifically with psychology publishing, but the more important question is “how far do these findings apply to other scientific fields?” Reproducibility tests cost time and money: this particular study took three years and over 200 volunteers to complete. The evidence from other sciences is not as strong, but a theoretical research paper by John P. A. Ioannidis, titled ‘Why Most Published Research Findings Are False’, argues that a research finding is less likely to be true when the studies conducted in a field are smaller; when effect sizes are smaller; when there is a greater number and lesser preselection of tested relationships; when there is greater flexibility in designs, definitions, outcomes, and analytical modes; when there are greater financial and other interests and prejudices; and when more teams in a scientific field are chasing statistical significance.
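To make the first of these factors concrete, here is a minimal sketch of the positive predictive value (PPV) that Ioannidis's paper uses for the no-bias case: PPV = (1 − β)R / (R − βR + α), where R is the pre-study odds that a tested relationship is true, 1 − β is the statistical power, and α is the significance threshold. The function name and the scenario numbers below are illustrative assumptions, not figures from the paper or from the Reproducibility Project.

```python
def ppv(prior_odds, power, alpha=0.05):
    """Probability that a statistically significant finding is true.

    Implements PPV = (1 - beta) * R / (R - beta * R + alpha) for the
    no-bias case, with power = 1 - beta and prior_odds = R.
    """
    if prior_odds < 0:
        raise ValueError("prior_odds must be non-negative")
    if not 0 <= power <= 1:
        raise ValueError("power must be between 0 and 1")
    if not 0 < alpha <= 1:
        raise ValueError("alpha must be in (0, 1]")
    true_positives = power * prior_odds   # (1 - beta) * R
    false_positives = alpha               # alpha * 1 (per unit of null relationships)
    return true_positives / (true_positives + false_positives)


if __name__ == "__main__":
    # Hypothetical scenarios chosen only to show the direction of the effect.
    scenarios = [
        ("well-powered study, 1:1 prior odds", 1.0, 0.8),
        ("underpowered study, 1:1 prior odds", 1.0, 0.2),
        ("underpowered study, 1:10 prior odds", 0.1, 0.2),
    ]
    for label, prior_odds, power in scenarios:
        print(f"{label}: PPV = {ppv(prior_odds, power):.2f}")
```

Under these assumptions, lowering the power (smaller studies) or the pre-study odds (many loosely preselected relationships) both pull the PPV down, which is the direction the paper describes.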
In a recent interview with Russ Roberts, the host of EconTalk, Brian Nosek said that reproducibility issues occur across all fields, including the life sciences and physical sciences. His team has an active project in cancer biology that uses the same methodology. People can easily accept that psychology or economics, as “soft” sciences, have reproducibility issues; after all, it is impossible to reestablish the original test conditions. But “hard” sciences? We learned in school the precise trajectory of a pendulum or of a ball rolling down a hill. It’s always the same, right?
Unlike religion, philosophy, or even the “soft” sciences, the “hard” sciences are supposed to be falsifiable and therefore fact. But it turns out that’s not the case. Even one of the most solid of all theories, Newtonian mechanics, has been proven false at the atomic level. Still, the theory has not been discarded, because it works just fine at the macroscopic scale. The classic example of a falsifiable theory, “all swans are white”, can be modified after falsification to “all swans are white, except Cygnus atratus”, and the modified theory remains useful. For most people, a theory that ‘withstands falsification’ is as good as ‘confirmed’. We have seen Newtonian mechanics at work every day since the pyramids were built, or earlier, and we think “this is solid fact that will never change”. Not many people know or care that it breaks down at the atomic level.
Science is done by people who carry their own Beliefs, some clearly displayed and some hidden. The assumptions, the range of hypotheses under consideration, and the interpretation of results are all subjective. Furthermore, humans often suffer from groupthink. Verification is always limited by constraints such as the observer’s immediate place and time, measurement accuracy, repeatability, and so on. Proper disclosure would require the presenter to list all assumptions and all competing hypotheses, along with a confidence level for each of them and for the group of as yet unknown hypotheses. For completeness, the presenter’s Set of Beliefs might also tell the audience which hypotheses were excluded from the research.
Science is never settled – if it were, all discussions would stop, together with all scientific progress.