A new attempt at bolstering reproducibility in the sciences has achieved near-perfect rates of success when labs adopt a range of quality-enhancing tactics such as preregistration, large sample sizes, material sharing and self-checks.
The project, published in Nature Human Behaviour, involved five years of work at four laboratories – at the University of California campuses at Berkeley and Santa Barbara, Stanford University and the University of Virginia – that conduct social-behavioural research.
Overall, the attempts at replication by a different lab recovered, on average, 97 per cent of the size of the original findings, compared with typical replication results of about 50 per cent.
“This demonstrates that low replicability is not inevitable,” said a co-author, Brian Nosek, a professor of psychology at the University of Virginia who is also the co-founder and director of the Centre for Open Science.
The finding is especially significant, Professor Nosek said, because it came in the realm of social-behavioural sciences, where studies involve phenomena that can be more complex and difficult to define than is common in other scientific fields.
The Centre for Open Science is a decade-old effort to encourage cooperation and sharing across scientific fields. As part of that mission, it regularly sponsors tests of research reproducibility, to publicise historically low levels of outside replication of scientific findings and to help identify ways of improving the situation.
Some leading strategies in that direction include requiring researchers to preregister their work – publicly declaring in advance what question or questions a particular study is expected to answer – so that importance cannot later be attached to unexpected outcomes that appeared in the data only by chance.
Other practices seen as improving research reliability include the use of relatively large sample sizes, to reduce the chances of claims based on random fluctuations in data; attempts at self-replication of findings; and various improvements in the transparency of scientific processes, including the open sharing of all data, descriptions of methods and the actual materials used in a study.
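The logic behind large samples can be made concrete with a standard power calculation. The sketch below is only an illustration of the general principle; the effect size, significance level and target power are arbitrary example values, not figures from the Nature Human Behaviour paper.

    # Rough power calculation: how many participants are needed to detect a
    # modest effect reliably? (Effect size, alpha and power are arbitrary
    # example values, not numbers taken from the study.)
    from statsmodels.stats.power import TTestIndPower

    n_per_group = TTestIndPower().solve_power(
        effect_size=0.4,   # a modest standardised effect (Cohen's d)
        alpha=0.05,        # conventional 5 per cent significance threshold
        power=0.9,         # 90 per cent chance of detecting a real effect
    )
    print(f"Participants needed per group: {n_per_group:.0f}")  # roughly 130

Studies run with far fewer participants than such a calculation suggests are much more likely to produce findings that vanish when another lab repeats the work.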
The value of such methods is often presumed, Professor Nosek said, yet his team pursued the study on the understanding that the effectiveness of those practices – especially when they are used in combination – still needed more thorough examination.
Over five years, the four participating labs conducted their typical body of research, in such fields as psychology, marketing, advertising, political science, communication, and judgement and decision-making. Each of the four labs then submitted reports on four discoveries, made their own attempts at self-confirmation and tried to replicate the reports from each of the other three participating institutions.
Overall, Professor Nosek’s team found that 86 per cent of the outside replication efforts produced statistically significant confirmation of the studies they were trying to repeat; in some of the attempts that fell short, he said, statistically significant confirmation was not even possible because of the sample sizes used in the replication. On average, he said, the replication attempts recovered 97 per cent of the effect size reported in the original studies.
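To make the 97 per cent figure concrete: it compares the size of the effects found in the replications with the size of the effects originally reported. A minimal sketch of that kind of comparison follows, using invented numbers and a simple average of per-study ratios; the paper's own aggregation method may well differ.

    # Hypothetical comparison of replication effect sizes with the originals.
    # The effect sizes below are invented for illustration only.
    original    = [0.45, 0.30, 0.60, 0.25]   # standardised effects (e.g. Cohen's d)
    replication = [0.44, 0.28, 0.59, 0.24]   # what the replicating lab measured

    ratios = [rep / orig for rep, orig in zip(replication, original)]
    mean_relative_size = sum(ratios) / len(ratios)
    print(f"Replications recovered {mean_relative_size:.0%} of the original effect sizes")
    # prints 96% for these made-up values; the study itself reports 97 per cent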
Professor Nosek’s co-authors represented the four institutions with participating labs plus Central Connecticut State University, the University of Wisconsin at Madison, Georgetown University, Washington University in St Louis, the University of South Carolina, McGill University, the University of Gothenburg and the Phenoscience Laboratories in Berlin.