The Problem and Promise of Preprints in the Era of Coronavirus

Preprint servers are exploding amid a tidal wave of coronavirus papers. Not all of those papers are particularly good.


When we look back on 2020, what will we call it? The year of the pandemic? The year of democracy? From a medical publishing standpoint it’s clear — 2020 is the year of the preprint.

Preprints. Medical manuscripts published for all to see, prior to peer review.

The promise of preprint servers is nothing less than the democratization of medical science. Free, open publishing so researchers and readers of research can come together and make science better. But like all good ideas, it’s about the execution.

While preprint servers like arXiv have been serving the math and physics communities for decades, the medical research world has only recently embraced bioRxiv, often for basic science papers, and the newcomer to the scene, medRxiv, for the clinical sciences. Full disclosure: medRxiv is run out of Yale and I have nothing to do with it.

And, according to this research letter in JAMA, medRxiv is taking off in 2020 thanks, of course, to COVID-19.

medRxiv only began in June of 2019. Check out the growth in submissions over the past year.

[Figure: growth in medRxiv submissions over the past year]

You don’t need to be a statistician to see the sharp uptrend starting in February of 2020. The site went from a median of 5 submissions per day pre-2020 to 51 per day this year. Fully 73% of those submissions were COVID-19 related. That is a staggering share of research for a single pathogen, and a testament to the fact that, when speed of publication matters, preprint servers really shine.

Now, medRxiv is not the only preprint server, far from it, though it has become the go-to for COVID-19 preprints. Another research letter in JAMA examines the other major preprint servers, 57 in total, and shows us that preprints still have a ways to go to live up to their promise.

The researchers identified 18 best practices for research transparency and reporting, and reviewed which pre-print servers required which policies.

[Figure: Best practices for open science]

These are policies like data sharing, ethics approval, funding declarations, that sort of thing. The median server addressed just 1 of the 18 best practices. As beacons of open science, some pre-print servers are sputtering.

And of course preprint servers have had their controversies. Many of the manuscripts, 86% on medRxiv for example, have not (yet) made it into the peer-reviewed literature. This doesn’t mean they are fatally flawed, but the absence of peer review means that there is very little opportunity for scientific quality control. Many of these papers would never have seen the light of day had it not been for a preprint server. Whether that’s a good or bad thing, well, depends on the paper.

There have been some notable retractions in the COVID-19 era, like this article, which implied that smoking might be protective in COVID-19, and this one which addressed the fraught issue of hydroxychloroquine for COVID treatment.

But peer review does more than just find shaky data or the occasional fraud. I’ve been through a lot of peer review myself, and while people often think of peer review as being about rejecting bad science or suggesting new experiments, a lot of it is about moderating language. I’ve conducted an experiment, I believe it shows X, and I write that. The peer reviewers often step in to say: be careful, you can’t be sure you’ve proved that, tone it down a bit. That’s a critical part of the process and one that doesn’t happen at all on the preprint servers.

And that’s really where we get into trouble with preprint servers. It’s not the servers, it’s what we do with them. And that includes people like me who write about medical studies. News outlets have been trawling through medRxiv to get the scoop on the latest COVID-19 science, often with only a minimal nod to its preliminary, non-peer-reviewed nature.

If I were in charge of an editorial desk, I would simply instruct my reporters not to use sources from preprint servers. The risk of misinterpretation is just too high. Peer review reduces hype. That means that the peer-reviewed literature doesn’t always make for the most exciting headlines, but it does make for better science.

A version of this commentary first appeared on

Writing about medicine, science, statistics, and the abuses thereof. Commentator at Medscape. Associate Professor of Medicine at Yale University.
