Daniel Lemire's blog

How well does peer review work?

Since the Second World War, science has relied on what I call traditional peer review. In this form of peer review, researchers send their manuscript to a journal. An editor reviews the manuscript and, if it is judged suitable, sends it to reviewers who must each make a recommendation. If the recommendations are favorable, the editor might ask the authors to revise their work. The revised manuscript can then be sent back to the reviewers. Typically, the authors do not know who the reviewers are.

In computer science and engineering, we also rely on peer-reviewed conferences: they work the same way, except that the peer review process is much shorter and typically involves only one round. That is, the manuscript is either accepted as is or rejected definitively.

Governments award research grants through the same peer review process. However, in this case, it is a research proposal that is reviewed, and there is typically only one round of review. You can, in theory, appeal the decisions, but by the time you do, the funds have been allocated and the review committees may have been disbanded.

Researchers trust peer review. There is a widely held belief that the process can reliably select the best manuscripts, and that it can detect and reject nonsense.

However, most researchers have never looked at the evidence.

So how well does peer review fare? We should first stress that a large fraction (say 94%) of all rejected manuscripts are simply submitted to another journal and accepted there. The editor-in-chief of a major computer science journal once told me: you know Daniel, all papers are eventually accepted, don't forget that. That is, even if you find that some work is flawed, you can only sink it temporarily. You cannot denounce the work publicly in the current system.

Another way to assess the reliability of peer review is to look at inter-reviewer agreement. As soon as one reviewer feels that a manuscript is acceptable, the inter-reviewer agreement falls to between 44% and 66% (Cicchetti, 1991). That is, consensus at the frontiers of science is as elusive as in other forms of human judgment (Wessely, 1998). Who reviews you is often the determining factor: the degree of disagreement within the population of eligible reviewers is such that whether or not a proposal is funded depends, in a large proportion of cases, upon which reviewers happen to be selected for it (Cole et al., 1981).