That’s the default described in the first paragraph. I think it would be better if the reviewers had skin in the game…
Gabriel says:
I will try to summarise your argument against double-blind:
There is no proven evidence of its benefits
Telling people they have to hide who they are is not a positive message
Double-blind renders open scholarship difficult (e.g. arXiv)
Even if the reviewers are not biased, the readers can be
Double-blind seems to lead to harsher reviews and higher rejection rates
I lived in and conducted research in South America. This made me keenly aware of the privilege that we enjoy in North America, Europe and East Asia in terms of publishing and attending conferences. I remember that when I was there, the condescension that I would get from reviewers was really bad. This led me to think that, as bad as double-blind is, it is certainly an improvement over single-blind.
What I do now is kindly ask a colleague to remove the names of the authors, their affiliations, acknowledgments of funding, code repositories or any other information that could lead to the identity of the authors (a rough automated sketch of this scrubbing appears after this comment). That way, I have no idea if the paper comes from a high-income country or a well-established lab. Regarding your arguments:
While I agree there’s no evidence, if the reviewers make a bona fide effort not to determine who the authors are, I think it will lead to a fairer reviewing process for authors from low-income countries or smaller labs.
I wish we lived in a world where Slovenia had the same scientific reputation as Switzerland and the same paper from either country had the same chance of being published, but this is not going to happen tomorrow. We have to acknowledge that biases do exist and will keep existing for a while, and double-blind is one way to address them.
That is true, open scholarship becomes difficult. But that is why I refuse to review any paper for which I know the identity of the authors because I saw the article on arXiv. That said, not everyone might do this. So yes, that is a problem.
That is also true, readers will be biased as well, but there is absolutely no reviewing system where that will not happen. But at least double-blind lessens the biases inherent to the reviewing process.
Double-blind has led to harsher reviews. Perhaps, but the nuance is that they are harsh on everyone.
While I will concede that double-blind is not a panacea, I think it is a step above single-blind. Another possibility is to have the reviews fully open, public and signed, and to have a publication model like PLoS ONE. I would probably be more in favor of the latter, but I would need to think about this. But I would still pick double-blind over single-blind.
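For concreteness, here is a minimal sketch of the scrubbing step Gabriel describes above, assuming a LaTeX manuscript; the file names and patterns are hypothetical, and this is only a first automated pass, so a colleague would still need to check the result by hand, exactly as he suggests.

```python
# Minimal sketch only: redact obvious identity markers from a LaTeX manuscript
# before review. File names are hypothetical; a human check is still required.
import re

def anonymize_tex(source: str) -> str:
    # Replace the author block with a placeholder and drop \thanks{} notes.
    source = re.sub(r"\\author\{.*?\}", r"\\author{Anonymous}", source, flags=re.DOTALL)
    source = re.sub(r"\\thanks\{.*?\}", "", source, flags=re.DOTALL)
    # Remove an acknowledgments (funding) section up to the next section-level command.
    source = re.sub(
        r"\\section\*?\{Acknowledg.*?\}.*?(?=\\section|\\bibliography|\\end\{document\})",
        "",
        source,
        flags=re.DOTALL,
    )
    # Blank out URLs that could point to a personal page or code repository.
    source = re.sub(r"\\url\{.*?\}", r"\\url{redacted for review}", source)
    return source

if __name__ == "__main__":
    with open("paper.tex") as f:            # hypothetical input file
        redacted = anonymize_tex(f.read())
    with open("paper_anonymous.tex", "w") as f:
        f.write(redacted)
```

Simple patterns like these will miss writing style, self-citations and dataset names, which is why the manual pass by a colleague remains the important step.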
Thank you Gabriel for your excellent comment.
While I agree there’s no evidence, if the reviewers make a bona fide effort not to determine who the authors are, I think it will lead to a fairer reviewing process for authors from low-income countries or smaller labs.
That is a falsifiable hypothesis. Remember that Blank and others have reported that outsiders actually have a lower acceptance rate under a double-blind review process.
But I submit to you that even if it is true that people from Slovenia do better under the new system (and we do not know whether it is), it does not follow that we should adopt double-blind peer review, because double-blind has negative consequences of its own.
Let me summarize some other counterpoints that you do not address.
Why would you care about the biases at the acceptance stage, but not about the biases after the acceptance? If done right, acceptance should only have to do with whether the science is correct. If venue A makes a mistake, you have venue B and venue C, and so forth. Acceptance is a minor issue, unless you disregard the San Francisco declaration and accept that we should assess researchers by the prestige of their venues as opposed to their work itself.
Now, if you are consistent with your belief that people from Slovenia are suffering large prejudices (and they may, to be clear), then you should be very concerned about what happens to them after the double-blind peer review has completed.
So I submit to you that you should demand that the published work itself be either anonymized, or that, at least, we hide from view the affiliation of the authors.
It makes no sense to me to argue that people from Slovenia are being disregarded and then to turn around and broadcast everywhere the affiliation of the authors of the accepted paper.
That is, it makes no sense unless we consider that acceptance is the end game, the prize to win. I strongly object to this view. Regarding my own work, I do not much care about getting it accepted, I care about whether it is good. Getting my work accepted, if it is good, is not a big challenge. There are hundreds of venues. Of course, getting it accepted at a highly prestigious venue could be a problem… but then, again, I turn back to the San Francisco declaration: do we assess people based on the prestige of the venues? Many of us think we should not. It turns research into a numbers game. I reject that.
We have to acknowledge that biases do exist and will keep existing for a while, and double-blind is one way to address them.
Biases exist. This is a scientific fact. How strong they are, in which direction they go and what the trends are… that is a different story. I submit to you that prejudices against women, black authors, Chinese authors… are far less significant today than they were 50 years ago.
That is also true, readers will be biased as well, but there is absolutely no reviewing system where that will not happen. But at least double-blind lessens the biases inherent to the reviewing process.
In some sense, double-blind peer review gives even more power to the reviewers. You say that you can count on the honor of the reviewers… that they won’t hunt down the identity of the authors. This is certainly true in general, but it also gives them plausible deniability if they want to be bad actors. Suppose I spot a paper that I recognize as coming from my friends: I can argue strongly in favor of it, something I would not be allowed to do under the previous system.
Furthermore, it may prevent people from controlling their biases. Is this paper just poorly written by a lazy grad student, or are the authors reasonable people from a non-English-speaking country?
Double-blind has led to harsher reviews. Perhaps, but the nuance is that they are harsh on everyone.
Please consider the long-term effect. Our current system chases away some personality types. If you dislike politics and harsh reviews, you are more likely to leave the field. People like myself, who are hard to discourage, are favoured under the current system, but I am not entirely sure I like that. You end up with a field dominated by people with very strong egos. In turn, this reinforces the field as a highly competitive one.
There is some irony in trying to use double-blind peer review along with the words diversity and inclusion. That is, I see no evidence that double-blind peer review does anything other than anchor the usual strongholds. In my experience, it is not, generally, the people from Slovenia clamoring for double-blind peer review. It is the people from the top research schools.
Given that people are naturally self-interested, and assuming that the hypothesis holds (people from big schools are being displaced by double-blind peer review), would you not expect some resistance from them? You may argue that they are saints that somehow want to do good… but I submit to you that double-blind peer review is not, in the least, harmful to the conventional power hierarchy. It does not displace it. People from Slovenia are not moving in.
Remember: we have reasons to believe that double-blind peer review is harmful to outsiders.
Another possibility is to have the reviews fully open, public and signed, and to have a publication model like PLoS ONE. I would probably be more in favor of the latter, but I would need to think about this. But I would still pick double-blind over single-blind.
I strongly favor a PLoS ONE model. I submit to you that it is the real threat to the established power hierarchy, not double-blind peer review.
Let us put all good research on a level playing field.
Gabriel says:
I do not think that you and I disagree by much; what I am arguing is that double-blind is a step above the ubiquitous single-blind. Both, however, fail to take care of:
1) biases post-acceptance in citations, exposure (how many tweets, articles, blog posts about it), etc.
2) vitriolic reviews.
1) is problematic under any system. There is no fixing that unless you have anonymous papers, which would make claiming authorship a quagmire.
Whether 2) would, under double-blind, lead to a significantly higher share of narcissists in academia remains to be demonstrated. Double-blind is not impervious to nefarious reviewers who hunt down the identity of the authors.
I really like the arXiv system. However, this system merely invites comments and people do not have an incentive to review. What would be interesting is to have a reward for reviewers on arXiv, which would lead to many versions of the article à la F1000. I am not arguing double-blind vs. fully open. I favor a fully open model. I am arguing for double-blind against single-blind.
Thanks for the clarification. I think you cannot have both double-blind and open. Open is far more challenging to the existing power structure.
That I agree with! Plus fully open would enable us to weed out problematic reviewers.
Why would you care about the biases at the acceptance stage, but not about the biases after the acceptance?
We should obviously care about both – but why should “A is bad” imply “don’t even try to fix B”? Also, I don’t think these two are independent of each other. A bias is often grounded in what you are used to seeing. You never see papers by women from South America? You will not approach such a paper in the same way when reviewing it, compared to a paper with five male authors from North America. So – to counter this bias (in reviewers and in readers), we should try to have equal representation publication-wise. But for that, we first need to fix the reviewing bias.
Acceptance is a minor issue, unless you disregard the San Francisco declaration and accept that we should assess researchers by the prestige of their venues as opposed to their work itself.
I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitely a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors.
Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.
If done right, acceptance should only have to do with whether the science is correct.
This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.
Thank you for your great comment, Lukas.
We should obviously care about both – but why should “A is bad” imply “don’t even try to fix B”?
The same people who, rightly, observe that there are flaws in the review process never talk about the other flaws, which are arguably much more significant. That is, it matters much more how readers react than how a given publication reacts. So why are we concerned solely about the former, minor problem? Implicitly, it is because we want peer review to pick winners. That’s what I am calling out.
I am saying that we have the wrong focus.
Suppose you have a mole on your arm and you are going blind. You go to the doctor and he totally ignores the fact that you are going blind. Wouldn’t you be curious as to why he focuses on the secondary problem instead of the primary one?
Evidently, we think that paper acceptance is what matters.
This goes against my deeply held values.
I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitely a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors. (…) Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.
You have to distinguish between what people think is true and what is actually true. To have impact as a scientist, I do not think you need to be from a prestigious institution, to publish in highly selective venues, or to publish many papers.
I understand that it is a commonly held belief, but it does not make it true.
This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.
I do not think it is weird. If you stop picking winners, then you do not have to worry so much about biases.
If instead of tasking the reviewers with picking this year’s top 10 papers, you just ask them “is this good science”… then you do not have to worry so much about biases. I am not dismissing the existence of unwanted prejudice, but I am saying that you can lower drastically the effect.
Very interesting discussion! Before reading this blog post, I strongly supported double-blind reviewing. Now I’m not so sure.
I recently (2019) published a paper in PLOS ONE, and it was a pleasant experience. The reviewers gave helpful comments and the time from submission to publication was short.
The PLOS ONE publication policy is:
https://journals.plos.org/plosone/static/publish
“We evaluate research on scientific validity, strong methodology, and high ethical standards—not perceived significance.”
This makes a lot of sense to me. “Perceived significance” is extremely subjective. I don’t think anybody can reliably predict the future impact (significance) of a paper. This is not something we should be asking reviewers to do. Reviewers tend to err on the side of conservatism, so truly novel work will tend to be rejected.
If we drop “perceived significance” as a requirement for acceptance, suddenly many issues go away. Reviewers are forced to focus on (objective) correctness instead of (subjective) significance. There is no reason for secrecy. Why should a reviewer want to be anonymous when they are merely making a statement of fact (not a judgment of significance) about an error in a paper? Why should an author want to be anonymous when they know their paper will be accepted, after all factual errors are corrected?
Do I, as a reader, want a reviewer to tell me whether a paper is significant, in their opinion? No. I can make that decision on my own.
Do I, as a reader, want a reviewer to point out an error in a paper? Yes.
Thank you Peter. That is the point I am trying to communicate, though your comment might be doing a better job than my little essay.
Daniel, if my comment was useful, it is only because your blog post changed my mind and made me look at things differently.
As a devil’s advocate — let me cite our post on the topic:
Functions of conferences.
For the author, conferences provide:
Knowledge dissemination: “I want people to know about the new knowledge I discovered.” The conference promises a certain minimum level of attention one’s work gets. In other words, if you don’t publish at top conferences, nobody would read your work.
Feedback: I would like people to check my results through the review process and discussion.
Formal goodies: checking boxes, required to defend Ph.D., for tenure package, performance review, to put into grant report, etc.
Certification. Publishing at CVPR is hard, therefore valuable.
Reputation-building: Listing certain conferences on C.V. as a way of building one’s name as a scientist.
Networking: meeting with peers, potential employers, etc.
Out of the six functions, only 2.5 (dissemination, feedback, and part of certification) are related to science as a knowledge-mining process, or the first definition of science. We will get back to it shortly; now, the functions for the audience:
For the audience:
Prefiltering: time is limited, so we outsource the selection of what we are reading to the reviewers.
Certification: time is limited, so we outsource the quality control and result check to the reviewers. We create a basic classifier: “If the paper is published by a top-conference, it is true.”
Special case of certification: for people outside the field without the basic qualifications to select work that meets basic quality guarantees.
Authors promise to answer our questions (symmetrical to “attention for the author from audience,” and audience gets the guarantee that questions about the work will be answered at the talk or poster session).
Partly “certification” serves science as a knowledge mining process, reducing a barrier to build on top of others’ work. The rest of the functions serve the science-as-implemented current model of professional scientific work and help the community to cope with resources (time, money, attention) scarcity.
END OF QUOTE.
The problems with reviews arise because of those “Prefiltering” and “Formal goodies” functions of the conference. On the one hand, they are there for a reason. On the other hand — they are not related to “science as a science”, but to the “business of science”.
The current community “solution” to the problem is arXiv, as it allows the reader to do the pre-filtering or no filtering, or whatever they want.
https://amytabb.com/ts/2020_08_21/
Hi,
Thank you for the excellent blogpost! It is bizarre how people are defending the holy double-blind review — and even argue for banning arXiv before submission.
We have recently touched this topic in the position paper “ArXiving Before Submission Helps Everyone” and would reference your blog in the next version of the paper.
Best, Dmytro
Have you considered the evidence mentioned in this blog post?
https://www.cs.cmu.edu/~clegoues/double-blind.html
I debated Claire so yes, I know her points. I agree with her facts, but I come to a different conclusion. Note that she conceded that the evidence regarding the benefits to women was mixed, something that her blog post may not reflect.
There is also a competing move toward more openness where everyone’s identity is disclosed.
This seems to hint at a common misconception about open peer reviewing, which is repeated in your comment: “I think you cannot have both double-blind and open.”
Open peer reviewing in fact means that peer reviews are posted online for everyone to see, and everyone (official reviewers, authors, readers) can intervene in the discussion about a submission. But it is actually not incompatible with double-blind peer reviewing: the official reviewers can be pseudonymous, and the identity of paper authors hidden from them, even if the discussion is otherwise held in the open. For instance, the OpenReview.net platform does open peer reviewing, but it features venues that are completely public, single-blind (the identity of the official reviewers remains hidden if they wish), or even double-blind (single-blind plus the identity of paper authors is hidden from official reviewers during the reviewing process).
Clearly, we believe that we can effectively combat undesirable prejudices in hiring since most employers do not hire based on a double-blind process.
It does not seem clear to me at all, to put it mildly, that undesirable prejudices are effectively avoided when hiring nowadays. And searching on the web, there seems to be a trend towards blinding resumes, for instance I found https://medium.com/@sprintcv/anonymize-cv-how-to-do-it-efficiently-using-sprintcv-1cc03d94a23b or https://www.beapplied.com/post/how-to-anonymise-cvs. See also https://en.wikipedia.org/wiki/Blind_audition for an example in the musical domain. As less ambitious measures, resumes are often required not to include unnecessary info about the candidate, e.g., not have a photo, or sometimes not indicate the first name to avoid gender biases, which I hope we can all agree is a good idea (though it does not prevent introducing bias in favor of underrepresented minorities independently from resume evaluation).
The main argument against ubiquitous resume blinding seems to be that in many areas it is challenging to do it while making it possible to evaluate the applicant. By contrast, in academia, the identity of paper authors is trivial to hide from submissions and completely irrelevant to the merits of the paper.
Firstly, the evidence for the benefits of double-blind peer reviews is a set of anecdotes
The biases of a single-blind conference in terms of author fame, nationalities, gender (from first names), etc., have in fact been quantified during reviewing for the WSDM’17 conference. Here is the study: https://www.pnas.org/content/early/2017/11/13/1707323114.full.
This does not mean that double-blind reviewing eliminates these biases (it only shows biases at a single-blind conference), but clearly indicates that showing author information to reviewers is at least dangerous.
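To make “quantified” concrete, here is an illustrative sketch with made-up counts (not the WSDM’17 study’s actual data or methodology) of how one could test whether acceptance rates under single-blind review differ between papers with famous authors and the rest:

```python
# Illustrative sketch only: compare acceptance rates for papers with "famous"
# authors vs. the rest under single-blind review, with a chi-squared test.
from scipy.stats import chi2_contingency

#         accepted  rejected   (hypothetical counts, not real data)
famous = [30, 70]     # papers with at least one well-known author
others = [45, 255]    # everything else

chi2, p, dof, expected = chi2_contingency([famous, others])
print(f"acceptance rate, famous: {famous[0] / sum(famous):.2f}")
print(f"acceptance rate, others: {others[0] / sum(others):.2f}")
print(f"chi2 = {chi2:.2f}, p = {p:.4f}")  # a small p suggests the gap is unlikely to be chance
```

A significant difference in such a test would be consistent with reviewer bias, though of course not proof of it on its own.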
Telling someone from a poorly known organization, from a poor or non-English country or from non-dominant gender identity that they need to hide who they are to be treated fairly is not entirely a positive message
The simple way to apply this is to require double-blind for all submissions at a given venue (not have optional double-blinding); and as far as I know this is in fact how many venues work.
I certainly want to live in a world where a woman can publish her work as a woman.
I certainly do too, but just because double-blind reviewing doesn’t solve the deeper problem doesn’t mean that it isn’t a valuable interim solution.
I practice what I call open scholarship. Obviously, it means I cannot reasonably take part in double-blind venues.
This may refute strong versions of double-blind reviewing where authors are required to ensure that reviewers cannot unblind them even if they try (e.g., there should be no arXiv preprint, etc.); like in Dmytro Mishkin’s comment. I agree that these implementations are impractical and an obstacle to open scholarship (and I believe that they are undesirable).
However, open scholarship is not an obstacle at all against lighter double-blind reviewing where authors are just required to omit author information and anonymize self-citations in the submitted article. Reviewers are expected not to try to unblind them (but it’s OK if they accidentally do, e.g., if they remember the work from an earlier preprint). This does not avoid all biases, but it is already very helpful as it works most of the time. This light double-blind reviewing is in fact very common, it is what is done at STACS’21 among many other venues.
Yet, at best, double-blind peer review might help with getting papers accepted, but it does nothing for post-publication assessment.
Just because double-blind reviewing doesn’t solve all problems, doesn’t mean it isn’t the right thing to do to solve part of the problem, right?
Blank reported that authors from outside academia have a lower acceptance rate under double-blind peer review presumably because reviewers, when they can, tend to give a chance to outsiders despite the fact that outsiders do not conform to the field’s orthodoxy as well as insiders may.
I wasn’t aware of this study, and I agree it may be a valid argument. That said, relying on a biased system just to profit from some of the favorable biases doesn’t seem ideal. If there is a goal to judge outsider papers more favorably, having deliberate efforts in this direction (special tracks, quotas, or adding this information as input at a later reviewing stage) would seem like a better idea.
Moreover, Blank indicates that double-blind peer review is overall harsher. This “harsh” nature has been replicated and quantified. Double-blind peer review manuscripts are less likely to be successful than single-blind peer review manuscripts.
The solution is simple: impose double-blind reviewing to all submissions at a venue to ensure that they are treated equally.
Having harsher reviews and lower acceptance rates may not be a positive.
I agree that there is a huge problem of the reviewing culture being harsh and unwelcoming, especially to newcomers, and especially to people from underrepresented groups who do not feel legitimate in academia. It is urgent to fix this problem, but it has nothing to do with double-blind reviewing; even if double-blind reviewing removes the mitigating factor where reviewers will be kinder with people that they know.
The introduction of double-blind peer review is partly justified by the mission we give the reviewers: select only the very best work.
This is far from being the only justification. Fighting bias is a much better justification for double-blind reviewing. Even in a system which wants to publish all minimally interesting papers, biases can always mean, e.g., that famous authors will always get a free pass because reviewers trust them, or outsider authors will attract more scrutiny.
[points about having more diverse PCs and about the stupidity of low acceptance rates]
I completely agree. On the second point, see also: https://games-automata-play.github.io/blog/confVSjournal/.
double-blind peer review is itself a rather crude and pessimistic solution that has several undesirable consequences. We can do better.
The way I see it, light double-blind reviewing across an entire venue is a very simple solution, which we can expect to help avoid measurable biases, and is trivial to implement. Of course, it doesn’t solve all of academia’s numerous other problems, but that’s not a valid argument against it, I believe. Also, there are many people who mean something different and less convenient when they say “double-blind reviewing”, so indeed one has to be careful to distinguish the different possible implementations.
For the kind of double-blind reviewing I’m defending, the only argument that I personally found convincing in your post is the one about outsiders faring less well. But let’s turn the system around: if the standard were for all venues to practice double-blind reviewing, and we wanted to bias the system towards accepting more papers by outsiders (which may indeed be a very reasonable kind of bias to introduce), would the right solution be to completely disclose the full author names and affiliations to reviewers on papers from the very beginning, and hope that they’d implicitly factor it in, precisely how we’d like? This wouldn’t be what we’d do, right?
I understand the argument that, in fully open scholarship, double-blind peer reviewing may be impractical and no longer desirable. But even if we’re very optimistic about academic practices evolving, the process of reviewing papers for “acceptance” in some sense is still going to stay for a long time I believe: even with open reviews à la OpenReview.net, even with epijournals, even with more welcoming reviews and more reasonable acceptance rates, etc. Even if we finally move to a fully open system with open platforms having completely overthrown traditional conferences and journals, you’ll probably want to keep a system to have people vouch for the correctness and interest of papers, or to give awards to the very best papers. And for such systems, which is what reviewing does nowadays (admittedly in an imperfect, harsh, and excessively selective fashion), it probably makes sense to hide the identity of reviewers and of authors — to avoid bias and because it’s completely irrelevant information that’s really not complicated to remove.
Thanks for the comment.
In the context of my post, open stands in opposition to both double-blind and single-blind.
I do not view double-blind as a partial fix. I view it as a way to justify the continued existence of an elitist system. A conference like NeurIPS has a double-blind peer review system. The bulk of the papers are from a small set of elite institutions. Even if you go toward the very end of the list, to the institutions with very few papers, you are still in elite territory. You will not find anyone from Senegal or from a small college. The board is made up almost entirely of people from elite institutions. But they are very inclusive, aren’t they, because they use double-blind peer review? No, they are not inclusive. They are elitist.
I believe you will find that biases under a system like PLoS One are much less of a concern. Once you lower the stakes for peer review, you have fewer concerns about biases.
The WSDM 2017 study is one data point and not representative of the literature. We still agree that homophily and prestige biases are very real: the evidence is overwhelming, but it does not follow that blinding will make things better.
Let me make this comment more precise. I do not mean that the people publishing at these venues are elitist. I mean that the venue itself is elitist. That is, it is an elitist institution. The people in it may not be.
My point is that it is very hard to break in if you are not already an insider (part of an associated elite institution).
(I chose NeurIPS deliberately because I have no relation with it. I should disclose that I have published papers at similar venues.)
Thanks for your answers. I completely agree that venues like NeurIPS and others are elitist in the way you describe. That said, I’m afraid they will remain a defining part of our work and of research for the years to come, with researchers like you and me perpetuating this system by submitting our work there, reviewing there, organizing them, etc., and being evaluated based on this.
Knowing this, and while campaigning to fix the broader problem that this system is elitist and generally broken, I do think it makes sense to wonder what’s the best way to make the system work better (or probably better) with trivial adjustments. I doubt people seriously believe that NeurIPS and others are not elitist at all just because they are double-blind — and if some people think that, this is the problem, not double-blind reviewing itself.
Choosing single-blind or double-blind is one such simple adjustment. And it doesn’t seem plausible to me that reverting to single-blind reviewing and adding back unfiltered author information to NeurIPS reviewing would give a better outcome, as opposed to other more deliberate solutions like quotas or using author information after scientific reviewing has been done. And of course single-blind venues comparable to NeurIPS are also elitist.
The title of your post is “Double-blind peer review is a bad idea”. I’d agree that it’s not a perfect idea, or not going far enough, or maybe not so useful, or not addressing the right problem. But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now, or why existing double-blind venues would benefit from reverting to single-blind.
But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now
The mere fact that we recognize a problem, and that there is some action related to the problem, does not imply that we must proceed with that action. Our tendency to do so relies on a fallacy known as the politician’s syllogism.
I believe you will find that biases under a system like PLoS One are much less of a concern. Once you lower the stakes for peer review, you have fewer concerns about biases.
This comment made me think about the publication fees, which can be a barrier for some authors. I looked on the PLoS One site, and they have addressed this issue very thoroughly:
It needn’t be the authors who pay to maintain good journals. Some open-access journals are free to both authors and readers, with the hosting costs paid by universities, public research institutes, or sponsors. LMCS in my field is one such example, and there are of course other free services for related things, like arXiv. The hosting cost of a platform running a Web app to manage reviewing plus serve some static PDFs is simply not that high.
I don’t think the author-pays model, with APCs over $1000 like PLOS does, is the right model, even when trying to plug the gaps with exemptions for underrepresented countries. (Compare this to LIPIcs’s 60 EUR per paper fee.)
But indeed this criticism about publication costs also applies to essentially all pre-COVID conferences.
It is really an admirable initiative. It seems that the whole PLoS institution was well thought out.
Dmytro Mishkin says:
I think that 1k USD is better than Nature’s 9.5k USD, but still quite a lot for open access.
For example, JMLR is free and was created as a free alternative to paid journals.
On the other hand, it has quite a low acceptance rate, so it cannot be directly compared to PLoS One.
Harry P. Kniss says:
This is one of the worst-argued essays I’ve ever read.
The bias within peer review tends to be on the basis of institutional prestige and fame – i.e. a paper from DeepMind is probably better than a paper from a school in Nova Scotia. I think that it would be very rare for a reviewer to find the name of an author, search around to figure out their race/gender identity, and then have such indiscriminate rage at a stranger that they’d try to get their paper rejected. Some fraction of the time, a person’s name tells you their ethnicity, so maybe some dumb Hong Kong supremacist protestor would try to get a Chinese person’s paper rejected. However, most of these dumb Hong Kong protestors are too focused on terrorism for research.
The institutional bias is a serious issue, because a lot of junior reviewers will be biased by seeing a famous name.
What’s your take on single-blind reviews where reviewers are anonymous?
That’s the default described in the first paragraph. I think it would be better if the reviewers had skin in the game…
I will try to summarise your argument against double-blind:
there is no proven evidence
Telling people they have to hide who they are is not a positive message
double-blind renders open scholarship difficult (e.g. arXiv)
If the reviewers are not biased the readers can be
double-blind seems to lead to harsher reviews and higher rejection rates
I lived in conducted research in South America. This is made me keenly aware of the privilege that we enjoy in North America, Europe and East Asia in terms of publishing and attending conferences. I remember that when I was there, the condescension that I would get from reviewers would be really bad. This is led me to think that as bad double-blind is, it is certainly an improvement over single-blind.
What I do now is: kindly ask a colleague to remove the names of the authors, their affiliations, acknowledgment of funding, code repository or any kind of information that could lead to the identity of the authors. The way I have no idea if the paper comes from a high-income country or a well-established lab. Regarding your arguments:
While I agree there’s no evidence, if the reviewers make a bona fide effort to not determine who the authors are, I think it will lead to a fairer reviewing process from low-income countries or smaller labs.
I wish we lived in a world where Slovenia had the same scientific reputation as Switzerland and the same paper from either country had the same chance of being published, this is not going to happen tomorrow. We have to acknowledge that biases do exist and will keep existing for a while and double-blind is one way to go about it.
That is true, open scholarship is difficult. But that is why I refuse to review any paper for which I know the identity of the authors because I saw the article on arXiv. That said not everyone might do this. So yes that is a problem.
That is also true, readers will be biased as well but there is absolutely no reviewing system where that will not happen. But please double-blind lessens the biases inherent to the reviewing process.
Double-blind has led to harsher reviews. Perhaps but the nuance is that they are harsh on everyone.
While I will concede that double-blind is not a panacea, I think it is a step above single-blind. Another possibility is to have the reviews fully open, public and signed and have a publication model like PLoS ONE. I would be probably more in favor of the latter, I would need to think about this. But I would still pick double-blind over single-blind.
Thank you Gabriel for your excellent comment.
That is a falsifiable hypothesis. Remember that Blank and others have reported that outsiders have actually a lower acceptance rate under a double-blind review process.
But I submit to you that even if it is true that people from Slovenia do better under the new system, and we don’t know whether it is true, it does not follow that we should adopt double-blind peer review because double-blind has negative consequences of its own.
Let me summarize some other counterpoints that you do not address.
Why would you care about the biases at the acceptance stage, but not about the biases after the acceptance? If done right, acceptance should only have to do with whether the science is correct. If venue A makes a mistake, you have venue B and venue C, and so forth. Acceptance is a minor issue, unless you disregard the San Franscico declaration and accept that we should assess researchers by the prestige of their venues as opposed to their work itself.
Now, if you are consistent with your belief that people from Slovenia are suffering large prejudices (and they may, to be clear), then you should be very concerned about what happens to them after the double-blind peer review has completed.
So I submit to you that you should demand that the published work itself be either anonymized, or that, at least, we hide from view the affiliation of the authors.
It makes no sense to me to argue that people from Slovenia are being disregarded and then to turn around and broadcast everywhere the affiliation of the authors of the accepted paper.
That is, it makes no sense unless we consider that acceptance is the end game, the prize to win. I strongly object to this view. Regarding my own work, I do not much care about getting it accepted, I care about whether it is good. Getting my work accepted, if it is good, is not a big challenge. There are hundreds of venues. Of course, getting it accepted at a highly prestigious venue could be a problem… but then, again, I turn back to the San Franscico declaration: do we assess people based on the prestige of the venues? Many of us think we should not. It turns research into a numbers game. I reject that.
Biases exist. This is a scientific fact. How strong they are and in which direction they go and what the trends are… it is a difference story. I submit to you that prejudices against women, black authors, Chinese authors… are far more significant today than they were 50 years ago.
In some sense, double-blind peer review gives even more power to the reviewers. You say that you can count on the honor of the reviewers… that they won’t hunt down the identity of the authors. This is certainly true in general, but it also gives them plausible deniability if they want to be bad actors. I spot this paper which I recognize is from my friends. I argue strongly in favor of it, something I would not be allowed to do under the previous system.
Furthermore, it may prevent people from controlling their biases. Is this paper just poorly written by a lazy grad student, or are the authors reasonable people from a non-English country?
Please consider the long term effect. Our current system chase away some personality types. If you dislike politics and harsh reviews, you are more likely to leave the field. People like myself, who are hard to discourage, are favoured under the current system, but I am not entirely sure I like that. You end up getting a field dominated by people with very strong egos. In turn, this reinforces the field as highly competitive one.
There is some irony in trying to use double-blind peer review along with the words diversity and inclusion. That is, I see no evidence that double-blind peer review is not, if anything, anchoring the usual strongholds. In my experience, it is not, generally, the people from Slovenia clamoring for double-blind peer review. It is the people from the top research schools.
Given that people are naturally self-interested, and assuming that the hypothesis holds (people from big schools are being displaced by double-blind peer review), would you not expect some resistance from them? You may argue that they are saints that somehow want to do good… but I submit to you that double-blind peer review is not, in the least, harmful to the conventional power hierarchy. It does not displace it. People from Slovenia are not moving in.
Remember: we have reasons to believe that double-blind peer review is harmful to outsiders.
I strongly favor a PLoS ONE model. I submit to you that it is the real threat to the established power hierarchy, not double-blind peer review.
Let us put all good research on a level playing field.
I do not think that you and I disagree by much, what I am arguing for is that double-blind is a step above the ubiquitous single-blind. Both however do not take care of:
1) biases post-acceptance in citations, exposure (how many tweets, articles, blog posts about it) etc.
2) Vitriolic reviews.
1) is problematic under any system. There is no fixing that unless you have anonymous papers which would be a quagmire to claim authorship.
Under double-blind, whether 2) would lead to a significantly higher increase of narcissists in academia, that remains to be demonstrated. Double-blind is not impervious to nefarious reviewers who hunt down the identity of the authors.
I really like the arXiv system. However, this system merely invites comments and people do not have an incentive to review. What would be interesting is to have a reward for reviewers for arXiv which would lead to many versions of the article à la f1000. I am arguing double-blind vs fully open. I favor a fully open model. I am arguing for double-blind against single-blind.
Thanks for the clarification. I think you cannot have both double-blind and open. Open is far more challenging to the existing power structure.
That I agree with! Plus fully open would enable us to weed out problematic reviewers.
We should obviously care about both – but why should
A is bad
implydon't even try to fix B
? Also, I don’t think these two are not independent from each other. A biased is often grounded in what you are used to see. You never see paper by women from South America? You will not approach such a paper in the same way when reviewing it, compared to a paper with five male authors from North America. So – to counter this bias (in reviewers and in readers), we should try to have equal representation publication-wise. But for that, we first need to fix the reviewing bias.I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitively a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors.
Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.
This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.
Thank you for your great comment Lukas.
We should obviously care about both – but why should A is bad imply don’t even try to fix B?
The same people who, rightly, observe that there are flaws in the review process, never talk about the other flaws which should be much more significant. That is, it is much more important how reader react than how a given publication reacts. So why are we concerned solely about the former minor problem? Implicitly, it is because we want peer review to pick winners. That’s what I am calling out.
I am saying that we have the wrong focus.
Suppose you have a mole on your arm and you are going blind. You go to the doctor and he totally ignores the fact that you are going blind. Wouldn’t you be curious as to why he focuses on the secondary problem instead of the primary one?
Evidently, we think that paper acceptance is what matters.
This goes against me deeply held values.
I don’t accept that we should assess researchers by the prestige of the venues, but I think it is unrealistic to assume that it doesn’t happen. I just completed my PhD and thought about staying in science (which I won’t), and the pressure to publish at highly regarded venues was definitively a factor that contributed to this decision. And I’m a white male at a European university. I’ve seen how professors are hired at my institution, and “where did that person publish” is definitely one of the major contributing factors. (…) Maybe we’ll be at some point where “acceptance is a minor issue” will be true for virtually all researchers, but I don’t think we’re at that point yet.
You have to distinguish between what people think is true from what is actually true. To have impact as a scientist, I do not think you need to be from a prestigious institution, to publish in highly selective venues or to publish many papers.
I understand that it is a commonly held belief, but it does not make it true.
This is a very weird sentence in a comment arguing that there should be names attached to papers under review. 😉 I fully agree, though. This also counters your argument that outsiders have a better chance in single-blind review – whether their papers are accepted or not should be decided based on the science, not on their outsider status.
I do not think it is weird. If you stop picking winners, then you do not have to worry so much about biases.
If instead of tasking the reviewers with picking this year’s top 10 papers, you just ask them “is this good science”… then you do not have to worry so much about biases. I am not dismissing the existence of unwanted prejudice, but I am saying that you can lower drastically the effect.
Very interesting discussion! Before reading this blog post, I strongly supported double-blind reviewing. Now I’m not so sure.
I recently (2019) published a paper in PLOS ONE, and it was a pleasant experience. The reviewers gave helpful comments and the time from submission to publication was short.
PLOS ONE publication policy is:
https://journals.plos.org/plosone/static/publish
“We evaluate research on scientific validity, strong methodology, and high ethical standards—not perceived significance.”
This makes a lot of sense to me. “Perceived significance” is extremely subjective. I don’t think anybody can reliably predict the future impact (significance) of a paper. This is not something we should be asking reviewers to do. Reviewers tend to err on the side of conservatism, so truly novel work will tend to be rejected.
If we drop “perceived significance” as a requirement for acceptance, suddenly many issues go away. Reviewers are forced to focus on (objective) correctness instead of (subjective) significance. There is no reason for secrecy. Why should a reviewer want to be anonymous when they are merely making a statement of fact (not a judgment of significance) about an error in a paper? Why should an author want to be anonymous when they know their paper will be accepted, after all factual errors are corrected?
Do I, as a reader, want a reviewer to tell me whether a paper is significant, in their opinion? No. I can make that decision on my own.
Do I, as a reader, want a reviewer to point out an error in a paper? Yes.
Thank you Peter. That is the point I am trying to communicate, though your comment might be doing a better job than my little essay.
As a devil’s advocate — let me cite our post on the topic :
Functions of conferences.
For the author, conferences provide:
Knowledge dissemination: “I want people to know about the new knowledge I discovered.” The conference promises a certain minimum level of attention one’s work gets. In other words, if you don’t publish at top conferences, nobody would read your work.
Feedback: I would like people to check my results through the review process and discussion.
Formal goodies: checking boxes, required to defend Ph.D., for tenure package, performance review, to put into grant report, etc.
Certification. Publishing at CVPR is hard, therefore valuable.
Reputation-building: Listing certain conferences on C.V. as a way of building one’s name as a scientist.
Networking: meeting with peers, potential employers, etc.
Out the six functions, only 2.5 (dissemination, feedback, and part of certification) are related to the science as a knowledge mining process, or the first definition of science. We will get back to it shortly, now functions for the audience:
For the audience:
Prefiltering: time is limited, so we outsource the selection of what we are reading to the reviewers.
Certification: time is limited, so we outsource the quality control and result check to the reviewers. We create a basic classifier: “If the paper is published by a top-conference, it is true.”
Special case of certification: for people outside the field without the basic qualifications to select work that meets basic quality guarantees.
Authors promise to answer our questions (symmetrical to “attention for the author from audience,” and audience gets the guarantee that questions about the work will be answered at the talk or poster session).
Partly “certification” serves science as a knowledge mining process, reducing a barrier to build on top of others’ work. The rest of the functions serve the science-as-implemented current model of professional scientific work and help the community to cope with resources (time, money, attention) scarcity.
END OF QUOTE.
The problems with reviews arise because of those “Prefiltering” and “Formal goodies” functions of the conference. On the one hand, they are there for a reason. On the other hand — they are not related to “science as a science”, but the “business of science”.
The current community “solution” to the problem is arXiv, as it allows the reader to do the pre-filtering or no filtering, or whatever they want.
https://amytabb.com/ts/2020_08_21/
Daniel, if my comment was useful, it is only because your blog post changed my mind and made me look at things differently.
Hi,
Thank you for the excellent blogpost! It is bizarre, how people are defending holy double blind review — and even argue on banning arXiv before submission.
We have recently touched this topic in position paper “ArXiving Before Submission Helps Everyone” and would reference your blog in the next version of the paper
Best, Dmytro
Have you considered the evidence mentioned in this blog post?
https://www.cs.cmu.edu/~clegoues/double-blind.html
I debated Claire so yes, I know her points. I agree with her facts, but I come to a different conclusion. Note that she conceded that the evidence regarding the benefits to women was mixed, something that her blog post may not reflect.
This seems to hint at a common misconception about open peer reviewing, which is repeated in your comment: “I think you cannot have both double-blind and open.”
Open peer reviewing in fact means that peer reviews are posted online for everyone to see, and everyone (official reviewers, authors, readers) can intervene in the discussion about a submission. But it is actually not incompatible with double-blind peer reviewing: the official reviewers can be pseudonymous, and the identity of paper authors hidden from them, even if the discussion is otherwise held in the open. For instance, the OpenReview.net platform does open peer reviewing, but it features venues that are completely public, single-blind (the identity of the official reviewers remains hidden if they wish), or even double-blind (single-blind plus the identity of paper authors is hidden from official reviewers during the reviewing process).
It does not seem clear to me at all, to put it mildly, that undesirable prejudices are effectively avoided when hiring nowadays. And searching on the web, there seems to be a trend towards blinding resumes, for instance I found https://medium.com/@sprintcv/anonymize-cv-how-to-do-it-efficiently-using-sprintcv-1cc03d94a23b or https://www.beapplied.com/post/how-to-anonymise-cvs. See also https://en.wikipedia.org/wiki/Blind_audition for an example in the musical domain. As less ambitious measures, resumes are often required not to include unnecessary info about the candidate, e.g., not have a photo, or sometimes not indicate the first name to avoid gender biases, which I hope we can all agree is a good idea (though it does not prevent introducing bias in favor of underrepresented minorities independently from resume evaluation).
The main argument against ubiquitous resume blinding seems to be that in many areas it is challenging to do it while making it possible to evaluate the applicant. By contrast, in academia, the identity of paper authors is trivial to hide from submissions and completely irrelevant to the merits of the paper.
The biases of a single-blind conference in terms of author fame, nationalities, gender (from first names), etc., have in fact been quantified during reviewing for the WSDM’17 conference. Here is the study: https://www.pnas.org/content/early/2017/11/13/1707323114.full.
This does not mean that double-blind reviewing eliminates these biases (it only shows biases at a single-blind conference), but clearly indicates that showing author information to reviewers is at least dangerous.
The simple way to apply this is to require double-blind for all submissions at a given venue (not have optional double-blinding); and as far as I know this is in fact how many venues work.
I certainly do too, but just because double-blind reviewing doesn’t solve the deeper problem doesn’t mean that it isn’t a valuable interim solution.
This may refute strong versions of double-blind reviewing where authors are required to ensure that reviewers cannot unblind them even if they try (e.g., there should be no arXiv preprint, etc.); like in Dmytro Mishkin’s comment. I agree that these implementations are impractical and an obstacle to open scholarship (and I believe that they are undesirable).
However, open scholarship is not an obstacle at all against lighter double-blind reviewing where authors are just required to omit author information and anonymize self-citations in the submitted article. Reviewers are expected not to try to unblind them (but it’s OK if they accidentally do, e.g., if they remember the work from an earlier preprint). This does not avoid all biases, but it is already very helpful as it works most of the time. This light double-blind reviewing is in fact very common, it is what is done at STACS’21 among many other venues.
Just because double-blind reviewing doesn’t solve all problems, doesn’t mean it isn’t the right thing to do to solve part of the problem, right?
I wasn’t aware of this study, and I agree it may be a valid argument. That said, relying on a biased system just to profit from some of the favorable biases doesn’t seem ideal. If there is a goal to judge outsider papers more favorably, having deliberate efforts in this direction (special tracks, quotas, or adding this information as input at a later reviewing stage) would seem like a better idea.
The solution is simple: impose double-blind reviewing to all submissions at a venue to ensure that they are treated equally.
I agree that there is a huge problem of the reviewing culture being harsh and unwelcoming, especially to newcomers, and especially to people from underrepresented groups who do not feel legitimate in academia. It is urgent to fix this problem, but it has nothing to do with double-blind reviewing; even if double-blind reviewing removes the mitigating factor where reviewers will be kinder with people that they know.
This is far from being the only justification. Fighting bias is a much better justification for double-blind reviewing. Even in a system which wants to publish all minimally interesting papers, biases can always mean, e.g., that famous authors will always get a free pass because reviewers trust them, or outsider authors will attract more scrutiny.
I completely agree. On the second point, see also: https://games-automata-play.github.io/blog/confVSjournal/.
The way I see it, light double-blind reviewing across an entire venue is a very simple solution, which we can expect to help avoid measurable biases, and is trivial to implement. Of course, it doesn’t solve all of academia’s numerous other problems, but that’s not a valid argument against it, I believe. Also, there are also many people who mean something different and less convenient when they say “double-blind reviewing”, so indeed one has to be careful to distinguish the different possible implementations.
For the kind of double-blind reviewing I’m defending, the only argument in your post that I personally found convincing is the one about outsiders faring less well. But let’s turn the system around: if the standard were for all venues to practice double-blind reviewing, and we wanted to bias the system towards accepting more papers by outsiders (which may indeed be a very reasonable kind of bias to introduce), would the right solution be to disclose the full author names and affiliations to reviewers from the very beginning, and hope that they’d implicitly factor them in precisely as we’d like? That’s not what we’d do, right?
I understand the argument that, in fully open scholarship, double-blind peer reviewing may be impractical and no longer desirable. But even if we’re very optimistic about academic practices evolving, I believe the process of reviewing papers for “acceptance” in some sense is going to stay with us for a long time: even with open reviews à la OpenReview.net, even with epijournals, even with more welcoming reviews and more reasonable acceptance rates, etc. Even if we finally move to a fully open system where open platforms have completely overthrown traditional conferences and journals, you’ll probably want to keep a mechanism for people to vouch for the correctness and interest of papers, or to give awards to the very best papers. For such mechanisms, which are what reviewing provides nowadays (admittedly in an imperfect, harsh, and excessively selective fashion), it probably makes sense to hide the identities of reviewers and of authors: it avoids bias, and author identity is irrelevant information that is really not complicated to remove.
Thanks for the comment.
In the context of my post, “open” is used in opposition to both double-blind and single-blind.
I do not view double-blind as a partial fix. I view it as a way to justify the continued existence of an elitist system. A conference like NeurIPS has a double-blind peer review system, yet the bulk of the papers come from a small set of elite institutions. Even if you go to the very end of the list, to the institutions with very few papers, you are still in elite territory. You will not find anyone from Senegal or from a small college. The board is made up almost entirely of people from elite institutions. But they are very inclusive, aren’t they, because they use double-blind peer review? No, they are not inclusive. They are elitist.
I believe you will find that biases under a system like PLoS One are much less of a concern. Once you lower the stakes of peer review, you have fewer concerns about biases.
The WSDM 2017 study is one data point and not representative of the literature. We still agree that homophily and prestige biases are very real: the evidence is overwhelming, but it does not follow that blinding will make things better.
Let me make this comment more precise. I do not mean that the people publishing at these venues are elitist. I mean that the venue itself is elitist. That is, it is an elitist institution. The people in it may not be.
My point is that it is very hard to break in if you are not already an insider (part of an associated elite institution).
(I chose NeurIPS deliberately because I have no relation with it. I should disclose that I have published papers at similar venues.)
Thanks for your answers. I completely agree that venues like NeurIPS and others are elitist in the way you describe. That said, I’m afraid they will remain a defining part of our work and of research for years to come, with researchers like you and me perpetuating this system by submitting our work there, reviewing there, organizing these venues, etc., and being evaluated based on this.
Knowing this, and while campaigning to fix the broader problem that this system is elitist and generally broken, I do think it makes sense to ask what the best way is to make the system work better (or probably better) with trivial adjustments. I doubt people seriously believe that NeurIPS and others are not elitist at all just because they are double-blind; and if some people do think that, then that belief is the problem, not double-blind reviewing itself.
Choosing single-blind or double-blind is one such simple adjustment. And it doesn’t seem plausible to me that reverting to single-blind reviewing and adding unfiltered author information back to NeurIPS reviewing would give a better outcome than other, more deliberate solutions such as quotas or using author information only after scientific reviewing has been done. And of course single-blind venues comparable to NeurIPS are also elitist.
The title of your post is “Double-blind peer review is a bad idea”. I’d agree that it’s not a perfect idea, or not going far enough, or maybe not so useful, or not addressing the right problem. But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now, or why existing double-blind venues would benefit from reverting to single-blind.
But I fail to see why existing venues shouldn’t take the trivial step of switching to it right now
The mere fact that we recognize a problem, and that there is some action related to the problem, does not imply that we must proceed with that action. Our tendency to do so relies on a fallacy known as the politician’s syllogism.
This comment made me think about the publication fees, which can be a barrier for some authors. I looked on the PLoS One site, and they have addressed this issue very thoroughly:
https://plos.org/publish/fees/
Another reason to support this model of publication.
@Peter
To be fair, since I was comparing NeurIPS to PLoS One, it seems that PLoS One is much more expensive. NeurIPS 2020 was very cheap ($100).
Isn’t that due to Covid, forcing conferences to go virtual?
Yes. The cost of attending a conference in person is typically much higher than that of publishing in a journal.
Of course, nothing beats posting the paper online, but I want people to be paid to maintain good journals.
It needn’t be the authors who pay to maintain good journals. Some open-access journals are free to both authors and readers, with the hosting costs paid by universities, public research institutes, or sponsors. LMCS in my field is one such example, and there are of course other free services for related things, like arXiv. The hosting cost of a platform running a Web app to manage reviewing plus serve some static PDFs is simply not that high.
I don’t think the author-pays model, with APCs over $1000 as PLOS charges, is the right model, even when trying to plug the gaps with fee exemptions for underrepresented countries. (Compare this to LIPIcs’s 60 EUR per-paper fee.)
But indeed this criticism about publication costs also applies to essentially all pre-COVID conferences.
Also, publication in PLOS One can be free for authors who are unable to afford the fees. See https://plos.org/publish/fees/
You are correct. Of course.
It is really an admirable initiative. It seems that the whole PLoS institution was well thought out.
I think that 1k USD is better than Nature’s 9.5k USD, but it is still quite a lot for open access.
For example, JMLR is free; it was created as a free alternative to paid journals.
https://blogs.harvard.edu/pamphlet/2012/03/06/an-efficient-journal/
On the other hand, it has quite a low acceptance rate, so it cannot be directly compared to PLoS One.
This is one of the worst-argued essays I’ve ever read.
The bias within peer review tends to be on the basis of institutional prestige and fame, i.e., a paper from DeepMind is probably better than a paper from a school in Nova Scotia. I think it would be very rare for a reviewer to find the name of an author, search around to figure out their race or gender identity, and then have such indiscriminate rage at a stranger that they’d try to get their paper rejected. Some fraction of the time, a person’s name tells you their ethnicity, so maybe some dumb Hong Kong supremacist protestor would try to get a Chinese person’s paper rejected. However, most of these dumb Hong Kong protestors are too focused on terrorism for research.
The institutional bias is a serious issue, because a lot of junior reviewers will be biased by seeing a famous name.
The institutional bias is a serious issue, because a lot of junior reviewers will be biased by seeing a famous name.
The prestige bias is absolutely real. It does not follow that the cure is double-blind peer review.