Open-science and Double-blind Peer-Review

by Martin Monperrus

Recently, double-blind peer-review has swept through my research community like a storm. Unfortunately, beyond its noble goal of reducing unfairness, double-blind peer-review may have detrimental side effects on open-science.

Open-science is a scientific movement aiming at improving the scientific process through openness, where openness mostly refers to transparency and freedom. Intuitively, transparency seems to be in opposition with blindness. While "double-blind open-science" is not a perfect oxymoron, there are still important issues to consider, which are discussed in this paper.

In this article, I argue that open-science must not suffer from double-blind peer-review. The main reason is that the timely dissemination of knowledge is paramount. In particular, I explain that the boundary of anonymization is the paper plus its appendix (possibly an online appendix). The boundary of "blindness" is not the whole world. Consequently, the reviewer is solely responsible for breaking double-blind as soon as she searches on Google, or anywhere else on the Internet.

The Risks of Double-blind Review for Open-science

Over the past years, I have witnessed two problems posed by double-blind review on open-science:

I discuss those two points in the following.

Anonymization of Open-science Data

For experimental disciplines, an open-science approach to submitting papers is to always attach the data or code that supports the claims in the paper. This has two main advantages. First, reviewers can complement their review of the paper by looking at the data, and they can qualify their assessment by checking whether the data is good enough to be used in future research. Second, if the assessment of the code/data is part of the review process, this creates a strong incentive for authors to build a good reproduction package.

It must be emphasized that double-blind review does not prevent open-science: it does not mean the absence of data or the absence of an appendix. As with paper anonymization, in a double-blind review process the open-science data or code in the online appendix must be anonymized. The authors must

Rule 1: Under double-blind review, open your data for review, as with single-blind review, but take care of anonymization.

Anonymizing an open-science appendix needs some work, but fortunately, this can be automated, see "Github anonymous" below.

Reviewer Responsibility

Science is a conversation. Ideas flow. The Internet is the most wonderful salon ever, where scientific ideas, data and code are exchanged at very high speed. Search engines are built to find them very efficiently. Consequently, it is not surprising that the authors of a paper can be identified by using search engines. There are many reasons for this: 1) the paper under review resembles previous ones by the same authors, 2) the paper under review has already been discussed during public outreach, 3) a previous version of the paper has already been put online, e.g. a working paper on a webpage or on Arxiv (this often happens when a paper is a resubmission). In short, there is a high chance that a single query on Google (or Google Scholar) will reveal the identity of the authors. This is just fine. Authors can do nothing about this.

In double-blind peer-review, the boundary of anonymization is the paper plus its online appendix, and only this. It is not the whole world. Googling any part of the paper or the online appendix can be considered a deliberate attempt to break anonymity. This means that taking care of anonymity is not only on the author side, but also on the reviewer side. Both may be responsible for breaking double-blind.

Rule 2: The reviewer is solely responsible for breaking double-blind as soon as she searches for information about the paper, whether by asking her colleagues, Google, or anywhere else on the Internet.

The authors are not responsible if it is possible to find their work, or traces of their work, or comments on their work on the Internet.

Double-blind and Preprints / Arxiv

It is often asked whether one can publish a work on Arxiv while it is under review in a double-blind process. There are different answers:

My own humble answer is YES, one can publish a work on Arxiv that is under double-blind review. Even more, one should really do so. First, because the essence of science is the dissemination of knowledge, and this is what a preprint is all about. A preprint is a good starting point for a scientific conversation.

Second, because publishing a preprint is not only about dissemination, it is also about being able to claim precedence for a discovery, an idea or an invention. Double-blind peer-reviewers are no less likely than single-blind ones to reuse or leak an idea, or to simply "be inspired" by your work. This holds whether your paper is accepted or not. You actually need a preprint backup much more if your paper is rejected...

Also, publishing on Arxiv is a way to have an early impact. It has happened to me several times that my work was cited by papers in the same conference, because it was published on Arxiv and consequently could be read and cited early.

Rule 3: Double-blind review allows you to publish on Arxiv as early as needed, and the incentives for early preprints -- dissemination, precedence -- are as clear and strong as with single-blind peer-review.

(Not to mention the case of papers that are resubmitted after one or several rejections, which should obviously be made public before the final acceptance...)

Automated Anonymization of Open-Science Github Repository

Github is now heavily used to host scientific code and data. The open-access platform Zenodo, supported by CERN, even assigns DOIs to Github releases! Now, let's come back to the idea that double-blind peer-review should not prevent an open-science appendix. What does this mean for Github open-science repositories?

It means that for a Github replication repository, accompanying a paper under double-blind review;

Imagine how tedious this is, especially under the pressure of a deadline. Consider the tension between doing good open-science and keeping precious hours for something other than anonymizing an open-science Github repository.

The good news is that this can be automated! My talented student Thomas Durieux has developed anonymous_github, which automatically anonymizes both the URL and the content of a Github repository. The anonymization of the URL is achieved through proxying the requests, and the anonymization of the content is done by replacing all occurrences of the words in a list with "XXX". The word list is provided by the authors, and typically contains the institution name, author names, logins, etc.
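To make the content anonymization concrete, here is a minimal Python sketch of the word-list replacement described above. It is not the code of anonymous_github: the function name anonymize_text, the case-insensitive matching and the longest-term-first ordering are assumptions made for illustration only.

```python
import re


def anonymize_text(text, terms, placeholder="XXX"):
    """Replace every occurrence of each term in `terms` with `placeholder`.

    Hypothetical helper for illustration; the matching rules (case-insensitive,
    longest term first) are assumptions, not the behavior of anonymous_github.
    """
    # Drop empty terms; try longer terms first so "Jane Doe" wins over "Doe".
    cleaned = sorted({t for t in terms if t}, key=lambda t: (-len(t), t))
    if not cleaned:
        return text
    # Escape each term so that characters such as "." are matched literally.
    pattern = re.compile("|".join(re.escape(t) for t in cleaned), re.IGNORECASE)
    return pattern.sub(placeholder, text)


if __name__ == "__main__":
    # Made-up sample content and word list.
    sample = "Code by Jane Doe (login: jdoe), Example University."
    print(anonymize_text(sample, ["Jane Doe", "jdoe", "Example University"]))
```

With the made-up word list above, the author name, login and institution are all masked. A word list that is too short leaks identity, while one with very common words would mask unrelated content, so the list is worth reviewing before submission.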

A public instance of anonymous_github is hosted at 4open.science:

http://anonymous.4open.science/

To use it, one simply fills in the Github repository URL and the word list on the main page (the word list can be updated afterwards).

Other Aspects of Open-science and Peer-review

For an overview of open-science, I refer to (Fecher and Friesike 2014), who survey the different facets of open-science, in particular open infrastructure, open access, and open scientific data. However, they do not discuss the relationship between open-science and peer-review.

Beyond open scientific data and early dissemination with preprints, open-science has already met peer-review in a number of ways:

Murphy interestingly discusses peer-review of data (Murphy 2016). From the perspective of double-blind peer-review, this poses the interesting problem of anonymizing research data. While "Github anonymous" is a first step in this direction, there is certainly more to be done.

References

Carmi, Ran, and Christof Koch. 2007. “Improving Peer Review with CARMA.” Learned Publishing 20 (3). Wiley Online Library: 173–76.

Fecher, Benedikt, and Sascha Friesike. 2014. “Open Science: One Term, Five Schools of Thought.” In Opening Science, 17–47. Springer.

Murphy, Fiona. 2016. “An Update on Peer Review and Research Data.” Learned Publishing 29 (1). Wiley Online Library: 51–53.
