Are we destroying research by evaluating it?
This morning, I read a fascinating paper, Evaluations: Hidden Costs, Questionable Benefits, and Superior Alternatives by Bruno S. Frey and Margit Osterloh (October 2006). The paper is concerned with the undesirable effects of the focus on bibliometric indicators (“publish or perish”). In many contexts, it is very difficult for researchers to land a job, to keep said job, or to secure the necessary funding unless they publish regularly in prestigious venues. Intuitively, these measures should ensure that research is of higher quality. Is that really so?
Their main point is that such (rigid) evaluations distort incentives for researchers.
The measurement exerts not only pressure to produce predictable but unexciting research outcomes that can be published quickly. More importantly, path-breaking contributions are exactly those at variance with accepted criteria. Indeed innovative research creates novel criteria which before were unknown or disregarded. The referee process, by necessity based on the opinions of average peers, finds it difficult to appreciate creative and unorthodox contributions.
They argue that we see a homogenization of research endeavors. All laboratories and departments end up looking the same. Fads are followed religiously.
They argue that this disconnects researchers from the real world:
Research departments give no credit to faculties who write books and magazine articles designed to intermediate between the research community and the general public because they don’t contribute to the citation record. As a consequence, the gap between rigor and relevance of research is deepened and the dialogue between science and practice is undermined.
I often complain about this very fact on this blog. But I especially like this bit:
The tendency to measure research performance by the size of grants received creates an incentive to undertake more expensive, rather than relevant research.
I find the paper really fascinating. They go on to say that researchers who act as reviewers have an incentive to rate competitors or potential competitors poorly. If everyone got to review in an open way, I guess this incentive would be small, but in the current setup, where a few people get to kill or allocate most grants and paper acceptances, there is a real worry that, without ever realizing it, they may seriously hurt potential competitors only to protect their own interests. Ah! But the people getting reviewed can play games as well. For example, one can always create new metrics against which one performs well. That’s ok, but then, it can get uglier:
Authors raise their number of publications by dividing their research results to a “least publishable unit”, slicing them up as thin as salami and submitting them to different journals. Authors may also offer to include another scholar among the authors in exchange for being put as co-authors on his or her paper. Time and energy is wasted by trying to influence editors by courting them e.g. by unnecessarily citing them. More serious are manipulations of data and results.
There is the nice concept of academic prostitution (!!!), which refers to the way authors are inclined to change their papers to favor the reviewers, by citing them, for example. This is exemplified by this quote:
According to the study by Simkin and Roychowdhury (2003) on average only twenty percent of cited papers were ever read by the citing authors.
They explain that the system is self-sustaining: anyone who questions it is suspected of not meeting the required quality standards. The system locks people in.
What do they propose as an alternative? In short: choose the right people, coach them adequately, and then leave them alone. They also correctly point out that the Web makes it less important to cluster good researchers together. I really like this concluding remark:
The characteristic of a selection system is that once a decision has been made the principals put faith in the persons selected. Important positions in society (such as top judges and presidents of Central Banks) are elected either for life or for a very long time period without formal evaluations for good reasons. It is questionable why these reasons should not apply to research.
Update. Peter Turney sent me this pointer to Reviewing the Reviewers by Kenneth Church. In it, Church argues that by selecting fewer and fewer researchers and papers, we are discarding many interesting papers because they are not conventional enough.
Update 2. My own point of view is that we should mimic the Atmospheric Chemistry and Physics journal and move to open peer review (as described in this Nature article). I am not quite certain how this would work with grant and job reviews, but I think we must move toward more modern systems. I think we overestimate people’s fear of transparent review systems. After all, if I am ever tried for a crime, I expect to face the jury.