Reviewing and Revamping the Double-blind Peer Review Process
This paper was refereed by the Journal of Electronic Publishing's peer reviewers.
The traditional double-blind peer review process currently used to determine which articles are published in scientific journals is far from perfect. This article argues that the Internet can provide us with a better way to judge article quality using the opinion of every reader rather than that of only a couple of reviewers. The article offers a relatively simple business model that can provide funding for such a publishing system. The model contains three basic components: a reviewing component, a submission/cost component, and a distribution component. The reviewing component will be an electronic market through which quality feedback can be bought and sold. The submission/cost component will decide if an article should be published and distributed electronically on-line using volume forecasts by experienced forecasters who will be compensated according to the accuracy of their forecasts. Articles can be bought by anyone in the scientific or general community from the distribution component. Although not perfect, this proposal possesses many features that could be valuable to the scientific community.
About 350 years ago scientists introduced the double-blind peer review process, a process in which neither the author nor the reviewer of an article has access to each other’s identity. The goal was to avoid the inevitable bias that comes when authors and reviewers know each other.
Unfortunately, that process neither stamped out bias nor ensured quality. It has been shown that reviewers, on average, rate an article more highly when it supports their own prior research and lower when it contradicts it, so the conflict of interest is not completely eliminated (see Back 2000 and Harnad 1979). Moreover, the personal opinions of several reviewers are not a very good predictor of the value of the article to the whole scientific community. Often, articles accepted in reviewed journals are not very representative of the direction in which a certain field is going (see The Economist 1996 and Starbuck 2004). To improve the review process we need to pick reviewers whose opinions are impartial and highly representative of those held by the general community within the discipline.
Another problem with double-blind peer review is that it requires so much effort from editors and reviewers. Currently the complex task of picking and rewarding reviewers is left to the journal editors, who must themselves be impartial and highly representative of the general community. In some disciplines good reviewers are sometimes invited to become associate editors, which are prestigious positions. This reward may tempt some reviewers to provide reviews that match the editors’ expectations.
Reviewers, editors, and the scientific community as a whole face a challenging task in this regard. Scientific rigor requires time. The judgment of whether a certain theory is right or wrong is not easy. It often takes years to establish the validity of a newly proposed theory. But that does not mean a review should take years. Editors and reviewers are busy with other valuable scholarly activities including research and teaching. They are not directly compensated for the time and effort devoted to reviewing. Since there is not enough direct incentive to reply quickly, decisions and evaluations take a long time. Even an electronic journal with a median turn-around time of 55 days still has room for improvement.
Universities, too, have a stake in quality reviews, because they use publication as a measure of a researcher’s worth. Most universities judge the prestige of the journal in which an article appears as a rough measure of an article’s quality, and that prestige is usually evaluated through surveys or through the ISI impact factor. However, these measures are not necessarily accurate for any particular article because article quality can vary widely within a journal, and there can be bad articles in good journals. Moreover, journal rankings based on faculty surveys are subjective, vary through time, and can be misleading (Chua, Cao, et al. 2002). The ISI impact factor measures citations to determine the quality of an article and, indirectly, the journal (see Katerattanakul, Han, et al. 2003). But individual article citations are not perfect because they are not quality controlled. Authors might cite the work of friends and colleagues as a professional or personal favor, raising the specter of conflict of interest. Therefore currently there is still no fair way of evaluating an article even after the article is published.
Several scientists have suggested that the article publication process be purely democratic and that the final quality judgment should be left to all readers (see Nadasdy 1997, Rogers and Hurt 1990, Stodolsky 1990, and Varian 1998). According to some of these authors, double-blind peer review should be abolished completely and everyone should publish his or her article on-line without prior review or the imprimatur of a journal. The readers would then determine the quality after reading the available articles. An interesting alternative is described by Mizzaro (2003). He proposes a system in which all authors, readers, and papers receive a quality score and a steadiness score. The final value of these scores is calculated after all interested readers have read and evaluated all available articles. Using some assumptions about the computer simulations and the distribution of opinions, Mizzaro shows that the system has many desirable (and some undesirable) features. The evaluation process is more decentralized and participants who are more active get higher scores, but malicious and unexpected behaviors are nevertheless possible. While Mizzaro’s approach addresses the conflict of interest issue, it does not offer a sustainable business model for publishing and distributing articles.
We think that the Internet could radically improve the article publication and distribution process. Using the Internet, we can lift some of the burden off the editors’ shoulders. We could ask reviewers to review articles for impact (expected number of downloads) and then let all readers evaluate the articles’ quality. This should be helpful because different people might have different criteria for quality. We can find a way to pick and reward quick, accurate, and constructive reviewers in a better and more direct way. We might even be able to use the communications possibilities of the Internet to help authors write, proofread, and submit their articles. The Internet has already begun to undermine “blind” peer review anyway: using Google or something like the Social Science Research Network (SSRN, http://www.ssrn.com/), an abstract-publishing Web site, a reviewer can often find a working paper similar to the one that she is reviewing and discover who the author is.
Based on the idea that the Internet can provide cheap and easy access to a huge number of articles, we propose the creation of a digital science library. This library would be part of a system that would give every reader the opportunity to rate an article's quality. The system would also provide financial incentives for authors who produce higher quality articles, for reviewers who give constructive feedback, and for forecasters (discussed below) who submit accurate forecasts. We offer a relatively simple business model that can provide funding for such a system. Our system has three basic components: a reviewing component, a submission component, and a distribution component. These components’ functions are shown in Figure 1. Essentially our system is similar to the one proposed by Mizzaro (2003), except it can provide multiple scores for various aspects of article quality, and it uses market forces to guarantee financial sustainability. A description of the three components follows:
The reviewing component will be a forum in which authors can solicit feedback from an array of reviewers. Reviews can be purchased from any of the registered reviewers. The price of the review will be a result of the bargaining process between authors and reviewers. The reviewer’s role in this market will be to offer constructive suggestions for improving the quality of an article. Traditionally, reviewers have volunteered their time as a service to the academic community. With this system, reviewers will be directly rewarded for their efforts. For the reviewer who feels a sense of commitment, monetary compensation might not be a motivating factor; however, it should not negatively impact the evaluation process. The compensation can be viewed as a bonus or a thank-you; it can even be donated to an institution or returned to the system. Chang & Ching-Chang (2001) have already shown that it is worthwhile to compensate reviewers for their effort, and some journals do that already. Another motivating factor will be that data on individual reviewers will be kept and displayed. This data can be used to create “best reviewer” lists and can serve to enhance individual reputation and status in the community. Services for editing papers could also be offered.
An essential part of the proposed electronic publication system is its submission module. The system, along with its market mechanism, plays the role of an editor.
Authors start the process by submitting articles for evaluation. Every article submission is accompanied by a submission fee and must contain a collection of valid subject headings. A valid subject heading is one that is listed as an area of interest by at least three forecasters in their on-line profile. An author may also propose a new subject heading, which can be validated after seven forecasters add it to their profiles or express willingness to evaluate the article.
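The heading rule above can be illustrated with a minimal Python sketch. The thresholds of three and seven come from the text; the function names, the dictionary layout of the forecaster profiles, and the simplification that a new heading is validated only through profile listings (not through expressed willingness to evaluate) are assumptions made for illustration.

```python
from collections import Counter

MIN_FORECASTERS_EXISTING = 3  # from the text: listed by at least three forecasters
MIN_FORECASTERS_NEW = 7       # from the text: a new heading needs seven forecasters


def heading_counts(profiles):
    """Count how many forecaster profiles list each subject heading.

    profiles: hypothetical mapping of forecaster name -> list of headings.
    """
    counts = Counter()
    for interests in profiles.values():
        counts.update(set(interests))  # a forecaster counts once per heading
    return counts


def is_valid_heading(heading, profiles, newly_proposed=False):
    """Apply the three-forecaster rule, or the seven-forecaster rule for new headings."""
    needed = MIN_FORECASTERS_NEW if newly_proposed else MIN_FORECASTERS_EXISTING
    return heading_counts(profiles)[heading] >= needed


def submission_is_valid(headings, profiles, fee_paid):
    """A submission needs a paid fee and a non-empty set of valid headings."""
    return bool(fee_paid) and len(headings) > 0 and all(
        is_valid_heading(h, profiles) for h in headings
    )
```

Counting each forecaster at most once per heading keeps a duplicated profile entry from inflating the tally.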
The submission module depends on a special category of reviewers, whom we call “forecasters.” Forecasters make predictions about how often an article will be downloaded. These forecasters have on-line profiles that include affiliations and subject-area interests. After an article has been successfully submitted, the system compiles a list of all forecasters who are suitable to read it. Forecasters will compete for a portion of the bid. Experimental market literature suggests that having more than five participants in a market drastically reduces the likelihood of cooperative gaming (Ketcham, Smith, and Williams 1984). For this reason, the system chooses seven forecasters for each article. If more than seven members are highly suitable, members who are currently in the process of reading an article are not eligible. The system will randomly distribute the article to any seven of the remaining suitable forecasters. Forecaster suitability is determined by the number of matches between the subject areas of the article and the subject areas residing in the electronic profile of a forecaster. To encourage speed, only the first three forecasters to complete an evaluation report will be eligible for monetary compensation from the author.
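The selection procedure described above might be sketched as follows. The panel size of seven and the rule that only the first three finished forecasters are paid come from the text. Everything else is an assumption for illustration: the names, the profile layout, treating "suitable" as at least one matching subject area, and falling back to the full suitable pool when fewer than seven idle forecasters remain.

```python
import random

PANEL_SIZE = 7  # from the text: seven forecasters per article
PAID_SLOTS = 3  # from the text: only the first three finished reports are paid


def suitability(article_headings, forecaster_interests):
    """Number of subject areas shared by the article and the forecaster profile."""
    return len(set(article_headings) & set(forecaster_interests))


def choose_forecasters(article_headings, forecasters, rng=None):
    """Pick up to seven suitable forecasters at random.

    forecasters: hypothetical mapping name -> {"interests": [...], "busy": bool}.
    """
    rng = rng or random.Random()
    suitable = [
        name for name, f in forecasters.items()
        if suitability(article_headings, f["interests"]) > 0  # assumed threshold
    ]
    pool = suitable
    if len(suitable) > PANEL_SIZE:
        # Forecasters already reading another article are skipped.
        idle = [name for name in suitable if not forecasters[name]["busy"]]
        if len(idle) >= PANEL_SIZE:  # assumed fallback when too few are idle
            pool = idle
    return rng.sample(pool, min(PANEL_SIZE, len(pool)))


def paid_forecasters(completion_times):
    """Return the forecasters who finished first and are eligible for payment.

    completion_times: hypothetical mapping name -> time of report completion.
    """
    return sorted(completion_times, key=completion_times.get)[:PAID_SLOTS]
```

Passing a seeded `random.Random` instance makes a particular draw reproducible, which could help when auditing how a panel was assigned.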
Using the forecasters’ predictions, the system automatically selects articles for acceptance or rejection. The system takes into consideration the forecasts of several forecasters and estimates the expected profit from the article. Forecasters are compensated by the author of the article in proportion to the accuracy of their predictions.
If the expected profit is negative and the author still wants to publish the article, he or she has to pay to cover the publishing costs. A fixed payment of a similar sort is not completely alien; some journals already levy page charges on authors (e.g. American Economic Review).
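The article does not give formulas for the expected profit or for accuracy-based compensation, so the following Python sketch is only one possible reading. It assumes that expected profit is the mean forecast times a uniform per-download fee minus a fixed publishing cost, and that each forecaster's share of a reward pool is proportional to 1 / (1 + absolute forecast error). The function and parameter names are hypothetical.

```python
from statistics import mean


def expected_profit(forecasts, fee_per_download, publishing_cost):
    """Assumed rule: mean forecast volume times the fee, minus the publishing cost."""
    return mean(forecasts) * fee_per_download - publishing_cost


def author_payment(forecasts, fee_per_download, publishing_cost):
    """Amount the author must cover; zero when the expected profit is non-negative."""
    return max(0.0, -expected_profit(forecasts, fee_per_download, publishing_cost))


def split_reward(forecasts, actual_downloads, reward_pool):
    """Divide a reward pool so that smaller forecast errors earn larger shares.

    forecasts: hypothetical mapping forecaster name -> predicted downloads.
    The weight 1 / (1 + |error|) is an illustrative choice, not the authors' rule.
    """
    weights = {
        name: 1.0 / (1.0 + abs(predicted - actual_downloads))
        for name, predicted in forecasts.items()
    }
    total = sum(weights.values())
    return {name: reward_pool * w / total for name, w in weights.items()}
```

For example, with forecasts of 100, 200, and 300 downloads, a fee of 2.0, and a publishing cost of 300.0, the expected profit under these assumptions is 100.0 and the author pays nothing.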
An important final feature of the system is a reader-feedback mechanism similar to the one proposed by Mizzaro (2003). Any time after a reader downloads an article, she or he will be allowed to post reviews, feedback, opinions, judgments, or comments. Users will also be allowed to score the quality of the article numerically. This will provide potential readers with additional data to make a download decision. It will also allow the community to see what others think about an article. This will potentially start discussions that will evolve into intellectual debate and, ultimately, quality research much like peer commentary (Harnad 1979). It will be easy to track the usage of an article by counting article downloads (Perneger 2004; Hitchcock, Bergmark, et al. 2002). A similar attempt to track article usage has been demonstrated recently by Lawrence Brown. He uses download statistics from the Social Science Research Network to evaluate article quality (Brown 2003).
Economists view product evaluations of the kind described above as “public goods,” which are usually underprovided in a pure market setting. The distribution module can use the quality feedback from a user to suggest articles in which the user might be interested. This will provide an incentive for all users to submit quality opinions and to express their opinion truthfully.
The system we propose does not rely on consistent government funding. It might even generate surplus if the cost of electronic publishing decreases and articles exceed their forecasted download volumes. The system provides some additional benefits that are listed in Table 1. The re-engineered review process described above is directly linked to the article distribution process. Articles are available for download from a central site immediately after the market mechanism accepts them.
The pay-per-article model described and tested in PEAK at the University of Michigan is the one we consider most suitable for the distribution of scientific articles (MacKie-Mason, Bonn, et al. 1999). Under this model every user paid the same fee per article in order to read it. The current Apple service iTunes sells information goods (songs) using the same principle (Taylor 2003). Since all publications will be available in an electronic format over the Internet, it will be easy to introduce a system in which consumers are charged by the article. We propose a few operational modifications that will enhance the system’s performance: The user will be charged a fee only once to read or download the full article. All articles will be available at the same fee. A download is recorded the first time an individual downloads an article. Re-downloading has no effect on volume calculations and is not a monetary transaction.
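The charging rules above (a uniform fee, charged once per reader and article, with repeat downloads ignored in volume counts) can be sketched as a small ledger. The class and attribute names are hypothetical, and the in-memory set stands in for whatever persistent store a real system would use.

```python
class DownloadLedger:
    """Records first-time downloads and charges a uniform fee once per user and article."""

    def __init__(self, fee):
        self.fee = fee        # the same fee applies to every article
        self._seen = set()    # (user_id, article_id) pairs already charged
        self.volume = {}      # article_id -> number of distinct downloaders
        self.revenue = 0.0

    def download(self, user_id, article_id):
        """Return the amount charged for this download (0.0 for a repeat)."""
        key = (user_id, article_id)
        if key in self._seen:
            return 0.0  # re-download: no charge and no effect on volume
        self._seen.add(key)
        self.volume[article_id] = self.volume.get(article_id, 0) + 1
        self.revenue += self.fee
        return self.fee
```

Because volume counts distinct user and article pairs, repeated downloads by the same reader cannot inflate the figure that forecasters are judged against.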
The articles will not be organized by journals (since journals need not exist), but only by subject areas. Articles can also be filtered by keyword, standard text searches, and other data. Additionally, users will have the capability to search for or create custom bundles of articles by subject area, by predicted or realized volume, by author, or by other criteria. These bundles can be e-mailed directly to the “subscribers” of the system, or can be made available on-line on a personalized Web page. For example, a subscriber can request to receive abstracts via e-mail when any article is published with a keyword of “B2B markets” that has a volume estimate greater than 500, or when any article is published in the information science field for which the author paid less than $5.
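The two subscriber alerts in the example above could be expressed as simple filters. This is a sketch under assumed names: the `Article` fields, the function names, and the sample articles in the usage note are hypothetical, while the thresholds (500 downloads, $5) are the ones given in the text.

```python
from dataclasses import dataclass


@dataclass
class Article:
    title: str
    subject: str
    keywords: list
    volume_estimate: int   # forecast download volume
    author_payment: float  # amount the author paid to publish, in dollars


def matches_keyword_alert(article, keyword, min_volume):
    """True when the article carries the keyword and its forecast exceeds min_volume."""
    return keyword in article.keywords and article.volume_estimate > min_volume


def matches_subject_alert(article, subject, max_author_payment):
    """True when the article is in the subject and the author paid less than the cap."""
    return article.subject == subject and article.author_payment < max_author_payment


def build_bundle(articles, predicate):
    """Collect the titles of all articles that satisfy a subscriber's criterion."""
    return [a.title for a in articles if predicate(a)]
```

A subscriber interested in the first alert would call `build_bundle(articles, lambda a: matches_keyword_alert(a, "B2B markets", 500))`; the resulting list is what would be e-mailed or shown on a personalized page.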
Clearly the system cannot be introduced in practice without some guidance to its users. There should be an introductory period during which reviewers will not be held liable for their errors and bids will be open and adjustable. The system can also be operated in parallel with current practices. After the system is completely in place, “traditional” journals can cease to exist in their current form. Bundles will be created in a decentralized way by the users of the system, and not by editorial boards.
The proposed system will also provide some additional advantages to the scientific community. For example, article research quality can be compared easily across disciplines. An article on experimental economics could be compared to an article on experimental physics by considering their relative volume rank in the given subject. As mentioned earlier, universities use various methods to measure a professor’s research capabilities. The relative volume download could provide a way to compare quality across disciplines. The system supports different measurements of quality. All the data will be available for various decision-making groups to choose between or weigh as their policies dictate.
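The article does not define "relative volume rank" precisely. One plausible reading, sketched below, is the fraction of articles in the same subject that a given article out-downloads; the function name and the sample figures in the usage note are assumptions for illustration.

```python
def relative_rank(downloads, subject_downloads):
    """Fraction of articles in the subject with strictly fewer downloads.

    downloads: download count of the article being ranked.
    subject_downloads: download counts of all articles in the same subject.
    """
    if not subject_downloads:
        return 0.0
    below = sum(1 for d in subject_downloads if d < downloads)
    return below / len(subject_downloads)
```

Under this definition an experimental economics article with 80 downloads in a subject whose articles have 10, 40, 80, and 120 downloads ranks at 0.5, while an experimental physics article with 900 downloads among 500, 900, 2000, and 4000 ranks at 0.25. The economics article ranks higher within its field even though its raw download count is far lower.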
Another potential benefit of our system is having one central location for storing, quickly accessing, and assessing the quality of articles, creating the potential for a network effect. With the entire scientific community meeting online, leaving feedback, and discussing ideas, the quality and quantity of articles should increase.
When discussing similar online systems it is important to see if there are strong incentives to violate copyright law. Our system would, like a journal, own the copyright when an article is accepted for publication. If the reader’s download fee per article is too high, users of the system might have an incentive to make copies of the article and distribute them illegally. One way to solve the problem is for the system to check if the author of a subsequent article has lawfully obtained all the works cited in it. This can be done automatically before the article is given to the forecasters (assuming that articles are available only in this system).
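The automatic check described above could be as simple as comparing an article's citations against the submitting author's download history. The sketch below assumes, as the text does, that all cited works are available only through the system; the function name and the representation of the history as a set of (user, article) pairs are hypothetical.

```python
def unlicensed_citations(author_id, cited_ids, downloads):
    """Return the cited articles that the author has no recorded download for.

    downloads: hypothetical set of (user_id, article_id) pairs recorded by the system.
    An empty result means every cited work was obtained through the system.
    """
    return [cited for cited in cited_ids if (author_id, cited) not in downloads]
```

A submission would be passed on to the forecasters only when this list is empty. How to treat co-authors, the author's own earlier articles, or works read through a colleague's copy is left open here, as it is in the text.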
The system we propose is not ideal. Streamlining the process of scientific publications will be controversial because the interests of many different parties are at stake. Following is a discussion of some of the disadvantages of the new system. They are also shown in Table 1.
Since there is much uncertainty surrounding article quality and it usually takes the author a long time to find an appropriate venue for the article to be published, some authors decide to “target” certain journals in advance to save time. Targeting journals will not be necessary under the new system. As long as the article subject headings are stated correctly, the system will be able to find the best matching reviewers for a submitted article. However, scientists who find the strategy of targeting journals valuable might be negatively affected.
Journals will not be able to serve as brand names. However, we think that it is more important to search directly for an article than to rely on the journal brand to provide the articles needed. If readers are allowed to customize the kinds of articles they receive, then every reader will be able to identify his or her own brand. Where society journals reflect the thinking of members as a natural community of scholars, the proposed system will support the formation of natural communities of scholars by encouraging post-publication comments from readers.
Table 1. Advantages and disadvantages of the proposed system
|Advantages|Disadvantages|
|---|---|
|Availability of various statistics and rankings that can be used as proxies for quality|No traditional journals, so authors cannot "target" journals|
|Easy on-line searches by subject, keyword, past download volume, expected download volume, forecaster name, forecaster ranking, number and name of reviewers, etc.|All published articles are in electronic format only|
|Financial incentives for reviewers to provide constructive feedback|May be expensive to publish articles of interest to a very small number of readers|
|Financial incentives for forecasters to provide accurate forecasts|Institutional subscriptions and special issues are not directly possible|
|Uniform submission procedures|Established scientific publishers, editors and reviewers might not want to release control of the publication process|
|Article quality comparison across different subject areas|Authors might not be willing to pay for publication|
|Immediate availability of accepted articles||
|Easy citation and usage tracking||
|Search and match incentive for providing a correct personal opinion of article quality||
A major current concern is that the tenure track committees at many universities do not consider publications in electronic journals comparable in quality to publications in traditional paper-based journals (Snyder 2001). We hope that attitudes have changed during the past four years; otherwise this could be a drawback.
Doing away with the traditional editor role can also be considered a disadvantage of the system because the change is expected to generate much opposition. Many members of the scientific community still think that the editor can perform well the hard tasks of selecting articles for publication and matching articles with appropriate reviewers. We do need some research in this area to estimate the trade-offs involved.
It might be the case that the proposed system undermines niche research areas by making it expensive to publish articles that are of interest to a small number of researchers. This problem could be addressed by scientific institutions that provide research grants. It is now a common practice to include a budget line within a grant that is used to cover expenses related to publication. As costs related to electronic publishing decrease, we expect this problem to disappear.
There are many issues related to the new system’s implementation that need to be considered and should be mentioned as limitations. It might be hard during the new system’s introductory period for forecasters to find a reliable way to estimate an article’s download volume. The proposed system could be introduced in such a way as to provide guidelines to forecasters when they are trying to estimate the download volume of an article. Some forecasters will excel in that aspect and others will not. Over time forecasters will gain experience to rely on when providing an estimate.
We have not discussed issues of security and privacy that are important to any on-line system containing personal information, such as unique login names and passwords. We leave this to future research efforts. We envision the system being run by a non-profit foundation like the federal National Science Foundation in the United States. The system is designed to cover its expenses, so potentially a search engine devoted to searching scholarly articles (like Google Scholar) might be a good private alternative.
The new digital scientific library will undoubtedly require some fine-tuning. For example, we need to determine how much time the volume forecast should reasonably cover and how many forecasters should be used to estimate the expected profit. Longer time spans and more forecasters will guarantee higher accuracy, but will also necessitate a longer waiting time before publication and, most likely, a higher charge to the author on average. The options are virtually limitless, but a careful exploration of the possibilities that technology offers can show us a better way to gather, store, evaluate, spread, and produce scientific knowledge.
This paper describes the most important features of an electronic system for review and distribution of scientific articles based on simple market principles. The system ensures that the electronic publication process is financially sustainable and provides financial incentives to forecasters to provide correct estimates of an article’s download volume. In addition the system provides a forum where authors can solicit constructive reviews and experienced reviewers can offer their services. The system should dramatically decrease the time between submission and publication because it also provides monetary incentives to forecasters and reviewers to be expedient. In addition, the proposed system delivers many statistics—including total download volume, download volume by scientific users, relative download volume, citation counts, and user ratings—that could be used by universities and other agencies as proxies for article quality. The system can also provide article, author, reviewer, and forecaster rankings across disciplines, a simple procedure to provide an outlet for new areas of research, and a mechanism to prevent copyright violations. Future research can focus on the procedures that will lead to a smooth transition from the current system to the new one. Although the financial incentives under the new system are aligned well with its goals, they might still not be enough to completely eliminate all potential conflicts of interest. In any case, however, improvement over the current process should be quite apparent.
The authors would like to thank the editor and the reviewers for their valuable comments and suggestions.
For more information on the ISI impact factor, see http://scientific.thomson.com/free/essays/journalcitationreports/impactfactor/.
Anonymous. “Reengineering Peer Review.” The Economist, June 22, 1996, 78.
Back, L. “The Devil You Know: Academic Reviewers Should Not Be Anonymous.” The Guardian, July 18, 2000.
Brown, L. “Ranking Journals Using Social Science Research Network Downloads.” Review of Quantitative Finance and Accounting 20 (2003): 291-307.
Chang, J., and L. Ching-Chang. “Is it Worthwhile to Pay Referees?” Southern Economic Journal 68, no. 2 (Oct 2001): 457-64. [doi: 10.2307/1061605]
Chua, C., L. Cao, K. Cousins, and D. Straub. “Measuring Researcher-Production in Information Systems.” Journal of the Association for Information Systems 3 (2002): 145-215.
Harnad, S. “Creative Disagreement.” The Sciences 19 (1979): 18-20.
Hitchcock, S., D. Bergmark, T. Brody, C. Gutteridge, L. Carr, W. Hall, C. Lagoze, and S. Harnad. “Open Citation Linking: The Way Forward.” D-Lib Magazine 8, no. 10 (October 2002). [doi: 10.1045/october2002-hitchcock]
Katerattanakul, P., B. Han, and S. Hong. “Objective Quality Rankings of Computing Journals.” Communications of the ACM 46, no. 10 (2003): 111-114. [doi: 10.1145/944217.944221]
Ketcham, J., V. Smith, and A. Williams. “A Comparison of Posted Offer and Double-Auction Pricing Institutions.” Review of Economic Studies 51, no. 4 (1984): 595-614. [doi: 10.2307/2297781]
Laband, D. “Is there Value-Added from the Review Process in Economics?: Preliminary Evidence from Authors.” The Quarterly Journal of Economics 105, no. 2 (May 1990): 341-353. [doi: 10.2307/2937790]
MacKie-Mason, J. K., M. S. Bonn, J. F. Riveros, and W. P. Lougee. “A Report on the PEAK Experiment: Usage and Economic Behavior.” D-Lib Magazine 5, no. 7/8 (1999).
Mizzaro, S. “Quality Control in Scholarly Publishing: A New Proposal.” Journal of the American Society for Information Science and Technology 54, no. 11 (September 2003): 989. [doi: 10.1002/asi.10296]
Nadasdy, Z. “A Truly All-Electronic Journal: Let Democracy Replace Peer Review.” Journal of Electronic Publishing 3, no. 1 (Sept. 1997).
Perneger, T. V. “Relation between Online ‘Hit Counts’ and Subsequent Citations: Prospective Study of Research Papers in the BMJ.” British Medical Journal 329 (September 2004): 546-547. [doi: 10.1136/bmj.329.7465.546]
Snyder, K. J. “Electronic Journals and the Future of Scholarly Communication.” Notes 58, no. 1 (2001): 34-39. [doi: 10.1353/not.2001.0170]
Starbuck, W. “Why I Stopped Trying to Understand the Real World.” Organization Studies 25, no. 7 (2004): 1233-1254. [doi: 10.1177/0170840604046361]
Stodolsky, D. “Consensus Journals: Invitational Journals Based on Peer Consensus.” Psycoloquy 1, no. 15 (1990).
Taylor, C. “The 99¢ Solution.” Time. http://www.time.com/time/2003/inventions/invmusic.html. Accessed on 11/24/2003.
Varian, H. “Pricing Electronic Journals.” D-Lib Magazine, June 1996.
Varian, H. “The Future of Electronic Journals.” Journal of Electronic Publishing 4, no. 1 (1998).
Weller, A. “Editorial Peer Review for Electronic Journals: Current Issues and Emerging Models.” Journal of the American Society for Information Science 51, no. 14 (2000): 1328. [doi: 10.1002/1097-4571(2000)9999:9999<::AID-ASI1049>3.0.CO;2-N]