Peer Review: Reform and Renewal in Scientific Publishing

Adam Etkin; Thomas Gaston; Jason Roberts

doi:10.3998/mpub.9944026

Published by: United States of America: ATG LLC (Media), 2017.

Permissions: This work is licensed under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License. Please contact [email protected] to use this work in a way not covered by the license.

For more information, read Michigan Publishing's access and usage policy.

History of Peer Review

Early Peer Review

To understand how we have arrived at a self-regulating system that simultaneously validates results, theories, and opinions; offers suggestions for improving what is published; and ultimately determines what gets published within a given field’s definitive body of literature, we need to first consider the history of peer review. It is a history that is both old (in its inception) and, perhaps surprisingly, somewhat modern in its execution, with many journals and periodicals formally instituting a structure of review that calls on acknowledged experts in the field to assess a manuscript submission only in the last five decades.

In 1662, the Royal Society of London was founded by a cadre of curious men who were dedicated both to science and to the acquaintance of like-minded scientific thinkers. A contemporary planning document from 1660 had called for the formation of a “‘College for the Promoting of Physico-Mathematical Experimental Learning,’ which would meet weekly to discuss science and run experiments” (“Prince of Wales”), and this gives us a solid sense of the intellectual concerns of those founding members. The men of the Royal Society were dedicated not only to meeting in person but also to the communication of scientific knowledge more broadly, and in 1665, they founded the journal Philosophical Transactions, usually regarded as the earliest academic journal, with this aim in mind. This earliest society journal’s full title was Philosophical Transactions, Giving some Account of the present Undertakings, Studies, and Labours of the Ingenious in many considerable parts of the World, which emphasizes that its mission was not limited to the members residing in London. Gradually, similar-minded publications began to emerge. In 1699, for example, the Académie Royale des Sciences of Paris was founded with similar aims, and this society also created its own publication, the Journal des Sçavans.

Today we might ask whether Philosophical Transactions was peer-reviewed on the assumption that peer review is the process that makes a journal scientific or results and theories valid rather than a matter of opinion or conjecture. And indeed, peer review, or the collegial process whereby scholars evaluate research papers independently before publication is granted in a particular journal, is commonly assumed to have originated with the emergence of these academic societies and the scientific journals they founded (Hames; for more on the history of peer review, see Fyfe; Kronick; Fitzpatrick). Allowing its broadest definition, “peer review can be said to have existed ever since people began to identify and communicate what they thought was new knowledge” (Kronick 1321). Even prior to the formation of the learned societies, early scientists corresponded with their peers with the intention of soliciting comments and critique on their work. Philosophical Transactions and its ilk sought to improve and, to a certain degree, replace this previous practice of reporting on scientific observations and discoveries by personal correspondence. Yet while an element of selection (in the form of acceptance or rejection) was implicit in the operation of these early scientific journals, it would be far from accurate to equate that selection process with peer review as we know it today.

The founding editor of Philosophical Transactions, Henry Oldenburg, actively gathered content for the journal from contemporaneously published pamphlets, from meetings of the Royal Society, and from his own correspondence. Unlike modern editors, he was not deluged by unsolicited submissions and his role did not require him to prioritize articles for impact or originality. There was a certain level of scrutiny of some of the scientific reports, inasmuch as research findings were presented and discussed at meetings of the Royal Society. But it is somewhat inaccurate to describe material published in Philosophical Transactions as peer-reviewed in the modern sense of the term.

Later journals would come to adopt broader peer review practices, such as the creation of an editorial team at the Journal des Sçavans (as opposed to a solitary editor making decisions, and more akin to the currently understood notion of an editorial board). This step, instituted after 1701, was rendered necessary by the breadth of fields covered in that publication. Later still, the Royal Society of Edinburgh began to solicit reviews from knowledgeable members for contributions to Medical Essays and Observations (1731). The preface to that journal stated, “Memoirs sent by correspondence are distributed according to the subject matter to those members who are most versed in these matters. The report of their identity is not known to the author.” Not only is this an early description of a peer review process; it represents one of the first declarations of what we would call a “blinded peer review” approach, whereby the reviewer’s identity would remain obscured to the author, presumably on the grounds of ensuring a fair and unbiased peer review.

In 1752, the Royal Society of London appointed the Committee of Papers to evaluate potential contributions. Similarly, the Philosophical Magazine, founded in 1798, originally had a single editor but by the 1850s was run by a five-man editorial team to ensure comprehensive coverage of various disciplines. However, none of these arrangements constituted what we understand as “peer review” today—that is, systematic procedures for independent review by experts in the field. The Royal Society’s Committee of Papers sought to ensure some level of control by the society regarding what appeared in the pages of Philosophical Transactions. The committee format was intended to remove the onus of selection from a single individual and limit the danger of bias or prejudice.

Yet the Royal Society recognized that its committee could not guarantee the validity of everything Philosophical Transactions published, and the journal carried a statement in each issue to the effect that “the certainty of the facts, or propriety of the reasonings . . . must still rest on the credit or judgment of their respective authors.” It is interesting to note that the Académie Royale des Sciences in Paris did attempt to test the validity of research findings. The organization had a remit from the Crown to assess the merits of inventions and discoveries, and the committees appointed to investigate these sought to replicate and test research findings. This level of validation went far beyond what is expected (or possible) with modern peer review, and by the 1830s, this process had been abandoned as being too time-consuming (Fyfe). Interestingly, in clinical medical publishing at least, there is now a movement back toward reproducing results, principally through the provision and sharing of data and the drive to improve the reporting of methods. Nevertheless, this is still some way short of what was undertaken by the Académie Royale des Sciences.

The next stage in the evolution of peer review was the move from in-house editorial panels or committees to truly independent reports on submissions. This practice emerged among several learned societies in the early 19th century. In 1831, the Royal Society experimented with jointly authored reports—a precursor of modern ideas about collaborative peer review—but from 1832 onward, the Committee of Papers was soliciting independently written reports (Fyfe). This process quickly became part of the standard procedure for publication at learned societies. The collaborative aspect of peer review arose again when George Gabriel Stokes was secretary of the Royal Society (1854–85). Stokes gave considerable attention to mediation of the discourse between reviewers and authors to improve the text. Simultaneously, Ernest Hart, the editor of the British Medical Journal (BMJ; 1868–98), used a similar model of peer review and extolled its virtues, though complaining of the effort it required.

Yet despite these early examples of the independent review of papers, many journals in the 19th century ignored independent review altogether, and peer review as we understand it today was far from routine until as recently as the mid-20th century. Thomas Wakley, for example, who founded the Lancet in 1823, had little appreciation of review or incentive to use peer review, not least because he wrote much of the content himself (Burnham). Physical Review introduced a peer-review process in the early 1930s, but the process was mostly employed when the editor required a second opinion. Neither Science nor the Journal of the American Medical Association (JAMA) used outside reviewers until the 1940s (Fitzpatrick). The founding editor of Nature, Norman Lockyer, made most of the editorial decisions himself, only seeking additional opinions from his contacts when necessary. Nature, in fact, did not adopt a formal peer-review process until 1967.

As an illustration of how exceptional the use of external expert opinion was, Albert Einstein published more than 300 articles between 1901 and 1955, but it is likely that only one of those was ever subject to peer review (Kennefick). Einstein sent a letter to the editor of a journal taking umbrage at the fact that his work had been subjected to review by an unknown scholar, a process he seemed to regard as rather underhanded:

Dear Sir,

We (Mr. Rosen and I) had sent you our manuscript for publication and had not authorized you to show it to specialists before it is printed. I see no reason to address the—in any case erroneous—comments of your anonymous expert. On the basis of this incident I prefer to publish the paper elsewhere.

Respectfully,

P.S. Mr. Rosen, who has left for the Soviet Union, has authorized me to represent him in this matter.

Tellingly, however, Einstein went on to publish the same paper in another journal, incorporating revisions based on the comments by the original reviewer. Some feel that these revisions might have saved him from public embarrassment and that they show the value of peer-based critique, even if that usefulness was perhaps not appreciated.

It is significant to note that according to the Oxford English Dictionary, the term “peer review” was not used in print until 1967 (Fyfe). The frequent use of the term from the 1970s onward parallels the widespread use of peer review, both for nonsociety journals and for research grant applications. Since then, peer review has become standard practice across scholarly publishing and the “peer-reviewed” label has become a hallmark of genuine science and knowledge.

Why Did Peer Review Emerge?

It is quite evident that “peer review, the process by which material submitted for publication is critically assessed by external experts . . . was introduced into different journals at different times and in different ways, often dependent on the chief editor at the time” (Hames). There had been little incentive for early journal editors to adopt peer review because, as noted previously, they often gathered, and even wrote, much of the content themselves. There might also have been an element of editorial pride involved: editors were often loath to admit that they required the opinions of others to determine what to publish (Burnham).

The development of peer review has also mirrored the evolution of scholarly journals. Early journals were often the communication vehicles of particular learned societies, employed to disseminate news and information relevant to members of the organization. Members of societies often asserted their expectation that they would be published in their society’s journal. Frequently journals were seen more as media for mass communication than as arbiters of accuracy in science (Nielsen). Other journals were personal organs of their editors or were institutional proceedings to publish the research undertaken at a given research establishment. Editors often viewed their primary role as educators—as disseminators of information—rather than as guardians of the scientific literature. There was, therefore, less expectation for the independence of the journal’s content.

Indeed, not only were many early journals not peer-reviewed, but they made no claim to have either verified or authenticated the research they published. Denis de Sallo, the first editor of Journal des Sçavans, wrote in 1665, “We aim to report the ideas of others without guaranteeing them” (Rennie, “Editorial Peer Review” 2). The Royal Society of Edinburgh issued the following statement concerning its publications:

The sanction which the Society gives to the work now published under its auspices, extends only to the novelty, ingenuity or importance of the several memoirs which it contains.

Responsibility concerning the truth of facts, the soundness of reasoning, in the accuracy of calculations is wholly disclaimed: and must rest alone, on the knowledge, judgement, or ability of the authors who have respectfully furnished such communications.

Similarly, the Literary and Philosophical Society of Manchester, while not depending on a single editor to determine what was to be published, noted in 1785 that “a majority of votes, delivered by ballot, is not an infallible test in literary or philosophical productions” (Rennie, “Editorial Peer Review” 2). These examples illustrate that even where editors were employing committees, editorial panels, or external reviewers, it was primarily to assist with the selection of manuscripts, not to endorse what was being published.

So while early journal editors had little incentive to be selective, by the mid-20th century the flow of unsolicited articles had grown, and with it the need to filter submissions, because journals, particularly in the predigital age, faced cost constraints that limited how much material could be published. Quality thresholds emerged in response. Those journals seeking to set a high standard for quality research in their fields enforced low acceptance rates to maintain the perceived quality of their content. To achieve such an objective, journals quickly began to rely on reviewers to help identify the small proportion of submissions they preferred to publish (Hames; Burnham; Nielsen).

Beyond the shift toward journals offering validation through publication and some journals taking the lead on delineating and imposing quality thresholds, another factor to consider in the development of peer review is the fact that increased scientific specialization, and the associated increased depth of accumulated knowledge, has made it impossible for a single editor to master all areas of one field. Even with the contemporary expansion of specialized journal publishing, with the publication of journals dedicated to subdisciplines, it is still difficult for a journal editor to be suitably competent to review all submissions. Inevitably, editors have been compelled to reach out to those more qualified to assess the quality of what has been written.

We can also see that changes in technology have made systematic peer review feasible. Prior to the development of carbon paper in the 1890s, and later photocopiers in 1959, it was very labor intensive to produce multiple copies of manuscripts to send out for independent review. It is these changes in technology, culminating in the Internet and modern online submission and peer-review systems such as Editorial Manager, ScholarOne, and eJournalPress, that have made peer review by multiple experts both practical and accessible to journals of all sizes.

It is also clear that changes in the nature of research have required a shift in the way content is selected for publication. Medical publication, for example, has moved from disseminating case history to reporting research conducted by randomized trials (Rennie). As research methodologies and study designs have become ever more technical, the need for specialist assessment has increased.

Fyfe (2015) writes, “The various research teams looking into the history of peer review, including my own, do not yet know enough about why the post-war expansion of scientific research, on both sides of the Atlantic, led to the transformation of refereeing into ‘peer review,’ or why it then came to dominate the evaluation of scholarly research.” Rennie (1999) cites Robin Fox, who, recalling when Ian Munro took over as editor of the Lancet in 1976, wrote, “Doctors were becoming reluctant even to cast an eye on research papers that did not bear the ‘pass’ sticker of peer review.” This perception of peer review as the means to distinguish which published material is worthy of consideration dramatically strengthened the impetus for peer review. It became nearly impossible to be taken seriously as an academic publication without the hallmark of peer review, and thus operating peer review became a reputational and commercial necessity.

This impetus was also increased by the criteria used by institutions to assess applicants for academic positions and by funding bodies to award research grants. The motto “publish or perish” for postdocs seeking their first appointment refers only to publication in peer-reviewed journals. Similarly, the assessments by funding bodies, such as the Research Excellence Framework, currently only consider articles published in peer-reviewed journals. Such developments have reinforced the ubiquity of peer review in academic publishing but were perhaps secondary to the widespread adoption of peer review.

Interestingly, now that journals have come to be seen as arbiters of quality and validation through the imposition of peer review, the almost reflexive notion of journals providing a seal of approval has recently been questioned on two fronts. The first concerns examples of utterly inadequate, or perfunctory, peer review coming to light. The second concerns the rapid proliferation of opportunistic (and almost always exploitative) publications called “predatory journals,” titles that give the appearance of offering peer review but in fact do nothing of the sort. We will return to these issues later in discussing the current challenges confronting modern peer review, but they are illustrative of the fact that the relationship between journals and peer review continues to evolve.
