
Peer Review: Reform and Renewal in Scientific Publishing
Challenges Facing Peer Review
Inevitably, a system that has evolved to serve the world's most inquisitive, informed, and opinionated minds will attract its fair share of scrutiny and criticism. What is so utterly surprising, however, is how little of that commentary is actually grounded in methodologically rigorous study. Thankfully, research into all aspects of peer review is gradually increasing, though researchers still probably do much of it in their spare time rather than as the primary focus of their careers. There is now a successful quadrennial International Peer Review Congress and a journal (Research Integrity and Peer Review) with a specific interest in disseminating research into peer review. The breadth of research output is also expanding beyond the first works produced, which focused primarily on the mechanism and management of peer review. As Rennie and Flanagin noted, the most compelling issues for study (which also reflect some of the most vocal of the ongoing debates surrounding peer review) include the quality of reporting in peer review and publication, transparency and openness, and the need for tools to distinguish what is real from what is fake (fake research and results, fake peer reviewers, fake journals). To close out this book, it is perhaps worth dwelling on some of the current debates surrounding peer review. Though there is a modernity to the discussion, a result of the disruptive use of technology in peer review, the underlying current to all this chatter remains the same: Should researchers self-regulate, can they do it effectively, and what are the consequences when peer review fails?
Making Peer Review More Scientific
Peer review has been accused of being secretive, open to bias and personal manipulation, and, ultimately, an amateurish endeavor (Rennie). The consequences of this are clear (incorrect or unvalidated research being published; misdirection of future research; potentially negative consequences and harms for patients) and have been discussed elsewhere in this book.
With little evidence to work from, editors—who often ascend to their positions based on their prominence in a given field rather than on the accumulation of years of experience working on journal editorial boards or following specific training—often operate their peer review processes based on anecdote, casual observation, and expediency. As a consequence, what often suffers is the quality of the research published. No wonder John Ioannidis (2005) declared, “Most published research findings are false.” This is not to ascribe blame solely to editors. Few are dedicated professionals. Most undertake their work part time. The overwhelming majority operate isolated from outcomes reported in high-level studies into peer review itself and, as a consequence, remain blissfully unaware of data that increasingly show where problems and deficiencies in the peer-review process lie.
In biomedicine at least, perhaps the single most significant debate now concerns poorly reported research, which ultimately frustrates efforts to fully understand whether the results presented are real, correct, and free of bias (Moher). Peer review, quite frankly, is failing to detect such problems with consistency (and, in the case of a large swathe of titles, failing to make any effort to improve reporting standards at all). And here lies possibly one of the biggest failures of peer review: a disappointingly high number of reviewers and journals are not validating results because they simply are not asking authors to reveal exactly what they did and to explain what information or data they included in, or excluded from, the study write-up that constitutes their journal article. In short, if we can't tell exactly what the researchers did, how can we tell whether their results are accurate and meaningful?
Ideally, peer review would be better equipped to watch for these issues. That is not to say the tools to better challenge prospective content submitted to journals do not exist. Nearly 20 years after the publication of the CONSORT Statement (Consolidated Standards of Reporting Trials; Begg et al.), the first and most important step toward improving the quality of reporting (in this case, for randomized controlled trials), it is still shocking to see that only approximately 600 of potentially thousands of biomedical journals have bothered to endorse CONSORT, the best-known reporting guideline produced to date, and enforce its requirement that authors better describe their methods. Indeed, not only is there an abject failure on the part of thousands of journals to impose any sort of minimum standards; there also seems to be a widespread lack of understanding of why journals need to be concerned in the first place.
Journals that do utilize reporting guidelines like CONSORT typically demand that authors improve their reporting either ahead of submission or during the manuscript revision phase. As evidence of the presence of essential reporting elements, many journals ask that authors include a completed reporting guideline checklist with their submissions. CONSORT, for example, uses a 27-point checklist (see http://www.consort-statement.org).
So with a tool such as CONSORT available—one of many reporting guidelines, each designed for a different study type and curated by the Enhancing the Quality and Transparency of Health Research (EQUATOR) Network—and despite repeated examples of poor research slipping past the quality barrier that peer review is supposed to be, why are journals not doing more to preserve the integrity of the published literature and promote better standards?
For a start, editors and editorial boards frequently focus on the perceived administrative burden that CONSORT and other reporting guidelines place on authors. They perceive that taking the time to go back and include important methodological details, and then accounting for them in a summary checklist to aid reviewers, is too burdensome; a genuine fear could exist that overworked authors might decide that submitting to journals that impose such standards is simply not worth the effort. In other words, the act of compelling researchers to actually provide the information that supports what should have been drilled into them at the start of their careers—that all research rests on validation and replication—is seen as imposing too much on them. Apparently immune to ongoing research into the failures of reporting, let alone its attendant implications, journals that maintain a do-nothing position effectively give tacit approval to an approach to peer review that is little more than a cursory check. Under such circumstances, it seems such journals are content simply to determine that authors are not saying anything too outrageous and to ensure that there are as few impediments as possible to letting authors get published.
At the root of this situation, whereby a sizeable chunk of the published literature cannot be validated or its methods repeated (and is, therefore, arguably useless), is an overwhelming lack of comprehension of the issue of poor reporting and even less of an understanding of why detecting reporting problems is not just a constituent part of peer review but perhaps the single most important part of it. Sadly, it seems that journals and reviewers are drawn to results like moths to a flame while critically overlooking how those results were derived. Yet these are not esoteric or rarefied academic conversational points: most authors would likely attest to the frustration of having read a paper from which they could not glean enough information to go back to the lab and repeat the work. A Nature survey even suggested that replication was a major concern for readers (Nature, “Overview”). The situation as it currently stands, and as described previously, represents a collective and overwhelming failure of peer review—doubly so when evidence of the implications of poor reporting and the presence of tools to help correct the problem are well established and readily available.
The failure of peer review to detect reporting problems, and in turn to spot methodological flaws or the introduction of spin and bias, coupled with a general lack of interest or urgency in addressing the problem, is a very visible failure. Peer review, it could be contended, is failing to do the job it was set up to accomplish. Some of this failure is built on inertia—a somewhat self-congratulatory belief that, despite its flaws, peer review works. After all, as a process, it has supported the explosion of research that has ensured there is not a field of academic or scientific study that has not pushed ahead the frontiers of understanding in recent years. Such guilelessness in managing peer review is also derived from a possible collective “arrogance” that buys into and enforces the idea that subject expertise trumps methodological and statistical evidence. Admittedly, that statement immediately falls into the trap we have just condemned—namely, that critiques of peer review are often anecdotal or opinion based rather than evidence based. However, time and again, political or commercial interests seem to have been prioritized over what surely must have been concerns surrounding the validity of results, especially at journals where statisticians or experts in study design are retained. This raises the question: In the face of growing evidence that is available and often published in the most prominent journals, such as the British Medical Journal (BMJ) or the Journal of the American Medical Association (JAMA), why do editors insist on perpetuating the mistakes of their predecessors by not enacting more rigorous peer review?
So what would a more scientific approach to peer review look like? First, it would use the wealth of evidence and meta-analysis that shows patterns in author behavior and the writing up of research for publication. The most obvious, as discussed, is poor-quality reporting of methods and results. The problem is that most researchers simply are not versed in what constitutes good research practice. This complaint is not new. Doug Altman bluntly summed up the lamentable state of much published research in 1994: “What, then, should we think about researchers who use the wrong techniques (either wilfully or in ignorance), use the right techniques wrongly, misinterpret their results, report their results selectively, cite the literature selectively, and draw unjustified conclusions? We should be appalled.”
There are countless examples of what Altman rails against. A classic example is the failure to account for post hoc analysis. Researchers all too often do not seem to grasp that it is problematic if you (1) do not report your originally stated research question, (2) do not describe the outcomes of your investigation of that research question, and (3) decide halfway through your study to answer a new research question that looks more exciting. The flaw in that approach is that the study population was designed to answer the original research question, not the secondary question. Is the population sample now biased? Possibly. What even happened to the original research question? Was there a null or negative result? Why are so few negative or null results published? Certainly there seems to be an aversion to publishing such material (Franco, Malhotra, and Simonovits). In short, there needs to be a wholesale overhaul of the way research methodology is taught. All too frequently, it is left to (ill-equipped) journals to spot study design flaws. That is already far too late in the process, as the authors simply set off on the wrong path at the start of their study. Somehow peer review is expected to put authors back on the right path. That can be a Herculean task at the best of times and simply impossible on other occasions if the flaw is fatal. Nevertheless, it is remarkable how often such problematic work still eventually surfaces, and not only in obscure titles. In the meantime, and as some sort of palliative, journals can impose policies, protocols, and reporting guidelines and compel their authors to conform. In doing so, problems of bias, spin, unethical conduct, and weak study design can be revealed. After that, the journal can take the appropriate action: reject the paper or request changes to remedy the problem.
Another scientific approach to peer review would be the determination of a set of validated reviewer core competencies, followed by their promotion and accompanied by various training programs. Journals, institutions, or publishers could provide the training. All have a vested interest in doing so because such efforts should not only elevate the quality of peer review—or at least that is the obvious intention—but also lead those who are trained to better appreciate what is required of them when they in turn become authors. Presently, the Centre for Journalology, based in Ottawa, Canada, is among those leading the way in determining what sort of training the reviewers of the future need. Ultimately, the problem will be convincing the very people who need it most that they should undertake training. There is a concomitant movement to ensure reviewers are better recognized for the work of reviewing a paper, but until that movement succeeds, efforts to provide universal standards in training will be hampered by a lack of motivation for what is a volunteer task increasingly seen as a burden, rather than an honor, in the face of an inundation of research papers. Moher and Altman (2015) propose that we should not stop at training reviewers. Editors too should be provided with a similar set of core competencies, and potential authors should be trained early in their careers in the art of writing articles “fit for purpose” (Moher and Altman).
A third, more scientific approach to peer review is the better matching of papers to suitably qualified peer reviewers. Presently, the approach most journals take is that editors simply call on the people they know, have seen speak on a topic, or recall in conjunction with a previously published relevant paper. Beyond that, the online peer review management systems most journals now use might have ways of matching known reviewer areas of expertise with the subject matter of a paper. Such an approach is highly dependent on both the accuracy of the author's description of his or her paper and the accuracy of a potential reviewer's self-assessment of his or her true area of expertise. One such system (ScholarOne) has now started to suggest reviewers based on previously published papers, presumably derived from an algorithm that has not yet been subjected to testing or scientific validation. Consequently, it is entirely possible that large pools of potential reviewers, especially in emerging markets, are not being tapped. The smarter selection of potential reviewers is not the only issue surrounding the matching of papers to people; evidence in the literature on patterns of acceptance of invitations to review is pretty much nonexistent. At issue is determining whether there is a threshold for reviewer burnout. Most journals are now swamped with submissions, and their reviewer pools have not always expanded in a similar fashion. If that is the case, journals are calling on reviewers with greater frequency. Does there come a point when the best-qualified reviewers no longer have the bandwidth to review and thus turn down invitations with increasing frequency? Then what? Journals start to scramble for less obvious picks and might even resort to calling on people they are unfamiliar with or have not properly vetted, relying on anyone who accepts the invitation to review. The obvious concern is then that the reviewer is not sufficiently qualified. Another manifestation of burnout is not so much a growing disinclination to review as the provision of rushed or superficial reviews.
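To make the general idea concrete, the sketch below shows one naive way a matching step could work: scoring each candidate reviewer by the overlap between keywords in a submission's abstract and keywords in that reviewer's published titles. This is a minimal illustration only; the reviewer names, publication titles, and scoring rule are invented for this example and do not describe ScholarOne's or any other system's actual algorithm.

```python
# Minimal, hypothetical sketch of keyword-based reviewer matching.
# All reviewer names and publication titles below are invented.
import re
from collections import Counter

STOPWORDS = {"the", "of", "and", "a", "in", "to", "for", "on", "with", "is", "are"}

def keywords(text):
    """Tokenize text into lowercase words, dropping short words and stopwords."""
    words = re.findall(r"[a-z]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS and len(w) > 3)

def match_score(abstract, reviewer_publications):
    """Score a reviewer by keyword overlap between the submitted abstract
    and the titles of the reviewer's own published papers."""
    paper = keywords(abstract)
    profile = keywords(" ".join(reviewer_publications))
    shared = set(paper) & set(profile)
    # Weight shared terms by how often they appear in the submission.
    return sum(paper[w] for w in shared)

# Hypothetical reviewer pool.
reviewers = {
    "Reviewer A": ["Randomized controlled trials of statin therapy",
                   "Blinding and allocation concealment in cardiology trials"],
    "Reviewer B": ["Qualitative interviews on nursing workloads",
                   "Thematic analysis of patient experience surveys"],
}

abstract = ("A randomized controlled trial of statin therapy with "
            "concealed allocation and blinded outcome assessment")

ranked = sorted(reviewers,
                key=lambda r: match_score(abstract, reviewers[r]),
                reverse=True)
print(ranked)  # ['Reviewer A', 'Reviewer B'] for this trial-focused abstract
```

Production systems presumably weigh far richer signals (citation networks, co-authorship screening for conflicts, a reviewer's history of declining invitations), but even this toy version makes plain why the quality of the match depends entirely on how accurately both the paper and the reviewer's expertise are described in the first place.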
A fourth, more scientific approach to peer review is the application of technology. In an ideal world, every journal would be able to call on a cadre of methodological or statistical editors. Realistically, that is unlikely to happen, but in the meantime, solutions are emerging to better detect flaws in methodological reporting, which in turn can be a signpost to the bigger issue of poorly conducted studies. One such attempt is a program called StatReviewer (http://www.statreviewer.com), which at the time of this writing is being piloted at several hundred journals. StatReviewer parses the text of a submission and highlights areas where it detects gaps in reporting. Human intervention is then still required, but the hope is that the software can at least highlight issues both more quickly and with more consistent accuracy than human reviewers, be they trained in methods or simply subject experts. Perhaps we are looking at the early stages of what could be the next great evolution in the assessment of research, particularly with regard to its publication. While there are many inherent weaknesses in humans performing peer review, the addition of training and the development of new machine-based tools might very well raise the bar of peer review.
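StatReviewer's internal workings are not public, but the general shape of such a tool can be sketched with simple pattern matching against a handful of reporting items. The checklist items and text patterns below are a hypothetical, heavily reduced illustration loosely inspired by CONSORT-style items; they are not StatReviewer's rules or the actual CONSORT checklist.

```python
# Hypothetical sketch of an automated reporting-completeness check.
# Items and regular expressions are illustrative only.
import re

REPORTING_CHECKS = {
    "sample size justification": r"sample size|power calculation",
    "randomization method":      r"randomi[sz]ed\s+(by|using|with)|random number|block randomi",
    "blinding":                  r"\bblind(ed|ing)\b|\bmask(ed|ing)\b",
    "participant flow":          r"lost to follow[- ]up|withdrew|excluded after randomi",
    "trial registration":        r"clinicaltrials\.gov|trial registration|isrctn",
}

def check_reporting(manuscript_text):
    """Return the checklist items for which no matching text was found."""
    text = manuscript_text.lower()
    return [item for item, pattern in REPORTING_CHECKS.items()
            if not re.search(pattern, text)]

methods_section = """
Participants were randomized using computer-generated block randomization.
Outcome assessors were blinded to treatment allocation.
"""

missing = check_reporting(methods_section)
print("Possible reporting gaps for a human to verify:", missing)
# ['sample size justification', 'participant flow', 'trial registration']
```

Even rules this crude illustrate the division of labor described above: software can flag likely gaps quickly and consistently, while a human editor or reviewer judges whether each flagged gap is real.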
Who Is Doing Peer Review?
And so we come to the end of this book by pausing to think about who is doing peer review. As pointed out right from the start, peer review is the manifestation of a community policing itself. But as this book also suggests, the people performing the role are not always qualified, in most cases have never received any training, might not be consistent, might be prone to bias, and are not versed in matters of research methodology, publication ethics, and other issues related to poor author practices. The process of selecting these reviewers is far from transparent and, as journals become increasingly desperate, somewhat scattershot and performed with hope rather than design and intent.
Naturally, in a world where there are more journals (of highly variable quality), more papers, and a lack of a minimum set of standards for peer review, there are many potential gaps that suspect research and dishonest authors can slip through. Consequently, we now live in an age when authors create fake reviewer profiles or spoof actual reviewers to provide fake reviews of their own papers, knowing that many journals are desperate (and careless) enough to simply invite author-suggested reviewers with absolutely no verification procedures. Several large publishers have made more than one purge of papers that were subject to this type of unethical behavior.
In short, it is our opinion that if peer review is to thrive in the face of the rising challenges of a deluge of submissions, potential reviewer fatigue, and the lack of a set of minimum standards, we need to see more of the following:
- A serious, research community–wide recognition that there are significant gaps not only in the peer review process but also in the training of researchers regarding both study methodology and writing up research for publication. Institutions ultimately have to take responsibility for this. However, it seems more likely that success will come from elsewhere. Journals, publishers, and scientific and learned societies have a role to play here by holding potential authors to higher standards. Funding agencies also could do more by insisting on standards and withholding future funding if their researchers fail to perform. All these stakeholders can combine forces to help facilitate the development of a universal set of core competencies and a platform from which to teach them. Until there is recognition of a problem and the widespread adoption of a commitment to offer solutions to help all stakeholders raise standards, the same problems will remain. So in discussing who is performing peer review, we could in the future state with authority that it is people who are properly skilled to perform the task.
- Greater recognition of the invaluable contribution peer reviewers make and of the fact that, without the effort they expend, the entire process of scientific and academic publication would either collapse or become a free-for-all with no satisfactory way of validating any research. This does not mean reviewers would receive financial compensation for their time. Despite the vast wealth of many publishers, the economics of journal publishing are such that the overwhelming majority of titles would not be able to fund paid peer review. Institutions, however, could shift the paradigm by including peer-review work within any assessments for tenure and promotion. Journals too could do more to recognize the work of their volunteer reviewers. Many do nothing. Some present annual awards to their “top” reviewers, and others simply print a list of everyone who performed a review in the previous 12 months. Instead, journals could innovate and invest a little extra effort in order to recognize and reward hardworking reviewers. This could include providing a profile of the reviewer in the journal, a fee waiver for an open-access publication, an expedited publication schedule for any work they submit as an author, or free or discounted publication services. So in discussing who is performing peer review, we could one day reply that it is people motivated by the recognition given to those who generously volunteer their time and expertise to assess a paper and further the research in their fields.
- Be it publishers, societies, or institutions, there needs to be greater deployment of experts with training in methods, statistics, or manuscript preparation. Presently, only the most well-resourced journals can offer consistent statistical support. Many authors, unfortunately, are not at institutions that provide their staff with the necessary support, so it might have to fall to others to offer help. The same issue applies to the legion of authors who work in private practice or in nonacademic settings. Conversely, as underresourced as so many institutions are, some are attempting a more enlightened approach. Cobey et al. (2016) recently highlighted one such effort involving the provision of a publications office: an institution provides a trained individual to all departments to offer guidance on writing up research and navigating the publication and peer-review process. So there needs to be greater recognition of the role of, and the need for, reviewers with specialist knowledge.
- All stakeholders need to consider the role of technology to either support or replace a human where feasible. Peer review can be cumbersome and lack consistency. A paper might receive a rough review simply because it was assigned to a particular editor who in turn picked tough reviewers. Equally, assigned to a different editor and reviewers, a paper might glide effortlessly through peer review. Peer review should never be, but far too often is, a lottery. Papers must be judged on merit, free from bias and any deficiencies of the reviewers assigned to assess publication worthiness. Perhaps in the future, we can talk about peer reviewers being supported by validating tools that aid their detection of potential flaws.
A Final Thought
Nothing within this book will prove revelatory to those who study peer review. Arguably, those engaged in the process for many years, whether as a prolific author/reviewer or as an editor, will equally recognize much of the debate we have recounted. However, such individuals represent the proverbial tip of the iceberg. Overwhelmingly, the players in this game are amateurs, yet the outcomes from their research can represent the highest of stakes—life changing, world changing, history making. Even the myriad small-scale papers that make only the tiniest of incremental changes contribute to the corpus of literature that represents the sum total of understanding of any given field and thus have some value. The reality is that there are thousands of journals, hundreds of thousands of research papers, millions of researchers, and billions of research dollars involved in the process. The smartest individuals on the planet are the stakeholders in peer review and publication. The problems are recognized. The solutions are emerging, if not already available. What is now needed is recognition that the process is flawed but eminently fixable. There just needs to be the will to do more than passively engage in a process that, in its natural resting state, might, on a good day, do the job but is otherwise lacking. But the system of peer review is not lacking in potential. The corrective measures are attainable, and once they are taken, one can argue that peer review will be more robust than ever. Whether journals will remain the medium for delivering the latest research in the future remains to be seen. Peer review, however, will still be around. Hopefully it will look a little different, and maybe a little healthier, than it does now.