A JSTOR Labs Report

June 2017

Abstract

Scholarly books are increasingly available in digital form, but the online interfaces for using these books often allow only for the browsing of portable document format (PDF) files. JSTOR Labs, an experimental product development group within the not-for-profit digital library JSTOR, undertook an ideation and design process to develop new and different ways of showing scholarly books online, with the goal that this new viewing interface be relatively simple and inexpensive to implement for any scholarly book that is already available in PDF form. This paper documents that design process, including the recommendations of a working group of scholars, publishers, and librarians convened by JSTOR Labs and the Columbia University Libraries in October 2016. The prototype monograph viewer developed through this process—called “Topicgraph”—is described herein and is freely available online at https://labs.jstor.org/topicgraph.

Introduction

Scholarly books are increasingly being made available in digital form, joining the print-to-digital transition that scholarly journals began well over a decade ago. Ten years of innovation have produced tremendous benefits for authors and readers of journal literature, and some of this innovation certainly is applicable to the digital migration of monographs. But the long-form scholarly argument presents some very different challenges, and its online migration is still in many ways in its infancy. The platforms that make monographs available to users often offer little in the way of specialized functionality for the different ways that scholars and students use these books—uses that include both immersive reading of the entire long-form argument and goal-oriented “dives” into a book for a specific topic or to mine citations. JSTOR Labs, an experimental product development team at JSTOR (one of the scholarly content platforms that host digital monographs), undertook a user research and design process to better understand the wide variety of needs, behaviors, frustrations, and ambitions users bring to the task of reading scholarly books online and to explore possible new paths to unlocking the value of the long-form argument in a digital environment.

This article intends to do three things. First, it discusses the kinds of uses that readers have for scholarly books and the opportunities for improving the usefulness of books for those purposes in a digital environment. These uses and opportunities emerged from ethnographic research JSTOR Labs carried out with a variety of readers of digital monographs—faculty, graduate students, and others—and with a small working group of scholars, publishers, librarians, engineers, data scientists, and user experience designers that convened in partnership with the Columbia University Libraries in late 2016. Second, the article discusses the process that was used to explore the digital monograph landscape, how problems to solve were identified, and how one new digital monograph feature that JSTOR Labs could prototype was selected. Third, the article describes the process to develop that prototype and introduces the tool, Topicgraph, that was built. The JSTOR Labs team employs two related design methodologies—“design thinking” and “lean start-up”—that are popular among commercial technology companies and startups. We hope that a description of the product development process will be useful for librarians, publishers, and scholars who work on digital scholarly projects or who are simply interested in knowing more about this way of thinking and doing.

Monographs exist as part of an ecosystem of authors, publishers, libraries, researchers, and others. There have been a variety of efforts to explore new forms of the monograph from the perspective of the author.[1] Similarly, there have been many efforts to find new business models to support the entire ecosystem. For our effort, we decided to focus on the needs of the researcher as a consumer of scholarly books (rather than the needs of publishers or authors). Within this scope, we took a deliberately pragmatic approach to defining our subject. The scholars and students we observed over the course of the project used a variety of academic books for their research, including single-topic treatises, thematic collections of essays, and primary source documents. Similarly, many of the ideas defined by the working group at Columbia could be useful for various kinds of books used for scholarly research. With that note, our expectation is that many of the design ideas explored in this project would be most valuable when applied to scholarly works that explore, in depth, a single topic.

The Print-to-Digital Transition for Monographs

Over the past five years, there has been a tremendous increase in the availability of digital versions of academic monographs in the humanities and social sciences. University presses and academic publishers, seeing the excitement around trade e-books sold via Amazon and Apple, took steps to make more of their frontlists and backlists available digitally and to invest in the staff and production tools needed to distribute those digital titles effectively. Academic publishers and aggregators—including Cambridge University Press, EBSCO, JSTOR, Oxford University Press, Project MUSE, ProQuest, and others—launched or greatly expanded programs for licensing university press e-books to academic libraries.

This expansion in e-book programs started around the same time that academic libraries and university presses were sounding new concerns about the extent to which print monographs were being used. A 2010 study of print circulation statistics by collection development librarians at Cornell University found that 55 percent of the books in the university’s collections that were published after 1990 had not circulated by 2010 and that within the first 12 years after acquisition, the likelihood that a given volume would circulate for the first time drops precipitously.[2] Whether this is really a surprise, given the scope of Cornell’s acquisitions, is almost beside the point: in a difficult budget environment for universities, even those academic libraries with the most extensive collecting remits would be unlikely to continue acquiring humanities and social science books at their customary level if they cannot demonstrate usage and impact.

There was hope that the digitization of monographs would be the answer to these troubling indicators of low usage of print monographs and that the greater availability of digital versions would help increase the usage and impact of monographs in the same way that digitization efforts have arguably helped to revitalize the usage and citation impact of backlist journal articles.[3] Early indicators are beginning to validate that hypothesis. Based on the growing usage of e-books on the JSTOR platform and anecdotal evidence that librarians have shared about the usage of their e-book collections, we are cautiously optimistic about the possibility of a comparable renaissance in the use and impact of scholarly books, especially if we can overcome the pain points that readers typically encounter in their research process.

But even beyond the act of digitizing monographs and making them available in search results on scholarly platforms alongside the digitized journals that scholars and students are already accustomed to searching online as part of their research workflows, there are clearly other opportunities to grow monographs’ visibility and usefulness to readers. For example, it should be possible to find new and better ways to expose the impact of scholarly monographs and, for any given monograph, clarify its “location” within the scholarly record. Which works does the book in question cite, and which works, in turn, have cited that book? Efforts to map citation and impact chains have a relatively long and sometimes controversial history in scholarly journals, especially in the sciences; however, monographs, which until recently were not available in great numbers in machine-readable form, have not as often been included in impact and citation systems. Could the impact of a monograph be rendered visually in ways that readers can grasp intuitively and in ways that do a better job of demonstrating the importance of long-form arguments for stimulating important debates within a given area of study—as instantiated not merely in later books but also in the usage-and-impact-obsessed world of scholarly journals? And could scholarly books, by virtue of their length and depth of treatment of a topic, be represented visually online in ways that help readers to use them as portals of entry to that topic?

Another way to increase the visibility and usefulness of scholarly books is to present them online in ways that help readers take advantage of them for different modes of reading. A survey of scholars about their research practices conducted by our colleagues at Ithaka S+R in 2012 highlighted an interesting dichotomy in how scholars used books. The survey found that scholars tend to prefer e-books over print books for basic research tasks, such as exploring references or searching for specific topics, but when it comes to more immersive reading, they prefer print books. So, a scholar might use an e-book as a sort of quick finding aid before turning to a print copy of the same title to read and digest the argument.[4] (And this initial use of an e-book might very well take place on Google Books, rather than on a specialized scholarly book platform.) It is arguable, however, that this reading behavior is very poorly provided for by digital scholarly books. In many cases publishers of digital scholarly books and the platforms used for them display the books simply as a long PDF or EPUB file—often, it should be noted, with digital rights management (DRM) software attached that restricts uses of the book. Users are arguably locked into a linear, continuous reading experience, without means of easily flipping back and forth between chapters and the book’s index as is possible with a print volume.

These are just two broad concepts among many other possibilities for improving the usefulness of the monograph. Both of the concepts outlined here point to different modes of visualization, or user design, as ways to better demonstrate the impact of monographs and to help readers with different goals and different levels of sophistication with scholarly materials to navigate them efficiently. This project grew out of the question: What might one different visualization look like, and could we build it?[5]

Designing the New Monograph

JSTOR (http://www.jstor.org) is a not-for-profit digital library of scholarly journals, books, primary sources, and other content that is supported by colleges and universities, museums, archives, public libraries, secondary schools, and other institutions of research and learning around the world. In 2014 JSTOR launched a small product development team to investigate and prototype new and leading-edge tools for researchers, teachers, and students. The group, JSTOR Labs (http://labs.jstor.org), seeks to partner with publishers, libraries, and scholars on these development projects. The team that worked on the project described in this article was made up of a user experience researcher and front-end developer, a technical lead, a visual designer, a project manager, and a product owner.

Given the state of digital monographs as sketched above, the JSTOR Labs team wanted to understand whether there are feasible and scalable ways to improve the usability and discoverability of monographs in the humanities and social sciences—and, in turn, to grow the usage and impact of these titles. The JSTOR Labs team wanted to find ways of doing this that are extensible across disciplines and that could be relatively easily implemented and tested. Perhaps most important, whatever tool or functionality we decided to prototype, we wanted it to work with monographs that have already been published as standard format e-books. Although there are a variety of innovative initiatives that have created digital monographs with extensive features that would not have been possible in a print format, we wanted to develop a new way of presenting books that could be generalized to as many monographs published in digital format as possible, without prohibitive investment in each incremental book. On a practical level, we wanted to devise ways of presenting monographs with only the most basic digital version of a book: a full-text PDF file. Although the kinds of improvements we brainstormed could potentially be extended to journal literature and other digital texts, we concentrated our thinking on needs around the monograph precisely because, to date, there has been a comparatively greater investment in improving the user experience for journal articles, especially in the STEM fields. So far as we can tell, there has been relatively little investment in improving the user experience for humanities and social science monographs.

The JSTOR Labs team’s approach to designing and prototyping tools and functionality draws on lean start-up principles and design thinking, two closely related product development methodologies that have become popular with technology companies over the past decade.[6] Both approaches emphasize the importance of understanding the “big picture” when building a product, that is, the context within which the product sits. As such, they encourage developers of new features or products to gather continuous user feedback over the course of the design and prototyping process. At every stage, the product team should be seeking advice or data derived from users, and that feedback should inform successive iterations of the design and prototype of the product or feature in question. In keeping with these approaches, the JSTOR Labs team tries to gain a deep understanding of the prospective user for any given project and to learn rapidly from user feedback through many quick prototyping iterations of a given feature.

For this project, we wanted to develop a deep understanding of different use cases for digital monographs. We chose to focus our initial inquiry on academic users of scholarly books: faculty and graduate students. Although for some scholarly books there are certainly valuable use cases for undergraduates, professional and casual readers, and even secondary school students, we felt that we would produce the biggest improvements for the greatest number of users if we focused on readers in the academy who are likely to engage with monographs regularly.

To that end, our research process had the following steps:

  1. Preliminary user research with scholars and graduate students to get a sense of the ways in which they use monographs.
  2. A day-long discussion with a small working group of scholars, librarians, publishers, data scientists, and visualization experts who could help articulate a set of principles for the visual design of digital monographs, as well as a set of possible design concepts.
  3. The selection of one design concept that would shape the JSTOR Labs team’s subsequent development of a working prototype.

User Research

To understand how to improve our targeted researchers’ experience with monographs, we first needed to understand the diverse ways that scholars and graduate students work with them. To achieve this in advance of our group workshop, we selected an ethnographic approach. Ethnographic user research consists of observing users performing their work in situ, working as they normally would, and pulls together observations made by the ethnographer along with texts, images, and other artifacts collected during observation. This approach provides the context needed to understand the why and how behind scholarly users’ choices and methods for carrying out their research and learning activities. On this project, an ethnographic approach allowed us to understand the actions that people take with both digital and print monographs, the context within which they conducted their scholarly work, and the goals that their actions support. We felt that gathering individual stories of real people and their experiences would be an effective way to help us brainstorm new ways of presenting monographs, because with such stories in mind we would be “solving for” the use cases of these specific individuals, rather than focusing on our perception of the needs of abstract users.

We decided to focus this user research on a single discipline that makes ample use of monographs: history. We recruited six participants at various career stages, each affiliated with a college or university in the Midwest or on the East Coast of the United States (the two regions where members of the JSTOR Labs team work). JSTOR’s user researcher shadowed and interviewed each participant during an average workday. As part of our time with each of these readers, we collected notes and photos to document his or her environment, activities, tools for carrying out research, and motivations.

The user ethnographies were presented in simple visualizations.

We walked away from this research with several key takeaways, the most salient of which was the diversity of reader activities and approaches. Each of these historians had developed and honed his or her own distinct process. Additionally, we found that although each of these individuals expressed a strong preference for using print or digital formats of books in certain circumstances, these preferences did not necessarily dictate their actual use, as each historian needed to interact with both formats to complete his or her work. The final result of these ethnographic interviews was a laundry list of devices, programs, apps, and tools that each historian used. The combination of these tools and distinct processes creates a complex web of activity for each individual.

We compiled these profiles into single-page data visualizations, one for each participant. User profiles for each participant are available in appendix A.

Workshop: Articulating a Set of Principles for Redesigning the Digital Monograph

In October 2016, the JSTOR Labs team assembled a small working group of scholars, librarians, and publishers to talk about the issues surrounding the user design of digital monographs. Our objectives for the meeting, which was hosted by the Columbia University Libraries, were to understand the challenges and context facing researchers using monographs and to brainstorm a set of hypotheses about ways to improve the reader’s experience with digital monographs—hypotheses that the JSTOR Labs team could test with students and scholars in the weeks after the workshop.

In planning the discussion, we sought to include many different viewpoints by bringing together representatives from a variety of scholarly disciplines. We were grateful to have the participation of the following in the workshop:

  • Amy Brand, Director, The MIT Press
  • Robert Cartolano, Associate Vice President for Digital Programs and Technology Services, Columbia University Libraries
  • Seth Denbo, Director of Scholarly Communications and Digital Initiatives, American Historical Association
  • Kathleen Fitzpatrick, Associate Executive Director and Director of Scholarly Communication, Modern Language Association
  • Alexander Gil Fuentes, Digital Scholarship Coordinator, Columbia University
  • Laura Mandell, Professor of English Literature, Texas A&M University
  • Jason Portenoy, PhD Candidate, Information School, University of Washington
  • Barbara Rockenbach, Interim Associate University Librarian for Collections and Services, Columbia University Libraries
  • Jevin West, Assistant Professor, Information School, University of Washington
  • Robert Wolven, Associate University Librarian for Collections and Services, Columbia University Libraries (retired)

From JSTOR, Laura Brown, our managing director, and Frank Smith, director of the Books at JSTOR program and a former editorial director at Cambridge University Press, participated in the working group.

For advice ahead of the meeting, the JSTOR Labs team also consulted Catherine Felgar, former head of production for Columbia University Press; Nicholas Lemann, former dean of the Columbia Journalism School; Jim O’Donnell, university librarian at Arizona State University; and Jason Rhody of the Social Science Research Council.

Brainstorming readers' tasks and goals

The workshop featured two activities. First, in the morning, after sharing the results of JSTOR’s ethnographic work, we outlined the specific tasks that faculty and graduate students engage in when working with monographs. These tasks ranged from “close read” and “write in margins” to “find an exemplary passage to use in exams” and “explore the bibliography and notes for relevant scholarship.” We described the goals of the same monograph readers, which included “decide whether the book is worth reading” and “understand the book’s position in the scholarly conversation.” We then flagged the hurdles and challenges that these researchers faced, such as “poor writing quality and too much jargon,” “difficulty in moving between print and electronic versions of the monograph,” and “digital rights management software forces readers to read a book with it ‘under glass.’” This discussion was intended to help the working group zero in on a broad set of assumptions and principles about the ideal design of a monograph.

Second, in the afternoon, we brainstormed ways we could help researchers achieve their goals or overcome these hurdles. We accomplished this through a “design jam.”[7] Participants had ten minutes in which to sketch as many ideas as possible for improving the visual presentation and navigation of digital monographs. We then shared our ideas with one another, and participants were encouraged to steal others’ ideas and build on them during a second round of sketching. After two rounds in which over a hundred possibilities were sketched, we highlighted the most promising ideas by “dot-voting”: each participant was given three green stickers to place next to the ideas he or she found most intriguing, giving a sense of which concepts the working group found most promising for prototyping.

Using dot stickers to vote on promising ideas

Themes and Concepts for the Reimagined Monograph

The conversation and brainstorming surfaced a set of principles and concepts for reimagining the visual presentation of monographs online—principles that would serve the purpose of helping readers to make better use of scholarly books and concepts that might better expose the inherent value of the decades’ worth of books archived in online databases such as JSTOR. The discussion was wide ranging, but the working group’s comments converged on several key points.

(1) The importance of great writing is a given. As one of the working group members put it: “The quality of writing really matters.” It would be difficult to argue for the value of a monograph that is presented in an innovative way online but that is not rigorous or well written. Our entire discussion was predicated on the idea that, although changing the design or presentation of digitized scholarly books might help make them more easily usable or navigable, the most important thing about the books themselves remains the skill with which the arguments are researched and presented to the reader. No amount of design work can change that, and the working group emphasized that any design work that the JSTOR Labs team would undertake should respect the integrity of the long-form argument as a complete narrative.

The writing is what matters.

(2) The ideal digital monograph should allow different kinds of readers to navigate it in different ways. Many online platforms for digital scholarly books display chapters or entire books as a single, scrolling PDF—a format that (quite reasonably) assumes linear, continuous reading of the entire argument. But we know that scholars and students have other modes of reading. In our observations of the participants in our ethnographic user research, we tracked four distinct, common user needs: citation mining, extracting specific information from the book, immersive reading, and reusing or revisiting a text (see appendix A). The ideal digital monograph—dubbed during the workshop a “scholarly Kindle”—would be designed in a way that allows users to switch easily from mode to mode. It would also allow users to engage in the same mode in different ways. For example, researchers’ home discipline, their career stage, or simply their technical proficiency may influence whether they prefer to engage in close reading online, in print, or in a hybrid of the two. Ideally, a digital monograph would be designed in ways that enable those shifts easily—for example, by allowing a user in immersive-reading mode to flag a paragraph or section for subsequent citation. Similarly, the digital monograph might allow the reader to seamlessly switch between reading and annotation mode.

(3) Readers should be given better tools to assess the content of online scholarly books quickly and efficiently. Unsurprisingly, members of the working group voiced a concern about the sheer number of available books on a given topic and about a lack of existing functionality for helping them to make sense of whether parts of a book are valuable to their research or teaching. “How do I quickly understand whether something is worth reading at length? How do I assess the importance of the work to my own research quickly?” Tools for assessing content might focus on giving readers better insights into the topics of a book (a process that could be achieved by text mining or otherwise applying models to large chunks of machine-readable text), might allow users to “vote on,” tag, or assess a given book, or might use other means to enable readers to evaluate a book’s relevance quickly.

(4) Readers should be able to navigate more quickly to the portion of a book they are interested in. Users sometimes need to home in on extended passages on specific topics or to search for facts to support an argument. Both are goal-oriented approaches that depend on the reader being able to discern where in a book a given topic is discussed—which in turn depends on the accuracy and completeness of the book’s index, the likelihood that a keyword search will be successful, or the quality and specificity of the book’s chapter titles. These search methods are important, but they all have failure points. Even while the working group acknowledged that treating a book as a loosely connected set of chapters does not sufficiently respect the intricacy of a long-form argument, its members agreed that finding new ways to help steer readers more quickly to the parts of the long-form argument that are relevant to their needs could be one important part of unlocking the value of these titles for new and broader audiences.

(5) Readers should be given better functionality for situating a book within the larger scholarly conversation. Participants in the working group mentioned a quick scan of a book’s footnotes or endnotes as a productive way to understand which historical lines of scholarly inquiry the author addresses in the book. But it can be labor intensive to manually review isolated citations from the book and track them down one by one. This process also tells only half the story of the book’s place in the long-term scholarly discussion; a simple scan of citations may reveal how the book has drawn on past scholarship but not what influence the book itself may have had on later books. “The ability to understand how what you are reading now has been cited after its publication seems like a missing piece,” one working group participant said, noting that this would be possible with the use of linked data and citation networks. The ability to position a book—and its constituent parts or arguments—within the scholarly discussion of which it is a part would be quite valuable to researchers.[8]

The book, in the network of scholarly conversation

(6) Readers should be able to “flip” between sections of a digital monograph as easily as they can in a print book. The apparatus of scholarly monographs—endnotes, indices, and other devices—comprises crucial tools for assimilating a long-form argument. “I actually read the endnotes of a book first [before reading the main text] to understand the concepts being presented,” one participant said. But these tools arguably have not transitioned well to the digital environment: readers find themselves pressing CTRL+F to execute simple keyword searches on a PDF, moving back and forth between the main text and the notes (which might be presented by a publisher or vendor as separate PDF files). One participant expressed the desire “to be able to shuttle among different aspects of the text” as easily as flipping pages in a print edition.

(7) In an ideal world, readers would be able to work simultaneously with both a print and digital edition. Each of the participants within the user research study, as well as many of the workshop participants, worked with both print and digital books, depending on context, availability, and their immediate goal. Many worked simultaneously with both print and digital editions of the same book: for example, a reader might read and annotate a printed book while cutting and pasting relevant passages of the same material in a digital version into their “notes” file or a citation management system. It would be ideal if this synchronicity could be maintained by means other than manual page-turning—if, instead, the digital version of a book could “sense” when a page had been turned in the print version. This could be accomplished, for example, by a setup in which the camera of a computer or phone would read the physical page and then text matching would be used to jump to the proper section of the digital book. (On a very practical note, the working group observed that standardizing the pagination of digital and print editions of the same book would be a good starting point.)

Print and digital synchronicity

(8) It should be easier to use digital books simultaneously with other scholarly resources, including primary texts, reference works, journal articles, and other books. As Jim O’Donnell of the Arizona State University Library pointed out to us: “The image of the scholar at his ‘reading wheel’ like Gabriel Harvey, or of the satirized pedant with a desk piled high is an accurate one. . . . A given book is readable and usable when the right other books are open next to it and comparison and movement back and forth are facilitated.”[9] The interoperability that this would require suggests the opportunity to use linked open data and standard identifiers, such as an Open Researcher and Contributor ID (ORCID) or International Standard Name Identifier (ISNI), which could facilitate the easy, standardized movement from the free-form text of a monograph to related content. This interoperability might also provide greater ability to link a book with the datasets that underlie it, as is supported by the Mellon-funded Fulcrum publishing platform (http://www.fulcrum.org).

(9) Digital books should be able to “travel” easily from device to device. It would benefit readers to be able to move not only between print and digital editions more easily but also seamlessly between different devices and thus be able to take advantage of those different digital environments to facilitate different types of research and user behaviors. For instance, the same digital edition could be optimized on desktop screens for comparing and annotating across texts, on mobile devices for swiping and tapping through more goal-oriented tasks, and on tablets for a wonderful immersive reading experience. In general, scholarly content has not been formatted or presented online in ways that take advantage of mobile devices. Similarly, we hope that the same digitization techniques that can power a more portable digital edition may also support better accessibility for visually impaired readers.

“The Scholarly Reader,” supporting multiple modes of engagement

(10) Readers should be able to interact with and mark up digital books. The working group returned several times during the discussion to the importance of interacting with a text: annotating, highlighting, and copying and pasting passages. Yet they had the sense that relatively few digital scholarly platforms have functionality to support these activities. This emerged as a particular frustration, especially because scholars may have already developed complicated, idiosyncratic systems for marking up print books. Said one participant in the working group, “I use different colors for annotations to give myself different kinds of signals about the type of annotation: argumentative, fact-checking, rewriting, and so on.” (Although the group perceived a lack of functionality in these areas, several initiatives in the scholarly communications community, such as Hypothes.is, are working to address the challenges around annotation—a gap suggesting there is much progress yet to be made both in the development of these tools and, just as important, in fostering their widespread adoption.) The working group felt that any technology platform solution for scholarly e-book annotation should 1) offer a standard export feature for personal notes; 2) support a range of sharing options, from private and group to institution wide and public; and 3) enable the long-term accessibility and preservation of the annotations. “The annotations,” one participant said, “have to be able to escape the book file.”

(11) Readers should be able to interact with books in collaborative environments. Reading is, for good reason, typically thought of as a solitary activity, but the working group returned over and over to the possibilities for sharing—whether with a private and defined group or with the world at large—readers’ notes and embellishments on digital book files. The group also identified very practical use cases for collaborative reading. For example, the qualifying exams for graduate degrees require students in the humanities and social sciences to become proficient with a very broad range of foundational literature, adding up to many hours of reading. Shared annotations and other forms of digital “group reading” could help graduate students, who often work in very narrow subdisciplines, to become familiar with the canons of their specialized areas more efficiently, allowing for collaboration “not just among students at the same institution, but among students across institutions,” as one member of the working group put it.

(12) Ideally, digital book collections and aggregations would offer the opportunity for serendipitous discovery—the “library stacks” effect. Everyone in the working group had an affection for the experience of wandering through the stacks of an academic library and coming across the book you never knew you needed (a fitting sentiment for a meeting hosted only a few yards from the stacks of Columbia University’s main library). There are a number of highly creative and usable online tools that offer users a more visually engaging browsing experience for e-books: one of the best, the Harvard Library Innovation Lab’s StackLife viewer (http://stacklife.harvard.edu), allows users to browse e-book and print book records in different contexts, such as by placement in Harvard’s physical library shelves, by subject heading, and in order of most checked out or circulated titles. But the impression of the working group is that few publishers or content platforms in the scholarly world have put similar thoughtfulness into their own browsing and navigation structures for e-books. Many lack even the functionality to offer automated recommendations of similar books—functionality that is well over 15 years old for commercial sellers of digital books and other content.

(13) Digital scholarly book files should be open and flexible. This is as much a design question as it is a business question for publishers and libraries. The working group returned several times to the importance of scholarly book files being available in nonproprietary formats that allow for a variety of uses and re-uses. “The flexibility of being able to read a book wherever and whenever—even when moving from device to device—feels important to me. I want my books to be genuinely mobile,” one member of the working group said. Another pointed out that the backlist corpus of scholarly books in the humanities and social sciences is an invaluable resource for text mining, but the ability to carry out that research at scale means that the underlying text of the books must be easy to extract. “It’s so important to be able to ‘scrape’ the text,” one participant said, using a common term for gathering machine-readable characters from a human-readable artifact (e.g., a scanned page image). Another said she needed a system that didn’t force her to “read the book as if it was under glass.” Many publishers and vendors have been reluctant to distribute digital books—and especially recently published books—without DRM software, which restricts the ability of a reader to share a book file or “migrate” it from device to device. Publishers fear that doing so may damage book sales and, over time, seriously erode their ability to recover their costs and support their editorial and peer review activities. As we will discuss in the following section about the design of a prototype, we used books that are hosted on JSTOR and that JSTOR has permission to make available without restrictive DRM software—albeit in PDF format, a format that until recently had been proprietary and yet had become standard for reading digital scholarly materials. Whether a wider group of publishers and technology vendors will feel that they can enable these more expansive uses of a book file without upending the sustainability of the scholarly publishing system is a larger question than this project sought to answer.

The 13 principles above cover a broad range of concerns around digital scholarly books—not just their design but the technical, legal, and business concerns that underpin scholarly communications at a system-wide level. There are enough challenges and opportunities identified here to fuel an ambitious agenda of collaborative experimentation for years to come. The JSTOR Labs team sought to identify a specific design improvement that could address several (but not all) of these principles, be immediately useful, and be implemented and tested with users very quickly in the weeks immediately following the workshop.

Selecting a Concept for Development

Drawing on the workshop’s scores of ideas, over one hundred individual sketches, and dot-voting exercise results, the JSTOR Labs team winnowed the list of potential concepts to explore based on the following criteria. We eliminated some because they were ideas that others in the community would be better placed to develop. For example, the many ideas around scholarly annotation might be better addressed by an organization like Hypothes.is. We removed others because we feared that they were technically infeasible or would be challenging to scale. For example, visualizing the citation network leading to and descending from a monograph would likely require a substantial investment in each book’s metadata for it to be effective. We were excited by this idea, but for this design sprint, we were aiming for a concept that, if proven valuable, could be leveraged quickly and easily across the tens of thousands of monographs available on scholarly e-book platforms.

One workshop idea: “The Book-as-Portal-to-Other-Scholarship”

Even after employing these filters, we were left with a handful of exciting ideas: “The Way-Better Table of Contents,” “The Topic Explorer,” “The Scholarly Reader,” “The Book-as-Portal-to-Other-Scholarship,” and “The Scholarly Influence Graph.” To help us choose among them, we carried out another user feedback exercise at the Columbia University Libraries, after the working group meeting. For each of these concepts, we put pencil to paper and created simple prototypes. The prototypes had just enough detail to convey the basic idea but not so much that users would focus on distracting details.

We then showed these prototypes in one-on-one interviews with six Columbia graduate students and faculty in humanities and social science disciplines. They were by no means a representative sample, but we wanted to get an impression from a group of researchers about the usefulness and intuitiveness of the various ideas for an experimental interface. Did they understand what was being proposed, and could they imagine it as helpful for their research process? Would it duplicate tools that they already use, or would it improve on them? These users were especially drawn to “The Book-as-Portal-to-Other-Scholarship,” a concept that turned the book into a vehicle for discovering other, related content, and “The Topic Explorer,” which helped users to better understand the topics and subjects covered within a book. Users told us that although the first proposed prototype could be helpful and better than alternatives, it would meet a need for which they already had solutions, such as using the library’s online catalog search. By the end of the day, we had decided to develop the topic explorer tool.

Another workshop idea: “The Topic Explorer”

In the following weeks, we proceeded to incubate the topic explorer concept along two concurrent paths, the first of which was to develop the data and infrastructure needed for the prototype, and the second of which was to develop the actual prototype through iterative testing with users.

Developing the Data and Infrastructure

In developing this prototype, we were able to build on work that JSTOR has had underway for the past few years. JSTOR has been exploring approaches for algorithmically characterizing texts: automatically tagging or classifying texts based on entities associated with those texts, such as the specific topics that the text discusses, people or places named in the text, and so on. As much of the content in the JSTOR archive consists of unstructured text (primarily generated via optical character recognition [OCR] scanning), the ability to analyze and automatically categorize these journal articles and books is essential for building more sophisticated discovery and recommendation tools. One promising approach JSTOR has been developing involves the use of a custom-built, hierarchical, controlled vocabulary of concepts; a rule-based engine for tagging documents with one or more of the concepts; and a topic model and inference engine.[10] Using these tools, we combine a human-curated thesaurus and rule set with computer-based text analysis to associate texts (and portions of texts) with concepts from the controlled vocabulary. This allows us both to identify portions of a text that are likely to be on a given topic and to name those topics using terms from the human-curated thesaurus.
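
To make the general shape of this approach concrete, here is a minimal, hypothetical sketch of rule-based tagging against a controlled vocabulary. It is not JSTOR’s production engine: the vocabulary, the hit-count rule, and all names are invented for illustration, and the real system layers a hierarchical, human-curated thesaurus and a trained topic model on top of rules like these.

```python
import re
from collections import Counter

# A tiny, invented slice of a controlled vocabulary: concept -> surface terms.
# The real thesaurus is hierarchical and human-curated; this is only a stand-in.
CONTROLLED_VOCABULARY = {
    "Gardening": ["garden", "gardening", "harvest", "seed", "soil", "backyard"],
    "Labor history": ["wages", "strike", "factory", "union"],
}

def tag_passage(text, vocabulary, min_hits=2):
    """Associate a passage with concepts whose terms occur at least `min_hits` times."""
    counts = Counter(re.findall(r"[a-z]+", text.lower()))
    tags = {}
    for concept, terms in vocabulary.items():
        hits = sum(counts[term] for term in terms)
        if hits >= min_hits:
            tags[concept] = hits
    return tags

passage = ("Each spring the backyard garden was turned over, seed was sown, "
           "and the family counted on the autumn harvest.")
print(tag_passage(passage, CONTROLLED_VOCABULARY))  # {'Gardening': 4}
```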

In our prior applications of this text analysis approach, the concepts were associated with complete documents, such as scholarly articles. During the technical feasibility stage of this project, we performed some tests to help us understand whether the approach would work on partial texts—in particular, those documents for which there is no markup to delineate sections or chapters. To carry out these tests, we segmented a monograph into smaller portions and then associated topics with each of those portions, thus identifying “hot spots” for a given topic within the larger monograph. Exposing this data in a suitable visualization would, we hoped, provide readers with both a bird’s-eye view of the document as a whole and a convenient means for quickly navigating to a specific section of interest.
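
As a rough sketch of that segmentation test (the page-level segmentation, the score threshold, and the output shape are assumptions, not the production pipeline), the snippet below reuses the hypothetical tag_passage helper from the previous sketch to score each page of a book and collect the per-topic “hot spots” that a visualization could plot.

```python
def topic_hotspots(pages, vocabulary, threshold=3):
    """Score every page against each concept and keep pages scoring at or above `threshold`.

    `pages` is a list of plain-text strings, one per page (e.g., OCR output from a PDF).
    Returns {concept: [(page_number, score), ...]}, a shape a topic graph could plot.
    """
    hotspots = {concept: [] for concept in vocabulary}
    for page_number, text in enumerate(pages, start=1):
        for concept, score in tag_passage(text, vocabulary, min_hits=1).items():
            if score >= threshold:
                hotspots[concept].append((page_number, score))
    return hotspots
```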

Building the Prototype Monograph Viewer

The second path to developing the topic-explorer concept involved designing the interface that would expose this data. We knew that we wanted to visualize the topics within a book and to use those same topics to help readers navigate to relevant pages within the book, but understanding how best to meet those user needs required further design iterations. Over the coming weeks, we conducted multiple rounds of user testing with an evolving design to home in on an interaction that users would find both intuitive and powerful. These design rounds began with grayscale wireframes, but as we got closer to something that users both understood and were eager to try out, we switched to high-fidelity mock-ups (i.e., fully designed versions of several relevant web pages that are not actually live for use). Through these iterations, we explored a variety of ways to present the topic data visualizations, ranging from treemaps to line graphs.

Early design iteration: Grayscale wireframe

We also tested a variety of ways to navigate from a topic heading to relevant sections of the book. Researchers told us that they usually look at anywhere from five to twenty pages of a monograph online before deciding to download the full book file or acquire a print copy. Our goal for this tool was to make it possible to conduct that evaluation more effectively. The tool should allow users to better target the pages they look at and then more quickly evaluate the usefulness of those pages to their research, allowing them to view more pages in the same amount of time. This led us to two key findings. First, although we originally presented the topic browser as separate from the full-text reading experience (as in the high-fidelity mock-up), we found that placing the page viewing directly next to the topic visualizations gave users the ability to more easily navigate between the two, increasing the number of pages they might use for an evaluation. Second, highlighting within the page enabled users to skim through a page more quickly, although this highlighting needed to be turned off when users began to close read. We then returned to the data and infrastructure work to implement both of these changes. After trial and error, we were able to embed both functionalities in the prototype interface.

Early design iteration: High-fidelity mock-up

With this work completed, the JSTOR Labs team returned to the Columbia University Libraries for a week of rapid development. With the collaboration and support of Columbia University Libraries staff, we conducted more usability testing with faculty and graduate students, further refining the tool by improving aspects of the user experience (e.g., adding the table of contents as an additional means to navigate) and providing more information to help users understand the tool and its topics. By the end of the week, we had a completed tool, Topicgraph, with a design that users understood and were eager to use.

The completed prototype, available at http://labs.jstor.org/topicgraph, includes a small collection of university press–published scholarly books from a variety of disciplines in the social sciences and humanities. (We are grateful to Cornell University Press, The MIT Press, University of Michigan Press, University of California Press, and UCL Press for allowing their books to be part of this experiment.) These books were processed from PDF files. For some newer titles, the books are born-digital files, but for many of the older titles the PDF files consist of scanned images of the original print pages with OCR text. This is consistent with one of our goals for this project: to engineer a viewing solution for monographs that would not require any special formatting of the underlying book files. For some of the books included in the prototype, we could take advantage of chapter-level metadata, allowing us to show chapter breaks in the topic graph and to display a table of contents.

Next to each book, the tool displays the top 15 to 25 topics associated with the book, along with a graph that users can click on to navigate to pages associated with each topic. Because we used a controlled vocabulary of concepts and topic modeling, as noted above, these are not simple keyword matches. Each topic in the topic model is composed of many individual terms that suggest that the topic is being discussed. The more these terms are used in proximity to one another, the more likely that a particular topic is being discussed. For example, if the terms carrots, seed, harvest, and backyard are used in close proximity to one another, the topic model might suggest that the topic being discussed is gardening, even if the word gardening itself is never used in the book. In the interface, the terms associated with gardening are then highlighted within the page when a user clicks on the gardening graph.
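
The inference engine itself is not described in this report, but the gardening example can be illustrated with a hedged sketch: treat a topic as a list of terms, score a page by how many distinct terms co-occur within a sliding window of tokens, and return character offsets that a viewer could highlight. The window size, scoring rule, and term list below are invented for illustration.

```python
import re

GARDENING_TERMS = {"carrots", "seed", "harvest", "backyard", "soil", "trowel"}

def score_and_highlight(page_text, topic_terms, window=50):
    """Score a page for a topic and return spans of its terms for highlighting.

    The score is the largest number of distinct topic terms co-occurring within any
    `window`-token window; the spans are (start, end) character offsets in the page.
    """
    tokens = [(m.group().lower(), m.start(), m.end())
              for m in re.finditer(r"\w+", page_text)]
    spans = [(start, end) for word, start, end in tokens if word in topic_terms]
    best = 0
    for i in range(len(tokens)):
        in_window = {word for word, _, _ in tokens[i:i + window] if word in topic_terms}
        best = max(best, len(in_window))
    return best, spans

page = ("The backyard plot was small, but careful seed selection and an early "
        "harvest of carrots kept the household fed through the winter.")
score, spans = score_and_highlight(page, GARDENING_TERMS)
print(score)  # 4 distinct gardening terms co-occur, though "gardening" never appears
print(spans)  # offsets a viewer could use to highlight those terms on the page
```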

The Topicgraph prototype

Testers found the user experience to be relatively intuitive and a useful augmentation of the means that they currently use for assessing the relevance of books to their research, such as skimming the book’s table of contents or conducting quick keyword searches. They were also eager to explore the tool for books and subject areas they were familiar with in order to evaluate the quality of the topics identified in the tool interface. The results of that exercise were mixed; for some books, the topics identified by the algorithm and the associated highlighted keywords met their expectations, but for others, they did not. This test highlighted one shortcoming of the tool: some topics in our topic model are well formed and robust, whereas others are less so. (The topics that are most robust tend to align with the content strengths within the JSTOR corpus—that is, the more content that JSTOR hosts on, say, the history of capitalism, the better informed the algorithm will be in identifying key topics in a book on that subject.) In the eyes of these users, the extent to which the Topicgraph viewing tool is useful depends entirely on the quality of the topics raised. A poorly formed topic can lead to either false positives (wrongly attributing a section to that topic) or false negatives (failing to attribute a section to that topic). So, an important avenue for future development of a tool such as Topicgraph would be to continue adjusting the algorithm in ways that improve the quality of the key topics it identifies for any given book.

To support the desire of users to evaluate the topic model with content with which they are familiar, and to analyze documents not in the JSTOR corpus, we also developed an experimental “Topicgraph my document” function. With this feature, users can upload PDF documents of their choosing and create topic graphs of those documents.
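
The upload pipeline behind this feature is not detailed here; as one plausible first step (the choice of the third-party pypdf library and the reuse of the earlier hypothetical helpers are assumptions), a per-page text extraction like the sketch below would produce the input that the scoring sketches above expect.

```python
from pypdf import PdfReader  # third-party library; any per-page text extractor would do

def pages_from_pdf(path):
    """Return one plain-text string per page of an uploaded PDF."""
    reader = PdfReader(path)
    return [page.extract_text() or "" for page in reader.pages]

# pages = pages_from_pdf("my_monograph.pdf")               # hypothetical file name
# hotspots = topic_hotspots(pages, CONTROLLED_VOCABULARY)  # reuse the earlier sketches
```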

We share this work in progress with the community in the hope that what has already been learned and built will be valuable and that it might catalyze further discussion and solutions. If there is interest in the community, there are potential next steps to explore for Topicgraph:

  1. Gather community and user feedback. Over the course of this project, we have collected a great deal of qualitative data from users and the panel of experts assembled at the workshop. We are eager to add to that feedback and expertise in two ways. First, gathering feedback and insights from the larger scholarly community will help to ensure that this tool can be as broadly applicable as possible. We are also eager to supplement the qualitative data from users with quantitative data based on actual usage of the tool. Analytics of the working site will help us to see which features and tools are most used, while social media shares of the site will be a strong indication of overall interest.
  2. Further develop and refine the topic modeling approach. This tool is only as good as the data that supports it. Early indications are that for many disciplines and titles the current implementation, which takes advantage of a topic model based on JSTOR content and metadata, is sufficient but can be significantly improved. For example, it would be beneficial to work with subject matter experts to identify “training documents”—that is, text documents that include typical vocabulary for a given content area and that thus can be used to “train” the topic modeling software—for each subject area. It would also be interesting to test Topicgraph with topic models based on collections of digital scholarly texts other than the content set hosted by JSTOR.
  3. Explore incorporating this tool into platforms at the point of evaluation. We hope that, if additional community and user feedback warrant, publishers and platform providers will explore incorporating this tool or one like it into their platforms. To facilitate this, we have made all of the application code open source. It may also be worthwhile to explore other means of integrating this tool. For example, an API or embeddable widget may make it easier for platform providers, whereas a browser plug-in may be useful for end users who want this functionality wherever they might go to do their research. Incorporating this tool into other platforms may also provide the ability to use the topics identified to analyze corpora and traverse them in new ways. Users could see trends of topics over time and across disciplines and use those topics to browse and discover material.

We are eager to hear feedback on the tool and welcome comments and suggestions at labs@ithaka.org.

Closing Thoughts

The prototype Topicgraph tool is, of course, just one way in which to reimagine the digital monograph. We identified plenty of other ideas that are ripe for exploration, such as how to visually represent a single monograph in the overall network of citations, and many experiments are already underway, such as the system-wide, flexible, open annotation solution being developed by Hypothes.is. Our working group also pointed to other challenges of the monograph that have little to do with its digital representation: for example, what might be a viable long-term business model for monographs and whether publishing monographs in a free-to-read, open-access model can be made sustainable. Another concern is to ensure that monographs that include nontraditional, born-digital elements are evaluated fairly in tenure and promotion processes. Still another consideration is how to ease the process for text mining across a wide range of the monographic literature without having to secure permission from the hundreds of different publishers within the scholarly communications ecosystem.

What these challenges have in common is that many, if not all, of them are bigger than any single organization or group. The reimagined monograph—whatever that ultimately means—will not be built in a single step or by a single organization. Libraries, publishers, scholars, scholarly societies, and others will all have a role to play in promoting standards, in convening thinkers, in carrying out technology development, and so on; in doing so, they will be drawing on the wonderful history of collaboration in the scholarly communications community. The Topicgraph prototype, and the design process that informed it, may be just one small piece of what is possible. We look forward to working with others in the community on this and other initiatives that will help make the monograph as useful, innovative, and broadly available as it can be in the digital environment.

Acknowledgments

The working group’s discussion and subsequent e-mail follow-ups provided the development team with a rich foundation of ideas from which to work. We gratefully acknowledge the participation of Amy Brand, Robert Cartolano, Seth Denbo, Kathleen Fitzpatrick, Alexander Gil Fuentes, Laura Mandell, Jason Portenoy, Barbara Rockenbach, Jevin West, and Robert Wolven.

The Columbia University Libraries and the staff at Butler Library generously offered to host the workshop and helped the development team to test interface ideas with graduate students and faculty.

Nicholas Lemann of Columbia University, James O’Donnell of Arizona State University, and Jason Rhody of the Social Science Research Council provided invaluable advice.

Several readers offered comments on a publicly available draft of this article, including Vlad Atanasiu, Martin Paul Eve, Siobhan Leachman, Jared McCormick, Curtis Michelson, Darrin Pratt, Jeff Pooley, Kizer Walker, Don Waters, and Charles Watkinson. We thank them for their advice and guidance.

Appendix A: User Profiles

The following user profiles are from a set of ethnographic studies that JSTOR’s user researcher carried out with six scholars and graduate students in the history discipline in preparation for the working group meeting. (Certain identifying details have been changed or omitted from the public version of this report.)

Andrea

Overall, Andrea is organized and intentional about her time and activities. From study to social groups and exercise, she is conscious of planning and executing in effective and consistent ways.

Andrea is very aware and evaluative of the methods she uses to keep herself organized and on track. She utilizes a bullet journal for daily, weekly, and monthly planning, which she refers to as her “analog journal for the digital age,” in addition to using OneNote for two-week planning of work on her thesis. She even attended a dissertation boot camp to develop her skills in breaking down and tracking her own work.

During the first half of the day spent with Andrea, she was working out of a single book. Although the book’s secondary analysis is not her focus, the time frame and region it discusses are relevant to her work. She is using citations in this text to identify which items she will want to view at the various archives she will visit this year. She types each citation into Google Books and checks the location; if it is available in an archive that she is visiting, she then looks at what the book says about that citation. If the reference is valuable, she adds the citation to an Excel spreadsheet with notes. She also visits the various archive websites to get call/catalog numbers as part of this process. Although she does not prefer digital books for every kind of work, for this process she would have preferred one; her library, however, did not own a digital copy of the book in question.

Beth

At the time of the interview, Beth was immersed in studying for qualifying exams in US history, which required her to read and review more than 150 books in just three months. She struggles with her desire to read each book end to end and finds she does not have enough time to do so. From others in her program, she has learned about the “Grad Student Read,” which she describes as reading a book’s introduction, conclusion, table of contents, and a few chapters. She also made reference to “gutting the book,” which means reading just enough to pull out a quote or two. She feels that she might be more successful if she were able to use these adjusted reading approaches.

Given her need to engage with so many texts, Beth has developed comfort and competence with many applications that help her navigate different book formats and availabilities. For example, she uses TurboScan to take PDF-like photos, ultimately creating her own PDF versions of physical documents. At times, she will also transform a digital version from a given format to PDF, which she prefers because a PDF format is compatible with many programs.

The stress of preparing for these qualifying exams has taken a physical toll on Beth and some of her classmates. She describes knee injuries, teeth grinding, and back and vision problems, all stemming from stress and extended study sessions.

Tiffany

Tiffany is currently working on her dissertation and job hunting for the 2017 fall semester. She struggles with the context switching needed to finish her dissertation, prepare her résumé, and search for employment. At one point in the day she becomes frustrated when an e-mail comes in that she feels she must respond to although she is focused on a different task.

Tiffany’s primary task for the interview day was creating a sample course to include in her résumé package as an example of the type of courses she would bring to an institution. She was just beginning this work and explained that at this early stage she is looking for resources that will help her either spark new ideas or refine existing concepts for the course. The content and purpose of the course are still in development. In her own words, “I’m trying to figure out what I’m trying to put across.” In this process, she bounces between various websites, Google searches, and documents. She did not seem to have any formal process, nor was she keeping track of what she had looked at.

This course-creation task stands in contrast to Tiffany’s work to finish her dissertation. Whereas US historians typically take archive trips over several years and would still be making final research trips at this stage, Tiffany notes that, as a historian with an international focus, she had a single year of archival travel, split between archives in India and London. She had one chance to collect all of the archival documentation she needed, which required being very well prepared ahead of travel. Additionally, because of limited space during international travel, Tiffany scanned 20 books from the university library in advance so that she could take them with her.

Karen

Karen conducts research in support of the courses she teaches and for the book she is beginning to write, which will be her second. In reflecting on the process she used in writing her first book (which took 30 years to complete), she recognizes that much has changed with regard to technology. She hopes to adapt the process she used for her first book; for example, while writing it she created her own catalog and filed every source within that structure. What seems unlikely to change for Karen is her reliance on paper. She uses paper to keep track of all sources; even when she engages with digital resources, she always obtains a printout or some other physical copy before using the content. At this point she has six file drawers full of printed text, a reduction after she recently downsized to move from the Midwest to the East Coast of the United States.

Further highlighting her dependence on paper documents, Karen owns a specialized digital camera for generating her own printed materials. The camera has a “text mode” specifically designed to take photos of printed text. As she is reviewing print materials, when she finds a useful section, she takes photos of the pages, uploads them to her computer, prints them out, and then deletes the photos from the camera and computer. She then reads, annotates, and files the printed copies.

When she left the Midwest for the East, Karen was also moving from a large university to a small university, and she feels the impact of that in her research budget. She describes the move as going “from feast to famine” in regard to financial support of research resources and conference attendance. She is now able to attend only one conference a year that requires travel. She chooses to attend small, topic-specific conferences for networking instead of the larger conferences. She does, however, still use the large conference catalogs and other documents to see who is presenting and track down interesting publications.

Aaron

Aaron conducts the majority of his research from his home office in a historic East Coast neighborhood. The walls are lined with bookshelves, which he has organized into sections that suit his own needs. Although he does reference these physical books, and values the context that their covers and texture provide, the majority of his work is done on his computer and within a few select programs. He uses ProCite, which houses all his notes, comments, uses, and reference information going back more than 20 years; this includes thousands of individual entries. Within this program, he has created his own taxonomy and fields (such as journal information, call number, language, and frequency of publication). The program is dated and requires significant workarounds. For example, apostrophes cause the program to delete sections of text, so when he includes text from other sources he copies the passage into a Word document and manually replaces each apostrophe with a substitute character. Even with all the issues Aaron experiences with ProCite, he continues to use it because he believes he would not be able to retain all the information he has collected over the last 20 years were he to move to another program. When asked if he fears the program becoming unusable, he said, “I try not to think about it.” In addition to ProCite, he uses Adobe Professional and Adobe Acrobat to collect, catalog, and save digital sources. He even takes downloaded book chapters and stitches them together so that he can save full PDF versions of digital books.

Angela

Angela is an affiliated scholar at a midwestern university—a status that offers no financial compensation. She has been at various universities for two-year stints as an adjunct faculty member. She describes that lifestyle as stressful and taxing, with very little pay. She is currently working in a university cafeteria to support herself.

With these frequent moves and transitions between institutions, Angela has several times found herself without access to academic resources. At one point she even shifted her focus of study from earlier to more current social movements, so that she could make greater use of open web and news sources in her work.

On the interview day, Angela began her work with free writing, which she often does to start her day. As she describes it, this process is intended to serve as inspiration and may ultimately turn into a conference presentation or publication. To do her free writing she uses one continuous Word document: she scrolls to the bottom of the 70-page document, enters the date, and begins writing. The writing is fairly unstructured; sometimes she adds specific notes and citations from books, and at other times she simply expresses her thoughts or ideas without reference to any source. The document she was working on that day represents three years of writing. She explains that at times she will revisit notes from previous dates. She has no particular method for doing this; she just scrolls and scans the document.

Appendix B: Landscape Review

The basic motivation behind the Reimagining the Digital Monograph project—to harness the power of the digital environment to change the presentation of the book—is nothing new. Almost since the introduction of widespread Internet access in the United States, scholars, librarians, publishers, technology intermediaries, and others have been experimenting with new ways to reshape the most traditional and durable of content formats around the most revolutionary of technologies. These experiments have been as diverse as the content found within the books they sought to reinvent, but many have focused on scholarly books and support for the researchers using them.

One early and foundational experiment in producing a digital scholarly book was the American historian Edward Ayers’s Valley of the Shadow: Two Communities in the American Civil War, a web-based project founded in 1995 that gathered digitized primary source objects about two counties in Virginia and Pennsylvania.[11] Although curated collections of digitized primary sources are now quite common, allowing users to filter through a carefully selected set of historical documents in their own ways—and thus to construct their own narratives—was arguably revolutionary at the time. Indeed, the very question of whether The Valley of the Shadow constituted a new form of scholarship and argumentation—and even whether it counted as a monograph in the first place—served as a source of vexation to at least one reviewer: “If the publicity [for the project] is right that ‘history may never be the same,’ Valley must show that it enables the reader to ‘take control’ in a way not made possible by any publication of rich primary sources. Nowhere does Valley begin to defend that argument.”[12] Nevertheless, the project is frequently cited as a touchstone by other scholars who work on digital book projects. (An interesting footnote to the meta-discussion of whether or not The Valley of the Shadow could rightly be thought of as a monograph: the work seemed to attract academic reviews only when Ayers and Anne S. Rubin published a print-book-with-CD-ROM distillation of the website with W. W. Norton in 2000.)

Around the same time, the American Historical Association (AHA) and Columbia University Press collaborated to solve a number of perceived ills with the print monograph in the history discipline. The Gutenberg-e program, which was underwritten by the Andrew W. Mellon Foundation, enabled the publication of first books by young historians working in specialized subfields. The aim of the program was not only to subvent the publication of books that scholarly presses might not otherwise take on because of the narrow audiences for such specialized subject areas (even by the standards of a university press circa 2000) but to do so in digital form and in ways that would encourage the embedding of primary sources (both text based and in other formats) alongside the scholarly argument and enable potentially different ways of presenting a scholarly argument. Then-AHA president Robert Darnton described his vision of the kind of book the Gutenberg-e program would enable to be published:

A new book of this kind would elicit a new kind of reading. Some readers might be satisfied with a quick run through the upper narrative. Others might want to read vertically, pursuing certain themes deeper and deeper into the supporting documentation. Still others might navigate in many directions, seeking connections that suit their own interests or reworking the material into constructions of their own. In each case, the relevant texts could be printed and bound according to the specifications of the reader.[13]

A 2004 scholarly review of the program argued that the books published through the program “are technically impressive” but that “their electronic form does not yet provide a new sort of interpretation or new ways of reading history.”[14] The digital form of the books introduced linked and embedded images and other materials—no small feat at the time—but did not bring about more radical redefinitions of how a scholarly argument might be presented. The Gutenberg-e program ended the publication of new titles in 2009.

As new digital book (or book-like) initiatives such as The Valley of the Shadow and Gutenberg-e launched, the potential problems around a lack of standards for technology and production became more apparent. As the former director of the Gutenberg-e program wrote, “The early [Gutenberg-e] e-books, in particular, were designed and built as customizable projects, rather than reproducible templates. Although this system resulted in highly original and innovative publications, it was also expensive in terms of time and staff.”[15] Several tools and platforms attempted to address this challenge by offering ready-made solutions to launch born-digital books in formats other than simple PDFs. Sophie, a software package for authoring books and journal articles that incorporate multimedia elements, was first released in 2007. Scalar, a software package for a similar set of uses, was developed later by Tara McPherson, a film studies professor at the University of Southern California, with early participation from a respected press that published a book via the platform.[16] A variety of initiatives for publishing new forms of scholarly books, including those that incorporate multimedia elements, have been announced over the past several years. Many of these originated with funding from the Andrew W. Mellon Foundation, such as a library-press collaboration at West Virginia University that is developing software for assembling and displaying multimedia-rich books and journal articles.[17]

Despite the development of these innovative projects in the 1990s and early twenty-first century, the pace of digitization for scholarly monographs in the humanities and social sciences arguably lagged far behind. Scholarly publishers started to make monographs available online in reasonably large numbers at least as early as 1998, when NetLibrary, an early e-book aggregation for libraries, launched.[18] The real inflection point, however, seems to have come later, around 2009, when university presses took note of the growing success of commercial e-book projects (such as Amazon’s Kindle and Barnes and Noble’s Nook) and pushed to make more of their titles available in digital format. In that year, a group of American university presses received a planning grant from the Andrew W. Mellon Foundation to develop an institutional sales program for scholarly e-books. That initiative, the University Press Content Consortium, eventually settled on the not-for-profit scholarly aggregation Project MUSE as its technical platform and sales agent. University presses also began to explore placing their books with other aggregations that developed in the same time frame, including the University Press Scholarship Online platform developed by Oxford University Press and JSTOR’s Books at JSTOR program.[19]

With gradual improvements in user experience design and better understood standards around the production of digital scholarly books, several initiatives are working to make standard (i.e., primarily text-based) e-books more easily discoverable and usable. One is CommentPress, a project pioneered by the not-for-profit Institute for the Future of the Book at New York University Libraries. CommentPress is a tool that allows authors or publishers to “reflow” the text of a monograph (or any other content format), making the text easy for users to annotate and comment on. (The tool was memorably used on Kathleen Fitzpatrick’s Planned Obsolescence: Publishing, Technology, and the Future of the Academy [2009] to enable a public back-and-forth between author and peer reviewers, as well as further commenting at the paragraph level by the public at large.[20]) Another such effort is UPScope, an initiative of the American Association of University Presses modeled on a project of the National Academies Press. UPScope uses subject keywords from book files to present a visual map of related books.[21] Of all the projects mentioned here, these are perhaps closest in spirit to the Reimagining the Digital Monograph project.

A discussion of past innovations in the digital presentation of scholarly books can hardly be complete without a nod to the introduction of Amazon.com in 1994 and its Kindle e-book product line in 2007. Amazon’s creation of a standardized e-book experience—encompassing a broad range of commercial publishers, formatting books in a uniform way that is optimal for immersive reading, and making e-books readily available to customers—raised the bar for the scholarly publishing world as well. E-readers such as the Kindle have led scholarly publishers to adopt digital formats such as EPUB 3.1. E-readers have also highlighted the tension that scholarly publishers and technologists face between a mission-based imperative to support innovative works of scholarship and the increasingly sophisticated tastes of a readership whose expectations for a digital reading experience are shaped by Amazon and other players in the commercial world.

This project was largely concerned with the user-facing design of the digital monograph, but efforts to recalibrate the economic model for publishing digital monographs deserve a brief mention. Almost from the beginning, scholarly e-book projects have learned from forerunner digital scholarly projects the importance of funding ongoing maintenance and technical development—costly activities that arguably had no analog in the print-only age. The Gutenberg-e program and the ACLS History E-Book project (later the ACLS Humanities E-Book project) both launched in the 2000s with a subscription model for their collections of e-books—an access plan that, although not unprecedented, was certainly novel at the time when compared with the firm purchase model for print books.[22] As scholarly journals increasingly offer open-access publishing options—typically models in which an article’s author pays an agreed-upon cost of publication up-front using funds from his or her research grant, a subsidy from the author’s university, or other means—there is growing interest in extending that model to e-books. Knowledge Unlatched, a not-for-profit partnership, effectively serves as a negotiating agent between two parties: scholarly publishers that are willing to make new, accepted titles openly available if sufficient publication funding can be found, and academic libraries that are willing to band together to subvent the publication of these titles. Rebecca Kennison and Lisa Norberg, two well-known figures in the American scholarly communications community, have gone a step further, calling for a more coordinated migration of journals and books in the humanities and social sciences to an open-access model, starting with content published by scholarly societies.[23] As interest continues to grow in extending the open-access publishing model from journals to scholarly books, publishers and librarians are working to understand better the up-front costs that must be covered in order to operate a self-sustaining open-access monograph publishing program—costs that have been complicated to pin down. A study of Indiana University and the University of Michigan and another covering a cohort of several other university presses show how deeply nuanced the cost-accounting activity for scholarly books is in practice.[24] As authors, publishers, and librarians continue to innovate on the formatting and display of digital scholarly monographs in the years to come, so too will the scholarly community seek to develop new economic models for supporting a vibrant monograph publishing system.


Bibliography

Design Thinking and Lean Start-Up Resources

  • Brown, Tim. Change by Design: How Design Thinking Transforms Organizations and Inspires Innovation. New York: HarperCollins, 2009.
  • Cross, Nigel. Design Thinking: Understanding How Designers Think and Work. New York: Berg, 2011. https://doi.org/10.5040/9781474293884.
  • Gray, Dave, Sunni Brown, and James Macanufo. Gamestorming: A Playbook for Innovators, Rulebreakers, and Changemakers. Sebastopol, CA: O’Reilly, 2010.
  • Kelley, Tom, and Jonathan Littman. The Art of Innovation: Lessons in Creativity from IDEO, America’s Leading Design Firm. New York: Currency/Doubleday, 2001.
  • Knapp, Jake, John Zeratsky, and Braden Kowitz. Sprint: How to Solve Big Problems and Test New Ideas in Just Five Days. New York: Simon and Schuster, 2016.
  • Maurya, Ash. Running Lean: Iterate from Plan A to a Plan that Works. Sebastopol, CA: O’Reilly, 2012.
  • ______. Scaling Lean: Mastering the Key Metrics for Startup Growth. New York: Portfolio/Penguin, 2016.
  • Osterwalder, Alex, Yves Pigneur, Tim Clark, and Alan Smith. Business Model Generation: A Handbook for Visionaries, Game Changers, and Challengers. Hoboken, NJ: Wiley, 2010.
  • Osterwalder, Alex, Yves Pigneur, Greg Bernarda, and Alan Smith. Value Proposition Design: How to Create Products and Services Customers Want. Hoboken, NJ: Wiley, 2014.
  • Ries, Eric. The Lean Startup: How Today’s Entrepreneurs Use Continuous Innovation to Create Radically Successful Businesses. New York: Crown Business, 2011.

Authors

Laura Brown is the Executive Vice President of ITHAKA and Managing Director of JSTOR, a research and teaching platform for the academic community. Before moving to JSTOR, Laura served as the Managing Director of Ithaka S+R, the strategy and research arm of ITHAKA dedicated to helping the scholarly community make a successful and sustainable transition to digital and network technologies. Laura is the coauthor of the ITHAKA report “University Publishing in a Digital Age.” Prior to joining ITHAKA, Laura was the President of Oxford University Press, USA. She has served on the boards of the University of Pennsylvania Libraries and the MIT Press and currently serves on the boards of Yale University Press and the Gordon Parks Foundation.

E-mail: laura.brown@ithaka.org

Alex Humphreys is the Director of JSTOR Labs at ITHAKA. The JSTOR Labs team works with partner publishers, libraries, and scholars to create experimental tools for research and teaching. Prior to starting JSTOR Labs, Alex led the effort to modernize JSTOR’s technology platform. Alex built an award-winning publishing platform for Oxford University Press, Inc., and has 20 years of experience creating digital tools, products, and businesses.

Twitter: @abhumphreys.

E-mail: alex.humphreys@ithaka.org

Matthew Loy is the Strategic Initiatives Manager for JSTOR.

E-mail: matthew.loy@ithaka.org

Ronald Snyder is the Director of Research and Development for JSTOR Labs at ITHAKA. Ron has held various positions in his 12 years with ITHAKA/JSTOR, including his most recent role leading technology R&D for the JSTOR Labs team. Prior to joining ITHAKA, Ron worked in the aerospace industry for nearly 20 years in a variety of software engineering and leadership positions.

E-mail: ronald.snyder@ithaka.org

Christina Spencer is ITHAKA’s Manager of User Research. Her role is to ensure that deeply meaningful research concepts and activities are engrained within product discovery teams and within ITHAKA at large. This is done by setting research strategy, directing user research staff, and conducting research firsthand.

E-mail: christina.spencer@ithaka.org


    1. Some of these are explored in a landscape review found in appendix B.

    2. Kizer Walker et al., “Report of the Collection Development Executive Committee Task Force on Print Collection Usage, Cornell University Library” (Ithaca, NY: Cornell University Library, 2010), 2, http://hdl.handle.net/1813/45424.

    3. Alex Verstak et al., “On the Shoulders of Giants: The Growing Impact of Older Articles,” working paper (Google Inc., Mountain View, CA, 2014), http://arxiv.org/pdf/1411.0275v1.pdf.

    4. Roger C. Schonfeld, “Stop the Presses: Is the Monograph Headed toward an E-Only Future?” (New York: Ithaka S+R, 2013), 6, http://www.sr.ithaka.org/blog-individual/stop-presses-monograph-headed-toward-e-only-future.

    5. While this project deals with efforts to improve the visual presentation of the monograph as an e-book, it does not focus on important but related issues around accessibility for impaired and disabled readers. In the United States, purveyors of digital content are required to meet certain standards for displaying text online in order for that content to be eligible for purchase or licensing by public institutions. This project started with the assumption that if the working group’s recommendations result in any full-scale changes to the way that scholarly books are displayed on JSTOR or other platforms, those changes will need to be consistent with government requirements and other best practices around accessibility.

    6. For more information about design thinking and lean start-up product development methodologies, see the bibliography.

    7. This activity, which is often used in the product development methodologies nodded to earlier, is sometimes called a “design studio” or an “8 x 8” (a variation in which the designers are asked to sketch eight designs in eight minutes).

    8. Martin Eve has described another kind of research tool leveraging the references found in books. See Martin Paul Eve, “A Research Tool I Want (but Probably Won’t Get): Cross-Reference/Intersect Bibliographies of Books and Articles” (blog entry, June 3, 2014), personal website, https://www.martineve.com/2014/06/03/a-research-tool-i-want-but-probably-wont-get-cross-referenceintersect-bibliographies-of-books-and-articles/.

    9. This concept, and the quotation, comes from feedback provided on an early draft of this article. On Gabriel Harvey’s “wheel,” see Lisa Jardine and Anthony Grafton, “‘Studied for Action’: How Gabriel Harvey Read His Livy,” Past and Present, no. 129 (1990): 46–48; http://www.jstor.org/stable/650933.

    10. David M. Blei, “Probabilistic Topic Models,” Communications of the ACM 55, no. 4 (2012): 77–84, doi:10.1145/2133806.2133826.

    11. Jane Aikin, “Valley of the Shadow: The Civil War on Internet,” Humanities: The Magazine of the National Endowment for the Humanities 18, no. 2 (March–April 1997), https://www.neh.gov/humanities/1997/marchapril/feature/valley-the-shadow.

    12. Thomas J. Brown, “The House Divided and Digitized: Review of Valley of the Shadow: Two Communities in the American Civil War, Part 1: The Eve of War by Edward L. Ayers and Anne S. Rubin,” Reviews in American History 29, no. 2 (June 2001): 210.

    13. Robert Darnton, “A Program for Reviving the Monograph,” AHA Perspectives on History (March 1999): n.p., https://www.historians.org/publications-and-directories/perspectives-on-history/march-1999/a-program-for-reviving-the-monograph.

    14. Patrick Manning, “Gutenberg-e: Electronic Entry to the Historical Professoriate,” American Historical Review 109, no. 5 (December 2004): 1507.

    15. Kate Wittenberg, “The Gutenberg-e Project: Opportunities and Challenges in Publishing Born-Digital Monographs,” Learned Publishing 22, no. 1 (January 2009): 40. Ms. Wittenberg oversaw the Gutenberg-e program at Columbia University Press and is now our colleague as managing director of the Portico digital preservation service.

    16. Marc Parry, “Free ‘Video Book’ from MIT Press Challenges Limits of Scholarship,” Chronicle of Higher Education (February 20, 2011), http://www.chronicle.com/article/Free-Video-Book-From/126427/.

    17. “WVU Receives $1 Million Grant from Mellon Foundation for First-of-Its-Kind Digital Publishing System,” WVUToday (February 3, 2015), http://wvutoday-archive.wvu.edu/n/2015/02/03/wvu-receives-1-million-grant-from-mellon-foundation-for-first-of-its-kind-digital-publishing-system.html. The American Association of University Presses has assembled a helpful roster of recent multimedia-enhanced book publishing initiatives analogous to the one at West Virginia University. It is available at http://www.aaupnet.org/aaup-members/news-from-the-membership/collaborative-publishing-initiatives.

    18. Lesley W. Jackson, “NetLibrary (Review),” Journal of the Medical Library Association 92, no. 2 (April 2004): 284–85, https://www.ncbi.nlm.nih.gov/pmc/articles/PMC385321/.

    19. Michael Kelley, “New Ebook Platforms Target the Scholarly Monograph,” Library Journal (January 28, 2011): n.p., http://lj.libraryjournal.com/2011/01/technology/ebooks/new-ebook-platforms-target-the-scholarly-monograph/.

    20. Kathleen Fitzpatrick, Planned Obsolescence: Publishing, Technology, and the Future of the Academy (New York: NYU Press, 2009), 2, http://mcpress.media-commons.org/plannedobsolescence/external-reviews/.

    21. The “Academy Scope” project of the National Academies Press, which is the inspiration for the cross–university press UPScope initiative, can be viewed at https://www.nap.edu/academy-scope/#top-downloads.

    22. John B. Thompson, “U.S. Academic Publishing in the Digital Age,” in A History of the Book in America, vol. 5, The Enduring Book: Print Culture in Postwar America, eds. David Paul Nord, Joan Shelley Rubin, and Michael Schudson (Chapel Hill: University of North Carolina Press, 2009), 372, http://www.jstor.org/stable/pdf/10.5149/9781469625836_nord.29.pdf.

    23. Rebecca Kennison and Lisa Norberg, “A Scalable and Sustainable Approach to Open Access Publishing and Archiving for Humanities and Social Sciences: A White Paper,” KN Consultants, April 2014, http://knconsultants.org/wp-content/uploads/2014/01/OA_Proposal_White_Paper_Final.pdf.

    24. Carolyn Walters and James Hilton, “A Study of Direct Author Subvention for Publishing Humanities Books at Two Universities: A Report to the Andrew W. Mellon Foundation by Indiana University and University of Michigan” (September 15, 2015), https://deepblue.lib.umich.edu/bitstream/handle/2027.42/113671/IU%20Michigan%20White%20Paper%2009-15-2015.pdf; Nancy Maron, Christine Mulhern, Daniel Rossman, and Kimberly Schmelzinger, “The Costs of Publishing Monographs: Toward a Transparent Methodology” (New York: Ithaka S+R, 2016), doi:10.18665/sr.276785.