Traversing The Book of MPub: an Agile, Web-first Publishing Model
This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact firstname.lastname@example.org for more information.
For more information, read Michigan Publishing's access and usage policy.
Simon Fraser University, Sept 2010
In the twenty-first century, content normally lives on the web. But what would a web-based book publishing environment look like? In spring 2010, graduate students at Simon Fraser University created The Book of MPub, an end-to-end, web-first book publishing project. The re-visioning of the book as a web-born entity presents enormous opportunities for publishers to push the operational, expressive, and social horizons of their businesses. We have identified four key concepts which shape a modern book publishing approach: the concept of an agile publishing methodology; the centrality of online content management systems; leveraging the web's HTML markup as a way of achieving an XML-based workflow; and the radical reconfiguration of promotion and marketing.
In the twenty-first century, content is born digital. It becomes fluid in the network, in the universal solvent that is digital media, where every representation can be reduced to bits and transformed and recombined indefinitely. Our society faces a media landscape utterly unlike anything previous generations could imagine.
Books—speaking in the traditional printed sense—are not to be displaced, or replaced, but must now co-exist with myriad other forms in a vast digital ecosystem. It is the market and larger cultural place of printed books that is evolving, as it has done before and as it will again. The book itself (and we might consider the ebook a variation on the theme, rather than an end in itself) faces a revolution, but that revolution reflects changes in the larger media landscape within which the book exists. The book itself is a durable and valuable cultural object in transition to a newly realized role.
In flux is the central role of the book as unit of cultural experience. For centuries the book has held a place of privilege, most perfectly embodying the authority and permanence of cultural expression in literate society. Our very understanding of literacy has been wrapped up with the book and its centrality. Our educational institutions have revolved around the book for generations; each and every one of us has gone to school for a decade or more, largely to internalize the ways of the book. As with the proverbial fish in water, the very idea of standing apart from our collective relationship to the book, of being able to see it in a new light, is incredibly difficult. And yet this new light is precisely what shines on the book today. We find ourselves in a world where a good part of the cultural experience traditionally associated with books now typically and normally exists online. The web and other Internet-based media have taken over a major part of our informational needs, and a vast swath of literate expression. More people are reading and writing more text today than at any other time, though very little of it is in books, or even “published” in a traditionally recognizable form.
The book does not stand apart from any of this. Nor, we believe, is the book itself threatened by these shifts. But what demands acknowledgment is that the print book no longer exists at the centre of our media landscape. The book’s function, significance, and reason for being must today be evaluated in the light of this vastly broader, multifaceted landscape. Publishers who appreciate this evolving context for books and book publishing surely have an advantage over those who would continue to operate in the twentieth century.
What this paper proposes—and what we have been most interested to investigate at Simon Fraser University’s (SFU) Canadian Centre for Studies in Publishing—is the development of a book publishing environment native to the digital landscape of the twenty-first century: that is, a book publishing environment that starts with the web. Our investigations began a few years ago with research into how to build bridges between web- and print-publishing software. In spring 2010, we undertook a complete web-first publishing project, bringing a book from initial conception to production and fulfillment almost entirely in a web-based context: The Book of MPub served as both a proof of concept for emerging ideas in editorial and production workflow and an exploration of entirely new possibilities for a book born on the web.
In reflecting on the creation of The Book of MPub, we have identified four significant concepts that shape a modern book publishing approach: (a) that an agile publishing methodology is fundamental to a web-based operation; (b) that centralized, online Content Management Systems (CMS) form the core of twenty-first-century editorial, production, and marketing activities; (c) that the web’s HTML is a viable way of achieving an XML-based workflow; and (d) that a web-first publishing model opens up the radical reconfiguration of promotional and marketing efforts. More generally, the re-visioning of the book as a web-born entity, as fluid and socially dynamic as other web-based media, presents enormous opportunities for publishers to push the operational, expressive, and social horizons of their businesses. The following is an exploration of some of these dynamics.
What is “agile”? The term refers to a decade-old movement in the software development world based on the observation that traditional, top-down engineering methodologies don’t work as well in the digital realm. Agile methodology eschews the prescription for detailed requirements-gathering, front-loaded design, and strictly managed implementation in favor of rapid prototyping, iterative development, and constant testing. Since designers and developers cannot possibly know everything necessary at a project’s outset, it pays to allow the project to “fail faster”—that is, to incrementally correct and learn about a problem domain through iteratively engaging with it. As Wikipedia’s article on Agile Software Development puts it, this means “regular adaptation to changing circumstances.”
Such an approach has considerable appeal in the world of digital media, where fluidity and change are the norm. In a post-industrial environment, where heavy “tooling up” is not required, it begins to make sense to draft and revise, following a lightweight, iterative approach that redefines its own goals in continual response to feedback. This break from traditional engineering virtues is perhaps because digital technology is information technology; it is more akin to writing than it is to building bridges. What, then, can the world of publishing—currently in transition from an industrial manufacturing paradigm to a digital, networked one—learn from the software engineers?
In recent years, the agile model has been applied to publishing. Firms such as Pragmatic Programmers have elaborated a blazingly fast and efficient editorial and production model for technical books, which they describe as an “agile publishing methodology.” In a 2006 article, XML consultant Michael Fitzgerald outlined an agile, web-based model for the exchanges between authors and editors. The XML toolmaker MarkLogic equates agile publishing with the ability to easily re-use existing content. Much of this discourse draws on the popularity of the 2001 “Manifesto for Agile Software Development”—in fact, Pragmatic Programmers founder Andrew Hunt was a co-author of that document, and much of their book list deals with topics in agile development.
We argue, however, that by now agile is not a methodology. At a more fundamental level it is simply the way the online world works—the world brought into being by the Internet and World Wide Web. The cost of creation in a post-industrial, post-manufacturing model is so low that it simply makes more sense to iterate things into existence. At the risk of making a baldly technologically determinist argument, we suggest that the affordances of digital, networked media profoundly encourage low-risk, socially engaged, iterative creative expression. The rapid rise of blogging and participation in social networks is perhaps the most broadly obvious manifestation of this pattern. We are witnessing a renaissance of ephemera, but in an environment that forces us to re-evaluate what “ephemeral” means—for today’s digital artifacts do not disappear. What can a book publisher take from this?
If agile methods are both facilitated and encouraged by the use of online media—as seems to be the experience in software engineering—then every industry, every human activity that conducts itself via the web is touched by the agile idea. The question for book publishers, then, may not be whether or not to adopt agile methodologies, for the very idea of a “methodology” implies a sort of instrumental relationship. Rather, if “agile” simply describes the way we work on and with the web, the choice is whether we will work in a twenty-first-century mode or a twentieth-century mode.
The Book of MPub as an Agile Publishing Project
What does an agile publishing model look like? The Pragmatic Programmers were the first to offer up such a thing, but as they produce books about agile software development, their model may not be generalizable. Instead, if we take “agile” to describe a web-native way of working, which is accessible to everyone, an agile publishing model is one that starts with the web and conducts itself online where possible. The Book of MPub project set out to prototype such a model.
The Book of MPub was born as a course project assigned to the 2009–2010 cohort of SFU’s Master of Publishing Program. It began simply enough: the project team would create, edit, and produce an anthology of the best of the cohort’s papers on technology and the future of publishing. The anthology would be created and edited in an online space. The resulting book would then be produced and distributed simultaneously in multiple formats: print, PDF, EPUB, and web-native. This assignment was issued in late February 2010, but in keeping with a methodology which values “regular adaptation to changing circumstances,” the project scope was re-defined by the project team to include an open peer-review process, a formal editorial workflow, and a promotion and marketing plan. The Book of MPub manifests the following agile qualities:
- It was born on the web; it has spent most of its life on the web, and it continues to live on the web, regardless of the more “bookish” forms that have emerged from it.
- It was created collaboratively: collaboratively written (19 people); collaboratively reviewed (approximately 40 people); collaboratively edited and designed (6 people).
- It was produced using simple, open web technologies that made the creation of different versions and renditions a mostly mechanical process.
- Its editorial and production workflow was cyclical, rather than linear. Edits were made to the text even on the day the book went to press, and there were more edits after it went to press. As a result, the finality of the work is exploded; its unifying identity is not its self-identical physical form, but its brand presence and its connectedness.
- It existed not only in multiple “formats,” but also, following on the previous point, in multiple discursive contexts online, including blog, Facebook, and Twitter. Book content was not re-purposed to these media, but rendered differently for each.
- It was fast: only nine weeks start to finish.
The details of how this project was planned and undertaken further demonstrate the possibilities of an agile model. We began with the web, assembling and managing content online.
Web Content Management
Book publishers in 2010 struggle to produce ebook formats and cope with digital distribution. Scanning, optical character recognition (OCR), and outsourced XML tagging are elements of a tortured but common process of getting usable digital content from pre-existing books. A related process, somewhat simpler but still the source of considerable hair pulling, seeks to export or convert ebook or web content from existing print layouts in Desktop Publishing (DTP) software. It is not easy—nor, we argue, is it ever likely to be easy—to pull fluid, digital formats from print- and page-oriented production processes. The reason is straightforward: in the DTP paradigm, there is no separation of content from formatting; the two are constitutionally wedded. As a result, the business of separating content from print-based formats is difficult, imprecise, and kludgey. A tool like Adobe InDesign is a stunningly powerful print layout tool, the heir to three decades of evolution of layout software. But it has no robust facility for managing content independently from the way it will appear on the printed page.
If this basic problem is well enough understood, so is the solution. Modern content-management software is based precisely on the separation of content and formatting. Content is created and stored in a way that allows it to be written and edited quickly, and then combined with templates and stylesheets that define presentational format—typically for the web. Modern web-based content management systems, ranging from complex enterprise asset-management systems to simple blogging or collaborative wiki tools, are mature, well-supported toolkits that support large web publishing teams and a division of labor between editors and designers. But because the foundational architecture separates content from how it is presented, there is no reason why such tools are limited to web publishing. Web Content Management Systems (CMS) are in principle usable in any kind of production environment, including print.
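The separation described above can be illustrated with a minimal sketch. The chapter data, the two templates, and the function names are all hypothetical, invented for illustration; a real CMS stores content in a database and uses a far richer templating layer. The point is only that one stored piece of content can be rendered through more than one template without being edited.

```python
from html import escape
from string import Template

# Content: stored once, with no presentational information attached.
chapter = {
    "title": "Web Content Management",
    "paragraphs": [
        "Content is created and stored on its own.",
        "Templates decide how it looks in each output.",
    ],
}

# Two hypothetical templates: one for the web, one for a plain-text proof.
WEB_TEMPLATE = Template("<article>\n<h1>$title</h1>\n$body\n</article>")
TEXT_TEMPLATE = Template("$title\n$rule\n\n$body\n")


def render_web(content):
    """Combine the stored content with the web template."""
    body = "\n".join(f"<p>{escape(p)}</p>" for p in content["paragraphs"])
    return WEB_TEMPLATE.substitute(title=escape(content["title"]), body=body)


def render_text(content):
    """Combine the same stored content with the plain-text template."""
    body = "\n\n".join(content["paragraphs"])
    rule = "=" * len(content["title"])
    return TEXT_TEMPLATE.substitute(title=content["title"], rule=rule, body=body)


web_output = render_web(chapter)
text_output = render_text(chapter)
```

Changing the look of either output means editing a template; changing the words means editing `chapter`. Neither change touches the other.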
So, rather than trying to pull digital formats out of print-production tools, why not go the other way: to integrate print-production tools into web CMS, since the latter assume a more robust separation of concerns?
Modern web-based CMS tools provide key functionality for publishing teams. A CMS centrally maintains content, so writers, editors, and other contributors come to the content, rather than proliferating file versions as a text is passed back and forth (resulting all too often in nasty version conflicts). A CMS handles version control and change tracking centrally, making it possible to see who made what change, when—and to revert to an earlier version if necessary. CMS tools often provide features for managing large teams, handling which individuals have permission to view, edit, or sign off on a work. Tagging, metadata, and categorization features allow for the organization, archiving, and re-use of large bodies of material. Most importantly, by holding content in an open, transparent format, a CMS offers much better potential for future-proofing than our usual proprietary production toolkits.
In traditional print production environments, only the largest and best capitalized publishers had access to this kind of CMS tool: large newspapers, magazine conglomerates, and multinational book publishers could afford this kind of “enterprise” content management. But web publishers have been developing such tools on a free and open-source basis for a decade or more, providing similar functionality to high-end press systems at little cost—and such tools are quite ubiquitous online today. Because a good CMS keeps the content separate from its presentation and formatting attributes, it is entirely possible to use a web CMS in service of print publishing.
WordPress as Simple CMS
With The Book of MPub, we set out to prototype a kind of “simplest possible” web platform for publishing. This led us to choose WordPress as a Content Management System. Though originally designed as an open-source blogging platform, current versions of WordPress have robust content management features: support for multiple contributors, fine-grained revision control, and well-developed content editing tools. But while WordPress’s features were attractive, it is the simplicity and ubiquity of the system that is its real advantage: millions of bloggers and web publishers use WordPress, and the tool is supported by nearly all hosting services. Furthermore, its open-source development community is likely one of the largest of its kind, with many thousands of developers contributing to either the core software or optional plug-in modules. There are many web CMS with richer and more sophisticated feature sets than WordPress; but if we could turn this relatively simple tool into a serious publishing platform, then it would be possible to use any web CMS in its place.
As a CMS, WordPress shares many core design features with more complex or sophisticated systems. It manages access, handles multiple workflow states, tracks versions and changes, and has a clean separation of content from formatting. Notably, its “visual editor” environment is an open-source tool called TinyMCE, found within dozens of other web-based systems. TinyMCE is a quasi-WYSIWYG editing environment that presents writers and editors with a user interface not unlike a word processor: formatting options are accessed by a set of buttons and pull-down menus. As Stephen Ramsay of the Anthologize project puts it:
Simple editing frameworks are good for writers. People work the way they like to work, but WordPress at least tries to imagine a world in which the mammoth, lumbering word processor—which repeatedly confuses the creation of prose with the process of generating formats—is not obligatory for anyone who wants to get their ideas out there.
Simpler than a word processor, but still recognizable to writers and editors, TinyMCE is one part of what makes a CMS like WordPress usable for much more than producing blogs and websites. At its heart is a straightforward and consistent content management system based on XHTML; to this we now turn.
XHTML as a Gateway to XML-based Workflow
Over the past few years, publishers have been told again and again that an XML-based publishing workflow is essential to doing business in the twenty-first century. But while no publisher has escaped this call to action, very few are in a practical position to actually make XML a core part of how they produce books.
Why is XML so much talk and so little action? To begin, tooling up for an XML-based editorial/production workflow is complicated, expensive, and alien to the way many print publishers think about their work. While the value proposition of XML—and its precursor, SGML—has been clear for decades, the practicalities of XML have remained the province of publishers with colossal content management issues: in the aerospace and pharmaceutical industries, in higher-ed textbooks and technical publishers like O’Reilly Media. For small trade or scholarly publishers, the advantages of XML have failed to overcome the considerable hurdle of getting into it: acquiring the right software, training staff, changing the way both editors and production staff do their jobs. In an environment where robust DTP tools like QuarkXPress or Adobe InDesign make such a good fit for the apparent tasks at hand, we’ve literally seen generations of publishing staff organize themselves around these toolkits. Publishers may say, “If it ain’t broke, don’t fix it,” rather than re-tooling for XML.
The solution to this impasse, we believe, is the web. The web was originally conceived, twenty years ago, as an SGML application, and SGML/XML standards are still at the heart of web production. The trouble is that over two decades, web publishers have largely ignored the rigor of a ‘real’ SGML or XML-based publishing system, opting instead for the development of web browsers that were (and are) extremely error-tolerant: capable of reasonably good rendering of sloppy, idiomatic markup. As a result, by the end of the 1990s, serious SGML/XML professionals scoffed at the open web as a morass of “tag soup.”
What’s changed in the past decade is the widespread adoption of Content Management Systems on the web. Rather than web pages being hand-crafted as they were in the 1990s, most web development today is done via a CMS, which properly separates content from format, and maintains vastly better markup consistency throughout. Web CMS were designed to address the same virtues—consistency, maintainability, separation of concerns—that industrial XML toolsets address.
A modern web CMS manages content in valid XHTML. This markup language, designed for web pages, lacks the semantic richness required for, say, producing Boeing Corporation’s aircraft maintenance documentation. But it is, we argue, sufficient for a general trade publisher. XHTML has evolved into a solid, generic markup language for prose structures, and put to that use is a perfectly respectable XML document type for non-technical, non-specialized publishing.
Why wouldn’t a publisher want to use a “real” XML document type? Why settle for XHTML, with its dodgy provenance and lowest-common-denominator semantics? The reason is tools: where XHTML wins in a big way is in software support. Because web content management tools have been around online for a decade or more, and used by literally millions of people, they are among the most robust (and certainly most user-friendly) XML tools around. Industrial-strength XML document types were originally designed with specialized, technical documentation teams in mind; adoption by small presses means a wholesale reconfiguration of core competencies. By contrast, adopting an XHTML-based XML workflow can be done by adopting a web content management tool, even one like the ubiquitous WordPress. Based on our informal show-of-hands surveying, most people in the Canadian publishing industry already know WordPress in one context or another (as a blogging platform, or as a tool for building a promotional website). What that means is that these people are already capable of working with an XML-based workflow.
Another major reason XHTML is worth considering is discoverability. Marking content in an industry-specific XML document type means relying on the search tools provided either by your local editing and management toolkit, or via an industry-specific indexing service. An XHTML-based content base, on the other hand, can be indexed by Google—or any other web-based search tool. In the old days, the ability to search at all was a good thing, but today, in the twenty-first century, worldwide discoverability is an essential consideration for a publisher.
The advantages of XHTML go beyond discoverability. Most common ebook formats, like the IDPF’s EPUB and Amazon’s Kindle, are either directly based on or closely related to XHTML, so it is a small matter to repackage this content in ebook formats. Producing EPUB files from XHTML content requires no actual conversion, because EPUB is simply a repackaging of XHTML. By contrast, consider the process of producing EPUB content from a print-oriented tool like MS Word or Adobe InDesign, in which content structures must be interpreted by software and re-written as ‘equivalent’ XHTML content, a process fraught with problems, because print tools were never designed to produce structured XML content in the first place.
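To make the “repackaging” claim concrete, here is a minimal sketch of wrapping an XHTML chapter in an EPUB container using only Python’s standard library. The file names, the metadata values, and the `build_epub` function are assumptions made for illustration. The package is also deliberately incomplete: a fully valid EPUB 2 file additionally needs an NCX table of contents, which is omitted here. What the sketch shows is that the XHTML itself goes into the archive byte for byte, with no conversion step.

```python
import io
import zipfile

# Points the reading system at the package file (path chosen for this sketch).
CONTAINER_XML = """<?xml version="1.0" encoding="UTF-8"?>
<container version="1.0" xmlns="urn:oasis:names:tc:opendocument:xmlns:container">
  <rootfiles>
    <rootfile full-path="OEBPS/content.opf" media-type="application/oebps-package+xml"/>
  </rootfiles>
</container>
"""

# Placeholder metadata; a real book would carry its own title and identifier.
# The NCX table of contents that EPUB 2 requires is left out of this sketch.
CONTENT_OPF = """<?xml version="1.0" encoding="UTF-8"?>
<package xmlns="http://www.idpf.org/2007/opf" version="2.0" unique-identifier="bookid">
  <metadata xmlns:dc="http://purl.org/dc/elements/1.1/">
    <dc:title>Example Book</dc:title>
    <dc:language>en</dc:language>
    <dc:identifier id="bookid">urn:uuid:00000000-0000-0000-0000-000000000000</dc:identifier>
  </metadata>
  <manifest>
    <item id="ch1" href="chapter1.xhtml" media-type="application/xhtml+xml"/>
  </manifest>
  <spine>
    <itemref idref="ch1"/>
  </spine>
</package>
"""

# Stand-in for a chapter as the CMS would hold it.
CHAPTER_XHTML = """<?xml version="1.0" encoding="UTF-8"?>
<html xmlns="http://www.w3.org/1999/xhtml">
  <head><title>Chapter 1</title></head>
  <body><h1>Chapter 1</h1><p>Content exactly as the CMS holds it.</p></body>
</html>
"""


def build_epub(chapter_xhtml):
    """Return the bytes of a bare-bones EPUB archive wrapping one chapter."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as archive:
        # The mimetype entry must come first and must be stored uncompressed.
        archive.writestr("mimetype", "application/epub+zip",
                         compress_type=zipfile.ZIP_STORED)
        archive.writestr("META-INF/container.xml", CONTAINER_XML)
        archive.writestr("OEBPS/content.opf", CONTENT_OPF)
        # The chapter is written unchanged: packaging, not conversion.
        archive.writestr("OEBPS/chapter1.xhtml", chapter_xhtml)
    return buffer.getvalue()


epub_bytes = build_epub(CHAPTER_XHTML)
```

Writing `epub_bytes` to a file with an `.epub` extension gives something a tolerant reader may open, though it should be run through an EPUB validator before distribution.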
In effect, the web can be seen as a lightweight XML system, one that provides the proverbial 80 percent of the benefit of a “real” XML workflow but with only 20 percent of the complexity and investment. For straightforward prose structures, as found in the vast majority of fiction and general non-fiction titles, XHTML is an entirely reasonable markup language; it lacks some of the specialized semantic expressiveness of more complex industry-specific markup, but it gains by being supported by an immense array of software tools already in use by millions of people. It also succeeds by being the native markup language of most e-publishing formats, both online and off.
Producing Print from the Web
The preceding discussion of online publication formats may be moot, however. Ebooks are still, in 2010, a single-digit portion of the overall book market; the vast majority of publishers see print books as their core business, and will do so for the foreseeable future. So our consideration of a web-first publishing workflow is worth little if it does not provide a high-quality print-production path. Book publishers are not likely to consider any of the benefits discussed so far if adopting them results in a loss of quality or efficiency in print production.
If we treat XHTML content as real XML, rather than merely web pages, then a high-quality print production option opens up. Any XML-marked content can be transformed into any other marked content via a mature technology called XSLT (Extensible Stylesheet Language Transformations); this is to say that XML markup can be disassembled, re-tagged, and reassembled programmatically with considerable precision. Unlike converting proprietary word-processing or DTP formats, which requires a fair bit of interpretation and guesswork on the part of conversion tools, the openness and consistency of XML content means transformations can be approached with confidence. If we have consistently marked XHTML content in a CMS, all we need is a print-oriented XML markup to transform into.
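The idea of disassembling, re-tagging, and reassembling markup can be sketched in a few lines. Note the substitution: the production technology discussed here is XSLT, but Python’s standard library has no XSLT processor, so this sketch performs the same kind of rule-based re-tagging with `xml.etree.ElementTree` instead. The target vocabulary in `TAG_MAP` is invented for illustration and does not correspond to any real print format.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical mapping from XHTML element names to a made-up print vocabulary.
TAG_MAP = {
    "h1": "chapter-title",
    "p": "body-text",
    "em": "italic",
    "blockquote": "extract",
}


def retag(element):
    """Rebuild an element tree, renaming each element according to TAG_MAP.

    Text is carried over unchanged; attributes are dropped in this sketch.
    """
    local_name = element.tag.replace(XHTML_NS, "")
    rebuilt = ET.Element(TAG_MAP.get(local_name, local_name))
    rebuilt.text = element.text
    rebuilt.tail = element.tail
    for child in element:
        rebuilt.append(retag(child))
    return rebuilt


source = (
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    "<h1>Agile</h1><p>Fail <em>faster</em>.</p>"
    "</body>"
)

# Parsing fails loudly on malformed markup, which is the benefit of treating
# XHTML as real XML rather than as error-tolerant web pages.
result = ET.tostring(retag(ET.fromstring(source)), encoding="unicode")
```

Because the input is well-formed XML, the mapping is purely mechanical: no guessing about what a run of bold text or an indent was meant to signify.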
Adobe provided such a format with the release of the Creative Suite 4 in 2008. The version of InDesign in CS4 (and CS5) supports an alternative file format called IDML, which is a complete representation of an InDesign document (content, layout, master pages, style definitions, and so on) in XML. IDML is not the kind of XML markup anyone would want to work with natively, but it is easy enough to write a transformation from XHTML to IDML, allowing web-based content to flow straightforwardly into Adobe InDesign for print layout. The result is direct integration with existing print production processes. Content can begin on the web, and go to print when it’s appropriate.
At SFU last year, we developed an open-source XSLT script—called “Ickmull”—to do this, which we have now used in a variety of projects, from book production to scholarly journals and one-off document production. It is the tool we used to produce print and PDF versions of The Book of MPub.
Our Ickmull transformation script actually produces ICML—a subset of the full IDML language—which represents just the ’story’ and styling information in an InDesign document. Producing ICML is technically much simpler than the complete IDML representation of an InDesign file, but much more importantly, we wanted a semi-automated solution rather than a fully automatic one, in order to preserve the role of existing design and production staff. By transforming XHTML content to the ICML subset, we handle many design elements—such as master pages and folios, paragraph and character style definitions, and the large-scale architecture of the publication—natively in InDesign. What flows in via XML is merely the content, or “story,” as InDesign calls it. ICML is, in fact, the file format for Adobe’s InCopy software, which effects this same division of labor (between editorial and design) for newspaper offices, in which multiple stories must flow into a single layout without causing a production bottleneck. Ickmull leverages the linkage between InCopy and InDesign, but instead of using InCopy, we use WordPress—or any web CMS.
In practice, production staff prepare an InDesign template that defines master pages, common design elements, and style definitions; this is done natively in InDesign, the way production staff typically work. Content is then transformed and the ICML files are “placed” in the template. As they flow into the pre-defined master pages, they pick up styling and pagination rules from InDesign, and the result is a near-perfect layout. Designers and production staff can then work directly in InDesign to do copyfitting and whatever other layout work they desire. Furthermore, leveraging the link that InDesign maintains to an InCopy file, the XHTML can be updated, re-transformed, and the content in the InDesign layout updated in place, without needing to re-flow.
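A rough sketch of the content side of this hand-off follows. It is not the Ickmull script, which is written in XSLT; it is a Python approximation of the same idea. The element and attribute names (`Story`, `ParagraphStyleRange`, `CharacterStyleRange`, `Content`, `Br`) reflect our reading of Adobe’s ICML format and should be checked against the IDML specification. The style names in `STYLE_FOR_TAG` are hypothetical and would have to match the styles defined in the InDesign template. A file InDesign will actually accept also needs processing instructions at the top of the file, omitted here.

```python
import xml.etree.ElementTree as ET

XHTML_NS = "{http://www.w3.org/1999/xhtml}"

# Hypothetical paragraph style names; they must match the InDesign template.
STYLE_FOR_TAG = {"h1": "Heading 1", "p": "Body"}


def xhtml_to_story(xhtml_source):
    """Map each XHTML block element to a styled paragraph in one story.

    Inline markup is flattened to plain text in this simplified sketch.
    """
    body = ET.fromstring(xhtml_source)
    # Version value assumed for this sketch; check it against the target release.
    document = ET.Element("Document", {"DOMVersion": "6.0", "Self": "doc"})
    story = ET.SubElement(document, "Story", {"Self": "story1"})
    for block in body:
        local_name = block.tag.replace(XHTML_NS, "")
        style = STYLE_FOR_TAG.get(local_name, "Body")
        paragraph = ET.SubElement(
            story,
            "ParagraphStyleRange",
            {"AppliedParagraphStyle": f"ParagraphStyle/{style}"},
        )
        run = ET.SubElement(
            paragraph,
            "CharacterStyleRange",
            {"AppliedCharacterStyle": "CharacterStyle/$ID/[No character style]"},
        )
        ET.SubElement(run, "Content").text = "".join(block.itertext())
        ET.SubElement(run, "Br")  # paragraph break
    return ET.tostring(document, encoding="unicode")


source = (
    '<body xmlns="http://www.w3.org/1999/xhtml">'
    "<h1>Agile</h1><p>Fail <em>faster</em>.</p>"
    "</body>"
)
story_xml = xhtml_to_story(source)
```

The output carries only text and the names of styles. What those styles look like, and how the story sits on the page, stays in the InDesign template, which is the division of labor described above.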
The implication of such a workflow is that content ceases to be ‘managed’ in InDesign; ideally, edits as small as a fixed typo can efficiently be made in the web CMS, the content re-transformed and updated in place. The InDesign file itself then becomes relatively disposable, since it is not the canonical source of the content, and can be re-produced easily at any time. InDesign instead takes its proper place as an output tool—producing print layouts and typography with considerable precision—which is what it was designed to do. In production of The Book of MPub, content first appeared in InDesign a few weeks before the final book was released; it was flowed and assembled, then updated dozens of times, chapter by chapter, as the editorial process continued online, in parallel with the print production.
Reconfiguring Promotion and Marketing
If a book is written and edited in an online, web-based platform, it is but a short step to leveraging that platform for promotional purposes, just as it is a short step to repackaging web-based content as ebooks. If the book and everything to do with it are already living in a web CMS, there is nothing (at least nothing mechanical) standing in the way of building an online promotional presence. We need not wait for marketing materials to be generated or converted or repurposed. Furthermore, a web CMS is not just a content management tool; it can also serve a community of interested readers, as an editorial and a promotional platform.
With The Book of MPub, we reached out to our community to make this work. The promotional process began in the editorial review stages of the book. Each chapter in the book was born as a paper written for SFU’s PUB802 course. The initial review of each paper, in this case both by the professor and by an assigned peer reviewer from within the class, was conducted online, via the commenting feature in our web CMS. Each article was initially read and reviewed by at least two commenters. After an initial editing pass, the articles were then put into wider circulation as a kind of “open peer review” process in which upwards of 50 reviewers from the publishing industry across Canada and the United States were contacted, each invited to comment on a particular article. The results of this review were also gathered as WordPress comments. In many cases, article authors responded in the comment thread as well, as did project editors. The entire review process happened in the open, on the web, and visible to everyone.
Social media provided the next step; The Book of MPub editorial team also went to social networking applications Twitter and Facebook with brief excerpts from each article. As these social media messages were posted and re-posted (tweeted and re-tweeted), The Book of MPub began to develop an online following (over one hundred people). In several cases, the external reviewers who were invited to review particular articles also saw fit to use social media (and especially Twitter) to both announce their reviews and to re-tweet the excerpts and commentary from the book’s editors. This social media snowball effect resulted in a number of unsolicited reviews posted by people a step beyond those who were initially invited. In a few cases, debates even ensued within the WordPress comment threads.
While the initial driver of this web outreach was editorial review (and in due course each article was revised in light of comments gathered online), the promotional side effect was even more important to the project. By the time the book launched (only 4 weeks after the review invitations went out), a community of some hundreds of people was not only aware of the project but actively interested in it, many having contributed to the project themselves.
It is worth noting that The Book of MPub was not conceived as a commercial project. We had no sales goals, and did not set a price for the book; our aim was entirely promotional. Within such a project there is no meaningful way to measure the effect of this promotional outreach on sales. What can be said, however, from a promotional standpoint, is that the web traffic for The Book of MPub outpaced web traffic to the MPub program website by as much as two to one over the first two months of the project’s existence.
The Book of MPub, once launched, was linked and available online in four formats: as a downloadable PDF, as a downloadable EPUB ebook, as the original WordPress-based CMS (with the review comments in place as well as complete project meta-documentation), and as a growing list of links to print-on-demand providers, including Lulu.com, BookRiff.com, and Espresso Book Machines in different Canadian cities, where the print book was available. The proliferation of links to available versions of the book points to a final facet of agile, web-based publishing. First, the list of points of access to The Book of MPub is growing, and the resulting network of interlinkages is itself a key part of the book’s presence, as opposed to its availability being tied to a particular distribution partner or retailer.
Second, there is some—we would argue naturally occurring—variation between the different versions of the book. Most blatantly, the native WordPress-hosted version of the book contains far more content, from the review comments to the meta-documentation about the book’s publication process. More subtly, the printed versions of the book feature some copyfitting and pagination tweaks. The EPUB version has some technical differences resulting from the need to produce a perfectly valid EPUB file. Among the versions available from print-on-demand partners, there are differences in pagination, front matter, and cover.
How to account for these differences? Should we worry about them? First, some of the changes result from technical requirements in the production and distribution stages and could, theoretically, be reconciled across all versions. However, in the real world, the practical benefits of doing so are not worth the time and expense, even though the details are trivial. Reconciling all changes would be for the sake of theoretical perfection, not practical utility. More importantly, though, the proliferation of variations and differences among a ‘live’ and growing set of versions is a reasonable consequence of the propagation of the book through its various communities of interest. When interest in the book and its availability ultimately dies down, the book will become static. This is nothing new: traditional publishing follows a similar pattern, producing printings and editions and revisions of a title, though the world of the book maintains the mythology of a perfect, unchanging edition, outside of time. What has changed with the age of the web is that variations in a published text proliferate much more quickly and are more readily identified. We think this is an opportunity rather than a problem to be solved.
The Book on/and the Web
When we set out to build The Book of MPub, our only clear goal was to prototype an agile web-first workflow with multiple output formats. We soon found that the only limitations to the project were the size of our network and the time we had to produce it; we also found that by situating the content on the web first, the workflow grew naturally out of the book’s development needs. Incrementally, over the course of nine weeks, with the contributions of close to sixty people, we developed both a well-edited, polished-looking book and a smooth-running web-to-print workflow encompassing editorial, production, and marketing strategies. Because the investment costs for the technologies and processes we used were almost non-existent and they produced results instantly, the system was easily refined. Publishers are well used to taking advantage of economies of scale, using a one-size-fits-all workflow, but perhaps, in a web-first context, they should consider books methodologically independent, in that an appropriate workflow will emerge naturally to suit the text’s particular rhetorical situation.
A natural consequence of iterative production, multiple formats, and a vast network of contributors is that multiple versions of the text exist concurrently. In The Fluid Text, literary critic John Bryant suggests this intratextual tension is a point of interest more than a problem. Bryant suggests “the reality of a text is located not in its status as a thing but as an action, or rather transaction, between words and readers.” We hesitate to discuss post-structuralism in the pages of JEP, but on the web, on blogs and wikis, these transactions become more clearly evident as the text is imprinted with the interpretations and commentary of readers; the appeal of these formats is that content is dynamic and interactive. If the book is to survive as a durable and valuable cultural object, publishers must first understand the print book as an extension of a social, fluid, digitally native text at a given moment, and not as the definitive and fixed output of a linear development process.
The Book of MPub was born on the web, and its readers have encountered it in such varied formats as a blog, a print book, an email, a tweet, a print-out marked up with red pen. These formats are all The Book of MPub, as are the encounters, all of which have shaped the text in its current incarnation. As Sherman Young has written, “all technologies are systems: a combination of an object, the processes that created that object, and ideas about that object.” If a book is its contents, the process of building it, and the discourse around and with it, if a book is itself a technology, the only way to ensure its continued relevance is to build it to be autonomous and social and open-ended—to build it the web way.
We argue that the book is not threatened because a printed book is not the zenith of a text. Rather, the printed book exists as part of the book’s ecosystem, a lightweight machine built the same way we produce and process all ideas—through interactions, through iterations, through failure and continual refinement. We are faced not with the death of the book but with its apotheosis.
From Barthes’s 1971 essay, “From Work to Text,” available online at http://courses.wcupa.edu/fletcher/special/barthes.htm —though when we went to track down a more definitive source, we found—alas!—a different translation: “The Text is experienced only in an activity, in a production. It follows that the Text cannot stop (for example, at a library shelf); its constitutive moment is traversal (notably, it can traverse the work, several works).” From Roland Barthes, “From Work to Text,” in The Rustle of Language (Berkeley: University of California Press, 1989), 56–64.
Canadian Centre for Studies in Publishing. http://www.ccsp.sfu.ca.
John W. Maxwell, Meghan MacDonald, Travis Nicholson, Jan Halpape, Sarah Taggart, Heiko Binder, “XML Production Workflows? Start with the Web,” Journal of Electronic Publishing 13, no. 1 (Winter 2010), doi: http://dx.doi.org/10.3998/3336451.0013.106.
The Book of MPub, by the SFU Master of Publishing Cohort of 2010, was released April 16, 2010. The book can be found at http://tkbr.ccsp.sfu.ca/bookofmpub — and by following links from the original source, to PDF, EPUB, and a variety of printed versions of the book.
“Agile Software Development,” Wikipedia, accessed August 12, 2010, http://en.wikipedia.org/wiki/Agile_software_development.
Dave Thomas & Andrew Hunt, “Agile Publishing Model,” paper presented at the O’Reilly Tools of Change for Publishing Conference, New York, NY, February 24, 2010, http://www.toccon.com/toc2010/public/schedule/detail/13329.
Michael Fitzgerald, “The Emerging Art of Agile Publishing,” O’Reilly XML.com (March 8, 2006), accessed July 29, 2010, http://www.xml.com/pub/a/2006/03/08/agile-publishing.html.
“Agile Publishing,” MarkLogic website, accessed August 5, 2010, http://www.marklogic.com/information/agile-publishing.html.
“Manifesto for Agile Software Development,” 2001, http://agilemanifesto.org/.
Clay Shirky, in his book Here Comes Everybody: The Power of Organizing without Organizations (Penguin Press, 2008), makes an extended argument for the Internet’s facility for driving the cost of organizing communities to zero or near zero.
TinyMCE is the “visual editor” embedded within most modern CMS toolkits, including WordPress, Drupal, Joomla, and others. It is released as free software. http://tinymce.moxiecode.com/.
Anthologize is a WordPress plugin, first released in August 2010, which allows an editor to arbitrarily organize WordPress content (posts, pages, etc.) into a serialized collection and export to various publication formats, such as PDF, EPUB, and TEI XML. http://anthologize.org
Stephen Ramsay, “Anthologize it,” Stephen Ramsay (blog), August 12, 2010, http://lenz.unl.edu/wordpress/?p=212
The “Start With XML” movement was spearheaded (largely as a series of conferences) by publisher O’Reilly Media and others in 2009. See Mike Shatzkin, “A New Project: ‘StartwithXML, Why and How,’” paper presented to the BISG Annual Meeting, September 12, 2009, http://www.idealog.com/a-new-project-startwithxml-why-and-how
EPUB is the standard ebook file format developed by the International Digital Publishing Forum. http://www.idpf.org/
For a scholarly overview, see Maxwell et al. 2009. The Ickmull transformation software—composed as a free-licensed XSLT script—was developed at SFU’s Canadian Centre for Studies in Publishing with considerable help from Keith Fahlgren of Threepress Consulting Inc. The script is available at http://code.google.com/p/ickmull
On “open peer review” see Kathleen Fitzpatrick, Planned Obsolescence, a book forthcoming from NYU Press but available for open peer review at http://mediacommons.futureofthebook.org/mcpress/plannedobsolescence/
John W. Maxwell is Assistant Professor in the Master of Publishing Program at SFU. His research and teaching focus is on the evolution of practical publication technologies, the emergence of digital genres, and the history of digital media. Follow him on Twitter @jmaxsfu
Kathleen Fraser is a Master of Publishing student whose research has focused on the intersection of technological development and editorial practices. Kathleen is the editor at Hur Publishing and works at Caitlin Press. Follow her on Twitter @Kathleen_Fraser