    16.4 Cost data

    We developed a variety of sources of data in the online books evaluation project. We conducted surveys online, by mail, by telephone and in class. We also conducted individual in-person and telephone interviews of scholars and a number of focus groups involving users, potential users, and librarians. In this report, we focus on cost analyses and on Web data.[2]

    In a traditional print production environment, preparing texts for online access incurs an additional cost. We found a wide range of estimates for this cost, from four cents per page to more than $2.00 per page, which works out to approximately $100 to $1000 per title. The range reflects the enormous variation, at the time, in the format and quality of the source files supplied by publishers and in the conversion processes employed by various projects. Achieving the low-end cost requires a very standard and well-behaved PostScript source file. In addition, these figures include some unknown component of experimentation cost, as this project and others adapted to variations in input and in desired presentation format.

    Table 16.1: Sample e-book production costs ($)
    Cost   Unit         Item
    1.51   per page     Conversion: OCR or SGML to HTML
    1.00   per page     Conversion: ASCII to HTML
    0.04   per page     Conversion: PostScript to PDF
    20.00  per title    Conversion management
    1.00   per MB/year  Server maintenance

    In Table 16.1 we present some sample electronic book production costs. Two conversion routes lead to HTML: one from OCR (or from SGML), and a somewhat less expensive one that begins with ASCII. Conversion from PostScript to PDF is done using software from Adobe and yields a cost of about four cents per page. Note that this process, which has been tested at the University of Pennsylvania, does not yield fully navigable HTML files, but yields PDF output only. Management of conversion is estimated to have cost about $20/title at Columbia. Maintaining books on the server has become steadily less expensive, and by the end of 1999 was estimated to cost about $1/megabyte per year.
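
    As an illustration of how the rates in Table 16.1 might combine for a single title, the following minimal sketch adds the per-page conversion rate to the per-title management cost. The page count (300) and file size (5 MB) are hypothetical, as are the function and variable names. Treating conversion and management as simply additive is our assumption, not something the table states.

```python
# Rates taken from Table 16.1 (US dollars).
RATE_PER_PAGE = {
    "ocr_or_sgml_to_html": 1.51,
    "ascii_to_html": 1.00,
    "postscript_to_pdf": 0.04,
}
MANAGEMENT_PER_TITLE = 20.00   # conversion management, per title
SERVER_PER_MB_YEAR = 1.00      # server maintenance, end-of-1999 estimate


def conversion_cost(route: str, pages: int) -> float:
    """One-time cost for a title: per-page rate times pages, plus management.

    Assumes the two components are additive (an assumption for illustration).
    """
    return RATE_PER_PAGE[route] * pages + MANAGEMENT_PER_TITLE


def hosting_cost(size_mb: float, years: int) -> float:
    """Undiscounted server cost, holding the per-MB rate constant over time."""
    return SERVER_PER_MB_YEAR * size_mb * years


if __name__ == "__main__":
    pages, size_mb = 300, 5  # hypothetical title
    for route in RATE_PER_PAGE:
        print(f"{route}: ${conversion_cost(route, pages):.2f}")
    print(f"hosting for 10 years: ${hosting_cost(size_mb, 10):.2f}")
```

    Because server costs were falling steadily, holding the hosting rate constant probably overstates the long-run maintenance cost.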

    A fully electronic production process (bypassing print) would be less expensive. Through conversations with scholarly publishers, we have been able to estimate that the potential savings from moving to an online format without paper would be about 10% at the plant (that is, changes in typesetting costs) and perhaps an additional 15% in costs avoided for paper, printing, and binding. Costs associated with warehousing and shipping, which we did not attempt to estimate, would also be eliminated.

    On the other hand, there are offsets to these savings for online production. They include the costs of customer service, continuing file maintenance, and migration. The latter two, which are archival functions, are very important. A rational economic publisher will maintain the file for a book only as long as the total discounted expected future revenue from sales exceeds the total discounted projected cost of keeping the file. Thus, libraries cannot rely on publishers to maintain the files of books with very low demand, unless they are willing to pay service fees that cover the publishers' expenses.
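
    The publisher's decision rule described above can be sketched as a comparison of two present values. This is a minimal illustration only: the yearly revenue and cost figures are invented, the 5 percent rate is chosen for convenience, and the function names are hypothetical.

```python
def present_value(cash_flows, rate):
    """Discount yearly amounts to today; cash_flows[0] falls one year from now."""
    return sum(cf / (1 + rate) ** t for t, cf in enumerate(cash_flows, start=1))


def keep_file(expected_revenues, expected_costs, rate):
    """True if discounted expected revenue exceeds discounted maintenance cost."""
    return present_value(expected_revenues, rate) > present_value(expected_costs, rate)


def shortfall(expected_revenues, expected_costs, rate):
    """Present value a library service fee would need to cover (zero if none)."""
    gap = present_value(expected_costs, rate) - present_value(expected_revenues, rate)
    return max(gap, 0.0)


if __name__ == "__main__":
    rate = 0.05
    costs = [12.0] * 5                       # hypothetical yearly upkeep
    steady_seller = [40.0, 20.0, 10.0, 5.0, 2.0]
    low_demand = [3.0, 2.0, 1.0, 1.0, 0.0]

    print(keep_file(steady_seller, costs, rate))   # revenue outweighs upkeep
    print(keep_file(low_demand, costs, rate))      # upkeep outweighs revenue
    print(round(shortfall(low_demand, costs, rate), 2))
```

    With these invented figures the low-demand title fails the test, and the shortfall is the amount, in present-value terms, that service fees would have to cover for the publisher to keep the file.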

    From our review of the literature, we have prepared an estimate of life cycle costs to the library for online and paper books. These costs, projected over a thirty-year life cycle and discounted at a 5 percent real cost of money, are lower for online books. Our summary is shown in Table 16.2. The difference is essentially equal to the avoidance of the costs of managing circulation. In addition, long-run costs for online books would likely be quite a bit lower, because copy cataloging would prevail rather than the original cataloging that was performed, and included in the costs, for this project. Original cataloging costs about $25 per title, while copy cataloging would cost significantly less per title.

    Table 16.2: Estimated life cycle costs ($)
                              Print    Online
    Acquisition/Processing    47.00    39.00
    Storage/Maintenance       14.00    38.00
    Circulation               44.00    (included above)
    TOTAL                    105.00    77.00
    NOTE: Calculated over a 30-year life, at a 5% discount rate.
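
    The totals in Table 16.2, and the discounting convention stated in the note, can be checked with a short sketch. The component figures come from the table. The annuity calculation is our illustration of what a 30-year, 5 percent discounting implies if a cost is spread as a level end-of-year payment, which is an assumption and not necessarily how the underlying estimates were built.

```python
# Discounted life cycle cost components per title, from Table 16.2 (US dollars).
LIFE_CYCLE = {
    "print": {
        "acquisition_processing": 47.00,
        "storage_maintenance": 14.00,
        "circulation": 44.00,
    },
    "online": {
        "acquisition_processing": 39.00,
        "storage_maintenance": 38.00,  # circulation is included in the rows above
    },
}


def annuity_factor(rate: float, years: int) -> float:
    """Present value of 1 dollar paid at the end of each year for `years` years."""
    return (1 - (1 + rate) ** -years) / rate


totals = {fmt: sum(parts.values()) for fmt, parts in LIFE_CYCLE.items()}
saving = totals["print"] - totals["online"]
factor = annuity_factor(0.05, 30)

if __name__ == "__main__":
    for fmt, total in totals.items():
        print(f"{fmt}: ${total:.2f}")
    print(f"online saving per title: ${saving:.2f}")
    print(f"30-year annuity factor at 5%: {factor:.4f}")
    # Level yearly payment whose present value equals the online storage figure
    # (illustrative only; assumes level end-of-year payments).
    print(f"implied yearly online storage cost: ${38.00 / factor:.2f}")
```

    The sums reproduce the table's totals of 105.00 and 77.00, a difference of 28.00 per title. The annuity factor of about 15.37 shows how strongly thirty years of recurring costs are compressed by discounting at 5 percent.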

    Design Considerations: Librarians

    We conducted focus groups with librarians to identify market and design features that they consider important in building a collection of online books. The first feature to emerge is the ability to search across selected groups of titles. A second, rather technical issue is the existence of "stable, granular" URLs. Stable means that the URLs remain the same over time, or at least that the system does not have to be manually updated. Granular has to do with the level of specificity with which a user can access a book. In the Columbia approach to online books, an individual file corresponds to a chapter within a book. We found that librarians want good bibliographic control of online books, with direct linking from the catalog into the book. They would also like to see usage data on individual titles in some standard form. This usage data can feed back to rationalize online book acquisition policies. Finally, librarians want to be assured that an online book system will support reliable migration to new platforms.

    Design Considerations: Scholars

    Both in-depth interviews and focus groups with scholars generated a somewhat different list of desired design features. Scholars would like to be able to move directly into the online book via a direct link from the online catalog. They would like to be able to define groupings of texts on the fly, and search across that collection of texts. They would like a comprehensive and detailed table of contents, with direct linking into the book (providing, in effect, analytic indexing). When images are a significant part of the text, they would like to see browsable, linked, thumbnail images. They would like screens and displays supporting the ability to show two nonconsecutive pages at once, permitting comparisons. They would like to be able to see footnotes and text displayed in parallel on the same screen, even if the "footnotes" are actually endnotes. They would also like to see pagination matching the print version, not only for navigational bearings, but also because, frequently, the citation that led them to a book specified a particular page.

    Scholars would prefer that, whenever the collection contains the relevant material, references be hyperlinked directly into the cited material. They would also like to be able to link to a dictionary. They would like to be able to adjust fonts and formats for easier reading on screen. They would like to have annotation and highlighting capability that they could store with the book. They also expressed an interest in having the ability to share annotations on a single text.