Author: Deborah Lines Andersen
Title: The Google Library
Publication info: Ann Arbor, MI: MPublishing, University of Michigan Library
December 2004

This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact for more information.

Source: The Google Library
Deborah Lines Andersen

vol. 7, no. 3, December 2004
Article Type: Benchmark

Benchmark: a standard by which something can be measured or judged. [1]

Organizing the World's Information

As the world continues to become more and more digital, it is not surprising that one of the biggest players in the digital market would start looking to new venues for information organization and access. Google announced on December 14, 2004 that it would form a partnership with Stanford University, the University of Michigan, Harvard University, the New York Public Library and Oxford University's Bodleian Library "to digitally scan books from their collections so that users worldwide can search them in Google."  [2]

Size and Scope of the Project

This project is an enormous undertaking. According to one source referenced below, the projected numbers of items are:

  • Stanford University: all eight million volumes
  • University of Michigan: all seven million volumes
  • Harvard University: pilot of 40,000, out of 1.5 million
  • New York Public Library: pilot, expand to 20 million items
  • Oxford's Bodleian Library: one million public domain volumes.  [3]

This would suggest that over the course of the next several years Google and these libraries will scan and digitize over 36 million volumes. Even if each volume were only 100 pages long, this would be an extraordinary undertaking, costly in time, talent, and funds. Google and several of the sites have stated that books will not be damaged and, in fact, will not be scanned if they are too fragile to undergo the process. There have also been assurances that books will be returned to the shelves of their home libraries after this process.  [4] This will be an interesting exercise in bibliographic control. Inevitably there will be duplication across the libraries. Either Google will pay for multiple scans of the same work, or staff will spend time deciding whether works are indeed identical.
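The arithmetic behind these figures can be checked with a short sketch in Python. It simply totals the projected counts from the list above and multiplies by the 100-page figure used in the text. The names `PROJECTED_ITEMS`, `total_volumes`, and `total_pages` are illustrative assumptions, and counting Harvard at its 40,000-volume pilot is the reading of the list that yields a total just over 36 million.

```python
# Projected item counts as given in the list above (see note 3).
# Assumption: Harvard is counted at its 40,000-volume pilot, and the
# New York Public Library at the 20 million items its pilot may reach.
PROJECTED_ITEMS = {
    "Stanford University": 8_000_000,
    "University of Michigan": 7_000_000,
    "Harvard University (pilot)": 40_000,
    "New York Public Library": 20_000_000,
    "Bodleian Library, Oxford": 1_000_000,
}

# The deliberately low per-volume figure used in the text.
PAGES_PER_VOLUME = 100


def total_volumes(items):
    """Sum the projected volume counts across all libraries."""
    return sum(items.values())


def total_pages(items, pages_per_volume=PAGES_PER_VOLUME):
    """Estimate the pages to scan at a fixed number of pages per volume."""
    return total_volumes(items) * pages_per_volume


if __name__ == "__main__":
    print(f"{total_volumes(PROJECTED_ITEMS):,} volumes")
    print(f"{total_pages(PROJECTED_ITEMS):,} pages")
```

On these assumptions the total is 36,040,000 volumes, or roughly 3.6 billion pages at 100 pages each. Duplicate holdings across the five libraries would lower the number of distinct works.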

Books that are in the public domain will be scanned in their entirety. Volumes still under copyright will have only sections scanned for public access, so as not to infringe copyright. One would expect that when a volume enters the public domain it will then become available in its entirety. The legal mechanisms of creating partnerships with these libraries, assuring copyright compliance, and allowing access to the volumes during the process have been negotiated with the member libraries.  [5]

Stakeholders in the Process

As with all projects of this size and scope, there are several stakeholder groups that will be affected by the Google Library. Users, writers, publishers, and Google with its investors all have a very strong interest in this undertaking. Libraries and librarians will also be prominent members of this group.

Users, and Students in Particular

There is little question that information users will find this new library a useful service, and that it will be used. Within university communities there are laments about students never using the library on site, choosing instead to work from their dormitory rooms. Given the pace of life, it is very tempting not to drive or walk to the nearest library, instead moving almost seamlessly from doing email (Gmail if one uses Google's own service) to doing research while sitting in front of a computer. One might expect that student papers and research will improve if students can actually use books that have been scanned online, and that have been peer reviewed by reputable publishers who pay attention to quality of information and presentation. Given that abstracts of copyrighted works will be available as well as public domain materials, individuals will be alerted to current materials that might be available through their libraries. It is worth noting that students, undergraduates at least, have a tendency to use what is immediately available to them. Does this mean that they would select older material that is online rather than finding newer materials at the library? For some individuals it probably does.


Writers

Writers will enjoy greater visibility through Google. As is the case today, if one types in an author's name, various press releases and reviews are listed on the Google search screen. Additionally, one can look at purchasing information, new and used, in the paid advertising sidebars. One would expect that none of this will change, and authors will have even greater visibility through an enhanced online library.


Publishers

Google's web site is very specific about what publishers can expect by joining this project. It states that publishers will:

  • Increase their books' visibility at no cost
  • Attract new readers and boost book sales
  • Drive qualified traffic to their website
  • Earn new revenue from Google contextual advertising.  [6]

Google and publishers are both interested in getting more people to pay attention to the information they provide. This seems like an ideal partnership that builds on the strong and increasing presence of Google in the market as well as publishers' need to advertise and sell their books. It remains to be seen whether, at some point in the future, publishers decide to skip the paper altogether and simply make their books available online through Google. Although Google presently does not charge its users, it seems quite plausible that we could move to payment for access, working nonetheless to avert the copyright disaster seen in the case of Stephen King's online serialized novel, The Plant.  [7]


Google and Its Investors

Google has built its market share by selling advertisements, and it became a publicly traded company in 2004.  [8] It maintains a website on investor relations.  [9] Google is the principal stakeholder in this information organization and access arena. Microsoft is probably its prime competitor at this time. Although Google has the apparent advantage, there has been speculation that Microsoft will fight hard to create the standard for the industry and force Google out of the market.  [10]

Libraries and Their Users

Universities and their libraries will also be affected by this project. Google has stated that its mission is "to organize the world's information and make it universally accessible and useful."  [11] Librarians were traditionally the organizers of information, assigning subject headings and call numbers, and arranging books and journals on the shelf so that users would have access to their collections. As the library world has moved to digital journal collections, where access to information does not necessarily mean ownership of paper copies, librarians have continued to be the mediators in the information process. University libraries subscribe to services (usually at a hefty annual fee) that provide access (e.g., Wiley's Interscience) and make that information available online to campus users. How will Google's Library change that process?

To date Google has not talked about adding paper-only journals to its mix of online items, but it does not take a lot of imagination to see this as a logical step in its global information process. Online journals, such as the Journal of the Association for History and Computing, already find themselves indexed through Google. For online and online-only venues this is a critical service that alerts potential users to information that they might otherwise overlook. A new journal or website can rely on Google to do its advertising.

And Historians

Since this journal is primarily a venue for historians and computing (along with archivists and information specialists), it is worth thinking about the effects that such a digital library will have on academic historians, on their students, and on historical scholarship in general.

There are a variety of ways in which this project will enhance the ability of historians to use information. Naturally, not everything on the World Wide Web is particularly relevant to historians, and this project in no way gives access to unique items that are stored in worldwide archives. Nonetheless, for those volumes that do have relevance to historians, and that are included in this project, there will be great benefits in digitization.

Academic historians are notoriously short on travel dollars. The World Wide Web and digital scanning have great potential for making materials available to scholars at their desks. This is access to content and it will be greatly enhanced by making worldwide collections available online. It will be possible to look at the contents of a volume and decide if it contains critical information for the researcher.

In the past one could always order a book through interlibrary loan if it appeared to be pertinent to one's research. This took time and cost the university library a fairly steep fee. In some cases the book was too old to travel. Now scholars will be able to look at contents and see if it is worth spending dollars to travel to the book or if interlibrary loan will yield a really important monograph.

Students of history, who need to read history in order to study it, will probably find this resource of even greater value than their professors do. They have even less money and less travel time. They are also less likely to insist on holding the actual book in their hands and will be happy with a searchable digital equivalent of the work.

A critically important aspect of this digitization project is that book text will be searchable in a way that was not the case before. Scholars will be able to look for instances of particular words, phrases, or patterns of writing in individual texts, not through time-consuming reading and highlighting, but through content-analysis software (e.g., ATLAS.ti or NUD*IST) that streamlines the process. In digitizing these materials Google will change the way that they can be accessed at the word-by-word level.
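To make the idea of word-level access concrete, here is a minimal sketch in Python of a keyword-in-context search, about the simplest query that searchable text permits. It does not describe how Google or the content-analysis packages mentioned above actually work. The function name `keyword_in_context`, its `width` parameter, and the sample sentence are assumptions made purely for illustration.

```python
import re


def keyword_in_context(text, phrase, width=30):
    """Return (left, match, right) tuples for each whole-word hit of phrase.

    Matching is case-insensitive. `width` is the number of characters of
    surrounding context kept on each side of a hit. The word-boundary
    anchors assume the phrase begins and ends with a letter or digit.
    """
    pattern = re.compile(r"\b" + re.escape(phrase) + r"\b", re.IGNORECASE)
    hits = []
    for m in pattern.finditer(text):
        left = text[max(0, m.start() - width):m.start()]
        right = text[m.end():m.end() + width]
        hits.append((left, m.group(0), right))
    return hits


if __name__ == "__main__":
    # Placeholder text, not drawn from any scanned volume.
    sample = (
        "Libraries organize information. "
        "Google hopes to organize the world's information too."
    )
    for left, match, right in keyword_in_context(sample, "information", width=20):
        print(f"{left:>20}[{match}]{right}")
```

Each hit comes back with a little of the text on either side. This is the kind of context a historian would scan to decide whether a passage, and hence the volume, merits closer reading.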

The Downside and the Foreseeable Future

This is an extraordinary partnership on the part of Google and these five libraries. It would appear that it is a win-win situation in which all stakeholders will benefit from the project. There have been some negative reactions nonetheless. Barbara Quint's Information Today article of December 27, 2004 is an excellent summary of issues surrounding the Google Library.  [12] Michael Gorman, dean of library services at California State University, Fresno, and president-elect of the American Library Association, had little good to say about the project in his December 20, 2004 Newsday article.  [13] Among other things, Gorman questions our desire to read a 500-page book online, or our desire to print out 500 unbound pages. He states that context is part of knowledge, so that a snippet from page 250 will not be sufficient if one is interested in a particular topic. (In fact he believes that dictionaries and encyclopedias are good online candidates exactly because they lend themselves to short bits of information.) Finally, and this appeals to the librarian in me, he wonders how these volumes will be indexed. Will one put in a search term and end up with so many hits that the really crucial volume will be lost in the wash of all the others? In the end Gorman and I agree that libraries are not going away. We also agree that scholars will need to consult books in their paper form for the foreseeable future. Historians will benefit from these digitized sources insofar as they can solve important information problems with them. For the rest of the time it will probably be business as usual.


Notes

The date given for each citation is the copyright or issue date of that item. All items were accessed, and were live, on 10 January 2005.

1. "Benchmark," American Heritage Dictionary, 4th ed., 2000.

2. "Google Checks Out Library Books," (2004),

3. Aaron Swartz, "Google Weblog," December 19, 2004, Readers are advised that these numbers were presented on this web site with links to the five library sites and their press releases. In the case of Stanford, the press release stated, "digitizing hundreds of thousands, perhaps millions, of books from the shelves of Stanford libraries and making them available to readers worldwide and without charge." Swartz lists the following press release web sites;

4. "FAQ: The University's Pilot Project with Google," (December 13, 2004), . One of Google's websites states, "No library books were harmed during the making of these digital copies." (2004),

5. See the press releases of the five participating libraries for statements on their negotiations and agreements with Google. Oxford, in particular, goes into the details of the "Oxford-Google Digitisation Agreement" at

6. "Google Print and Publishers," (2004),

7. "Stephen King offers serial novel over Internet," (July 24, 2000),

8. "Google: Be a Little Evil," Technology Review 108 (1: January 2005): 15-16 notes that Google collected 1.67 billion dollars from investors on its first day of public offering.

9. "Investor Relations," (2005),

10. Charles H. Ferguson. "What's Next for Google?" Technology Review 108 (1: January 2005): 38-46.

11. "Google's Partnership with Libraries," (2004),

12. Barbara Quint. "Google's Library Project: Questions, Questions, Questions," (December 27, 2004),

13. Michael Gorman. "Google Library Plan: A Miss Not a Hit," Newsday (December 20, 2004),