Authors: Laura Micham, David Faulds
Title: Making British Heritage Available on the World Wide Web: The State of Digitization in Special Collections Librarianship in Great Britain
Publication Info: Ann Arbor, MI: MPublishing, University of Michigan Library
November 1999
Availability:

This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact mpub-help@umich.edu for more information.

Source: Making British Heritage Available on the World Wide Web: The State of Digitization in Special Collections Librarianship in Great Britain
Laura Micham, David Faulds


vol. 2, no. 3, November 1999
Article Type: Article
URL: http://hdl.handle.net/2027/spo.3310410.0002.303

Making British Heritage Available on the World Wide Web: The State of Digitization in Special Collections Librarianship in Great Britain

Laura Micham

lmicham@emory.edu

David Faulds

david.faulds@yale.edu

Digitization of historical materials by librarians has become an essential part of providing access to these special collections in university libraries in the United States. Irrespective of the type of collection, the number of staff, or the size of the budget, American special collections librarians have, as a group, put a high priority on establishing a presence on the World Wide Web. That web presence often includes digital facsimiles of primary source materials, thus changing the way that historical research can be done. This has not been the case for many libraries in British universities. This paper aims to explore ways in which digitization has been adopted by special collections librarians in Great Britain (England, Scotland, and Wales) in order to increase awareness and understanding of these new methods of access to materials in British repositories amongst scholars and students who seek them as well as amongst special collections librarians in the U.S. who engage in digitization.

I. Introduction

Great Britain is comprised of three distinct entities, England, Wales, and Scotland, and has existed in this form for over 200 years. Confusion often exists over the difference between Great Britain and the United Kingdom. The United Kingdom is Great Britain and Northern Ireland. This study focused on the state of digitization in Great Britain, specifically on special collections libraries and librarians working in British universities.

Within this framework, British heritage consists of the documentary history of Great Britain that is housed in university libraries. These artifacts range from ancient and medieval illuminated manuscripts to modern photograph collections. Access to these materials is vital for people who study or teach British history. For this reason, special collections librarians are expanding their traditional role as care givers to embrace a new role as purveyors of electronic archives to a global audience.

This new role involves a process called digitization. Digitization can be broken down into three main areas of work. The first is the selection of images and collections to be digitized. Many factors must be considered during this phase, such as conservation and preservation of the items involved and relevance to prevailing trends of historical research. The second activity involves the conversion of these items into an electronic form using scanning technology. This activity also requires contextualization, or the association of text with an image that identifies it and explains its significance and connection to the rest of the collection. The third activity is delivery. The prevailing mode of delivery for digital archives is the World Wide Web. Delivery on the World Wide Web involves a set of tasks that enable a researcher to access and effectively use a digital archive.
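The contextualization step can be pictured as attaching a small descriptive record to each scanned image. The following is a minimal sketch only, not a description of any system used by the projects discussed in this paper; the class name, field names, and sample values are all hypothetical and were chosen purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class DigitizedItem:
    """One scanned item plus the text that gives it context (hypothetical schema)."""
    identifier: str         # shelfmark or reference assigned when the item was selected
    image_file: str         # file produced by scanning (the conversion step)
    description: str = ""   # text that identifies the item
    significance: str = ""  # text that explains why the item matters
    collection: str = ""    # the collection or record series the item belongs to

    def is_contextualized(self) -> bool:
        # Treat an item as contextualized only when all three context fields are filled in.
        return all(s.strip() for s in (self.description, self.significance, self.collection))


def ready_for_delivery(items):
    """Return only the items whose context is complete, in their original order."""
    return [item for item in items if item.is_contextualized()]
```

In this sketch an image with no accompanying text is simply held back from delivery, which mirrors the point that a scanned image separated from its context is of limited use to a researcher.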

It is clear that British librarians are taking advantage of the Internet to provide substantial visual and interactive information resources, and that the availability of government funding has been a major factor in enabling these important developments. Due to financial and network constraints, British special collections librarians came later to digitization than American practitioners. As a result, British librarians had the opportunity to learn from the mistakes and successes of their American counterparts.

This paper aims to show the process and the products of the British approach to digitization. In order to do this, the authors will explore ways in which digitization has been adopted by special collections librarians working in university libraries in Great Britain. An awareness and a recognition of this process is critical both for American special collections librarians who want to collaborate with their British colleagues and for historians and other researchers who work with the materials "housed" in digital archives in Great Britain.

Almost all funding for education and digitization emanates from the central government. In order to understand the relationship between the government and higher education, the authors will provide an overview of university education in Britain, specifically its organization, funding, and approach to automation. Then, to understand the relationship between higher education and the process of digitization, the nature of library education will be examined as it applies to special collections. With this foundation, a meaningful discussion of the process of digitization in Great Britain is possible. This discussion will include the results of a survey of British university library World Wide Web sites; an exploration of the British approach to common issues involved in digitization including the role of professional organizations and conferences, copyright in an electronic environment, and preservation of digital materials; and, finally, a presentation of case studies of digitization projects to illustrate the way in which British special collections librarians have incorporated this new technology into their work.

II. The British Higher Education System

Great Britain has a little over 100 universities. The seemingly small number of universities in Britain versus the U.S. is easily explained. While many of the institutions in the U.S. are small colleges with small numbers of students, the universities in Great Britain are mostly large in size, relative to the population, each with at least 5,000 students. In addition, in Great Britain colleges do not exist in the American sense as undergraduate-only institutions. All British universities offer post-graduate degrees, although some institutions offer them only in a small number of subject areas.

While in the United States the ideal of a well rounded education requires that an undergraduate science student take humanities classes and vice versa, this concept does not exist in Britain. In short, classes are usually restricted to the title of the degree. For example, a history student will spend three years studying history and little else. Consequently, the population of undergraduate students doing historical research may be slightly smaller in Britain than in the U.S., but the intensity of their educational program results in earlier and more sustained need for primary source materials.

Funding

By far the largest source of money for higher education institutions of Great Britain is from the central government which provides a grant through the higher education funding bodies in England, Wales and Scotland established following the Further and Higher Education Act of 1992. In England alone the government provides £3.87 billion ($6.1 billion) for higher education. [1]

Funding for higher education is also provided by grants from research councils which fund individual research projects and support related postgraduate training. The UK research councils were established by Royal Charter to fulfill objectives set out by the government in the White Paper entitled "Realising our potential" which was released in May of 1993. [2] The scope of the Councils is limited to science and social science. Accordingly, research councils are not generally a funding source for digitization projects involving historical materials.

In addition to the government grants and research council money, British higher education is supported by some private funding, but not nearly to the same extent as in the United States. Most of the private money entering British universities is for scientific or industrial research. One beneficiary is the University of Loughborough, which has very close ties to the automotive industry. The private sector occasionally supports digitization projects such as the Bodleian Library/Toyota City Imaging Project at Oxford that will be discussed in greater detail later in this paper.

The final funding source for universities is tuition fees paid by Local Education Authorities (LEAs) and, in a recent development, by the students themselves. LEAs are organized by county in England and Wales and by region in Scotland. Principally LEAs manage primary and secondary education but, through tuition fees, they contribute to tertiary education as well. [3] The paying of fees by students was introduced for the first time in the 1998/1999 academic year. Full-time students starting university in 1998 had to pay up to £1000 (about $1600) a year towards their tuition fees. Most students, however, are eligible for varying degrees of support towards meeting this expense. The government will automatically meet the remaining cost of running the courses. [4] This means that the government still pays 3/4 of the cost of running the course. The end result is that an undergraduate degree from Oxford University costs approximately 1/3 as much as an undergraduate degree from Harvard.

The 1990s have been a critical decade for education in Great Britain. The British system of funding for higher education is in a transitional period with the establishment of dedicated funding bodies, the introduction of tuition, and, of course, the dramatic growth in and focus on electronic resources including the World Wide Web. Government officials, university administrators, educators, and librarians have been engaged in a process of reshaping higher education to meet the demands and challenges of the information age. The development of information technology in Great Britain is at the core of this process. Information technology, in turn, is central to digitization and the forms that this approach to resource provision has taken.

III. British Higher Education: The Evolution of Information Technology

In order to appreciate how digitization has come about in Great Britain, it is important to understand the way that information technology in general has developed. Even small-scale digitization projects are not possible without an infrastructure to support them. This infrastructure has been and continues to be developed by an advisory committee of the funding councils for England, Scotland and Wales, and the Department of Education for Northern Ireland. The committee is called the JISC, or the Joint Information Systems Committee. [5]

The Joint Information Systems Committee (JISC) was formed in March 1993. The constitution of the JISC stipulates that "there should be a national dimension to the activities, that activities should be technologically based, that the JISC involvement should demonstrably add value beyond that which could be achieved by institutions acting individually or collectively, and that activities could not be performed as well and more appropriately by institutions or by another body." [6]

The JISC currently funds data centers, data services and projects using a number of different delivery modes, which contribute to the electronic content available to the Higher Education community. Most of the material provided has taken the form of bibliographic datasets in a number of disciplines, and statistical and quantitative data in the sciences and social sciences. The JISC also funds electronic journals and the digitization of text, manuscripts and images.

DNER and JANET

The JISC is responsible for the two most important components of the UK's approach to information technology for higher education. The first component is the development of JANET, an acronym for Joint Academic Network, the collection of networking services and facilities which support the communication requirements of the UK education and research community. [7] JANET connects several hundred institutions, including all universities, most colleges of higher education, most research council establishments and other organizations that work in collaboration with the academic and research community. JANET is fully connected to other Internet service providers in the UK and is also connected to the National Research Networks (NRNs) in Europe. It forms part of the global Internet providing access to the U.S. and the rest of the world. As a computer network JANET facilitates the exchange of data and information between connected institutions.

The newest iteration of JANET is called SuperJANET. [8] SuperJANET is the name used for the broadband, or high-speed, part of JANET. The name was coined in 1989 for an initiative aimed at providing an advanced optical fiber broadband network for the higher education community at an affordable price. The SuperJANET project has transformed the JANET network from one primarily handling data to a network capable of simultaneously transporting video and audio as well as data. Consequently, SuperJANET is also vital to the delivery of digital images and other graphics, the key components of digitization.

JANET and SuperJANET support the second major component of information technology in higher education, the DNER, or Distributed National Electronic Resource. [9] The DNER aims to ensure that the Higher Education community will have access to the extended range of electronic information resources which is becoming available in the UK. The national electronic resource will be distributed in the sense that networked information does not need a central repository. The JISC will make content free for members of higher education institutions (HEIs) by means of authentication and security mechanisms. Copyright and licensing will be supported through the creation of broad and inclusive license terms and cost effective national and regional licensing arrangements. The JISC states that it is committed to long-term preservation of the data and its continuing availability to the HE community beyond any contractual relationship with the data supplier. Accordingly, electronic datasets deemed to be of lasting national value to the HE community will be acquired, maintained, migrated, and archived to appropriate guidelines and standards.

Digitization projects in Great Britain are logical elements of the Distributed National Electronic Resource. Accordingly, the JISC and the national funding councils have taken an active interest in digitization as an approach to information delivery for scholarly research. The councils, sometimes through the JISC and sometimes directly, have created a complex network of resources, services, programs, and initiatives to facilitate every aspect of information technology including the complicated process of digitizing primary source materials for the study of history.

This support system is made up of independent groups, each with a name, a remit (a well-articulated statement of mission and objectives), a physical location, a well developed and maintained web presence, a full-time professional staff, a budget, and a close connection to its sister groups played out in regular meetings, collaborative projects, and promotion and awareness efforts. These groups are responsible for everything from providing guidance on computer graphics to compiling and preserving subject-specific datasets, training instructors to incorporate digitized materials into classroom teaching, outsourcing digitization services, and, for those who do their own digitization projects, providing a support network in the form of training, discussion lists, resource and information sharing, and consultants. There are almost a dozen groups, programs, or initiatives whose constituencies include special collections librarians and whose work includes facilitating the digitization of primary source materials for historical research.

In effect, all of the ingredients exist for special collections librarians to incorporate digitization into their operations. What sort of information technology training is given to students in library school? What follows is a discussion of library education and its implications for special collections librarianship as a discipline and for the role of information technology in the practice of administering special collections.

IV. Special Collections Librarianship

There is variable provision for the education of special collections librarians in Great Britain. A number of library schools offer a qualification in archives and records management but rare books librarianship is not a specific focus of British post-graduate education. Indeed, there is no distinct qualification available to rare book librarians. Accordingly, librarians who work with rare books come from many different areas of librarianship.

It is important to note that when a British librarian or archivist speaks of manuscripts they are most often speaking of materials that were produced before the invention of the printing press. It follows then that, in Great Britain, successful completion of the master's degree in Archives and Records Management requires that each student has some knowledge of medieval Latin. In addition to medieval and diplomatic paleography, the curriculum includes classes that deal with contemporary records management, archival description, management of archives, and the history of the British administrative system. Finally, there is typically one course that deals with information technology. This class covers basic aspects of information technology from email and word processing to HTML, the language used to create World Wide Web pages. Digitization is not yet a part of this curriculum.

In the United States many library science programs run alongside information science in a School of Information and Library Science. While this also occurs in Great Britain, the prescribed nature of the master's degree and its short duration mean that students cannot opt to take IS classes as they can in the United States. As a result, students have fewer opportunities to get training in and experience with information technology. Most special collections librarians involved in digitization, therefore, receive their training on the job under the tutelage of skilled colleagues. Others have done self-directed study augmented by continuing education workshops and other training offered through JISC programs. [10] It is clear that all of these librarians have an affinity for and a commitment to information technology and its applications to the work of special collections.

V. Digitization

Digitization of primary source materials is a complex, multi-faceted process that requires significant resources and special skills. Special collections librarians in Great Britain are faced with the same challenges as their counterparts in the U.S.: large processing and cataloging backlogs, small staff numbers, and a general lack of resources to undertake the basic operations of their repositories. In this atmosphere, it's unsurprising that only a small number of well-funded institutions have had the ability to create significant digital archives. Because of the obvious value of having a presence on the World Wide Web and the appeal of enhancing the usual information provided with digitized images of materials from the collections, several special collections departments in Great Britain have incorporated at least a selection of images into their web sites.

Many of these departments have also created larger collections of images available through the World Wide Web. But only a few of these projects can be seen as serious research tools, because most lack context for the images. If an item, text or graphic, is separated from the record series in which it exists, it loses its context and, by extension, some part of its value for research. Therefore, the only digitization projects that can be described as tools for research, and not exhibits or outreach tools, are large datasets. These collections of data do not have to be exhaustive, but must be complete enough in and of themselves that historians and other researchers can draw compelling and viable conclusions based on the data they contain.

The Survey

In order to get an idea of who is doing digitization and what forms it has taken, the authors did a survey of World Wide Web sites associated with special collections departments and centers in every university in Great Britain. To augment this data (Appendix A), the authors conducted onsite interviews with project leaders of three major digitization projects either completed or still in progress.

The survey of web sites sought to establish whether a given university has collections of unpublished primary source materials and if these collections are contained within an administrative structure that can be described as a repository. The survey also resulted in the collection of data about the presence of any digitized images of holdings and the existence of collections or thematic exhibits of digital images. Interviews with project leaders helped to flesh out the picture of the process and products of digitization projects. We were interested in funding sources, staff numbers, digital collection size, either completed or planned, technical aspects of the process including hardware, software, workflow, and methods of storage and delivery, and reactions about the purpose and process of digitization.

Our data lead us to make the following general observations about who is engaging in digitization in Great Britain and for what purpose or purposes. The institutions that have created significant collections of digitized primary sources have funding sources external to the library and, in most cases, the university. These projects are funded by the business sector or more often by the higher education funding councils. The allocation of funding is based on a competitive process similar to the grant system in the U.S. Institutions doing large scale digitization possess collections of international interest and, for the most part, staff with the highest level of training in information technology. Some of the projects have been inter-institutional, taking advantage of a larger and more diverse staff and resulting in a collection of greater value than those contained at any one institution.

The institutions and individuals involved in digitization projects are cooperative not only in the context of inter-institutional projects, but also as they pursue disparate projects. This cooperation exists for several reasons. It exists primarily because Great Britain is a small place. Librarians who work with special collections are a small group. Special collections librarians who do digitization are an even smaller group. It stands to reason that they will encounter each other during the course of their careers and may even work together at some point.

Another reason that digitization comes about in a cooperative environment is that almost every project is funded and facilitated by the central government. As has been described earlier in this paper, every aspect of information technology in higher education emanates from the central government. The success of this approach depends on cooperation and a certain degree of uniformity.

The nature of publications, conferences, and other group activities means that this small population has many opportunities to collaborate and develop theory and practice. What has come out of this pioneering work is a series of large digital archives maintained by universities all over Great Britain. These new digital archives contain materials relevant to many different periods of history and areas of historical research.

Aberdeen University, for example, has digitized the entirety of The Aberdeen Bestiary, an important illuminated manuscript and one of the best of its kind. The project web site contains full-page images and detailed views of illustrations and other significant features that are complemented by a series of commentaries, and a transcription and translation of the original Latin. Librarians at Aberdeen have also digitized the photograph collection of George Washington Wilson, an important Scottish photographer and innovator in the 19th century. [11] These projects were funded as a part of the Higher Education funding councils' Specialised Research Collections in the Humanities initiative.

The Issues

Digitization is more than scanning images for viewing on a computer screen. There are many issues involved and many dimensions to a successful project. The role of professional organizations, the establishment of conferences, the impact of copyright law, and the need for preservation of digital image collections are among the most pressing issues for special collections librarians and their patrons.

The first issue to explore is the role of professional organizations in supporting or resisting the emergence of digitization into the practice of librarianship and archives. The Library Association, the main professional organization for British librarians, has been reluctant to support digitization. In their March, 1998 statement entitled "Archives at the Millenium, Library Association Response," [12] they argue that digitization is not a preservation tool and that the use of digitization diverts money and attention from the important task of preserving documents.

Conversely, The Society of Archivists does encourage the use of information technologies to enhance access to primary source materials. [13] The Society has nine panels that exist to consider what it describes as the "central elements of the profession." One of these, the Information Technology Panel, aims to provide a forum for discussion and to enable Society members involved in the use and development of automated systems and new technologies to keep abreast of new developments in a rapidly changing field.

Conferences also play a vital role in the success of digitization. In addition to the regular meetings attended by representatives from various projects and organizations involved in IT and digitization in particular, a number of important conferences and workshops have taken place. Conferences often come about as a result of co-sponsorship by the JISC and several smaller organizations and programs involved in information technology. These meetings usually have a theme, designated speakers, and working sessions in which practitioners share their experiences, problems, and solutions. Conferences like these take place almost every month in Britain and are well attended and reviewed.

There are also large international conferences. Several of these conferences, which started in the early 1990s and which at least in part concern themselves with digitization, are now annual events, like the Electronic Library and Visual Information Research (ELVIRA) conference, the International Symposium on Networked Learner Support, and the Networked Information in an International Context conference. The collaboration and scholarship that result from these conferences keep the science and the practice of digital librarianship vital and relevant.

Conferences are not the only venues for this sort of exchange of ideas. Almost every new IT project has a formal debut. These events last from one to two days and give project leaders and their staff an opportunity to unveil and demonstrate the "deliverables" of their project. A deliverable can be anything from a subject-specific information gateway to an inter-institutional online public access catalog, a new text search engine or a fully operational digital archive. IT practitioners also attend workshops, symposia, and lecture series all in an effort to learn and teach, and to know and be known. Interviews with participants and reviews of these events indicate that this community is, for the most part, harmonious and highly effective.

Copyright is one of the most complicated issues that any librarian faces. Librarians who create digital archives are faced with even greater complexities. For the purposes of this paper, a brief overview of British copyright policy for digital information will give the reader insight into the differences and similarities to U.S. copyright law.

Copyright law is governed by international treaties, the most important of which are the Berne Convention and the Universal Copyright Convention. These allow for reciprocal protection for nationals from different countries. For example, a US citizen enjoys the same protection in UK copyright law that she would if she were a UK citizen. Thus, the key issue is where the infringement takes place rather than where the material was first created.

Both the U.S. and Britain support an exception to copyright law for educational purposes. What in the U.S. is called "fair use," Britons refer to as "fair dealing." It is defined as "research, private study, reporting current events, criticism or review." [14] Under present UK Copyright law, these exceptions frequently do not extend in practice to works in digital form. Therefore displaying, downloading, copying, transmitting or printing of copyright works in electronic form has to be performed under contract or license. [15] The JISC's study, entitled "Copyright Clearance and Digitisation in UK Higher Education," [16] gives guidelines for complying with copyright law in the context of digital archives. It provides a "code of practice" for seeking permission to publish including determining who the copyright holder is, what to include in a permission request, and how to negotiate a license for usage rights.

Charles Oppenheim, one of Britain's foremost authorities on copyright for libraries and librarians, has interpreted copyright law for digital images, the raw ingredients of digital archives. [17] According to Oppenheim, a digital image, "the data resulting from an image scanned into machine readable form," is considered to be a "Literary Work" under the 1988 UK Copyright Act. Under the latest legislation, copyright lasts 70 years from the date of the creator's death. This is 20 years longer than the original legislation permitted. In addition, copyright law in Britain also allows for the "moral rights" of an author. These give the author of a work the right to be acknowledged as its author and the right not to have the work subjected to derogatory treatment. They also give a person the right not to be falsely named as the author of a work he or she did not create.
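The arithmetic behind the 70-year term can be sketched as follows. This is an illustration only, not legal guidance. It assumes that the term is counted from the end of the calendar year in which the creator died, and it ignores special cases such as works of unknown authorship, unpublished works, and Crown copyright. The function names are hypothetical.

```python
def copyright_expiry_year(death_year: int, term_years: int = 70) -> int:
    """Return the last calendar year in which the work is still in copyright.

    Assumes the term runs for `term_years` from the end of the year of death.
    """
    return death_year + term_years


def in_copyright(death_year: int, current_year: int, term_years: int = 70) -> bool:
    """Return True if the work is still protected at any point in `current_year`."""
    return current_year <= copyright_expiry_year(death_year, term_years)
```

Under these assumptions, the work of a creator who died in 1928 remains protected through the end of 1998, and a librarian could consider it for a digital archive from 1999 onward. Passing `term_years=50` reproduces the earlier, shorter term, which would have ended twenty years sooner.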

It is clear that British librarians must be vigilant when they select materials for inclusion in digital archives. In order to comply with copyright law and still create large digital archives, librarians are working with copyright experts like Oppenheim to develop methods of charging for access, authentication systems for registered or subscribed users, and digital watermarking of images to prevent reproduction or unauthorized usage. These strategies are still very much in the planning and testing stages and, as such, would be an excellent topic for future research.

The most important issue for any special collections librarian is preservation. For librarians who do digitization it is a critical consideration fraught with questions. The National Preservation Office, the Electronic Libraries Programme, and individual practitioners have dedicated a great deal of time and effort to devise solutions to the problem of preservation in an electronic environment. The National Preservation Office has developed a "digital remit" [18] among other strategies for ensuring the availability of digital data for future generations of researchers. The main purpose of this remit is to assemble guidelines for best practice in digital archiving, to coordinate the development of a national digital preservation policy and implementation of guidelines, and to establish and administer the work of the Digital Archiving Working Group (DAWG) which will advise the NPO Management Committee.

The Electronic Libraries Programme, one of the JISC initiatives established to facilitate the implementation of IT in higher education, is sponsoring a project called CEDARS [19], an acronym within an acronym. It stands for CURL Exemplars in Digital ArchiveS. CURL is short for Consortium of University Research Libraries. The main objective of the project is to address strategic, methodological and practical issues and provide guidance in best practice for digital preservation.

The main deliverables of the project will be recommendations and guidelines as well as practical, robust, and scalable models for establishing distributed digital archives. It is expected that the outcomes of CEDARS will influence the development of legislation for legal deposit of electronic materials and feed directly into the emerging national strategy for digital preservation currently being developed through the National Preservation Office.

Digital preservation, "a process by which digital data is preserved in digital form in order to ensure the usability, durability and intellectual integrity of the information contained therein," [20] is the biggest task for builders of digital archives. In order for these datasets to be a worthwhile investment of time and money and remain valuable services to historical researchers, provisions must be made for their continued existence.

Digitization Projects at Oxford University: Case Studies

In order to understand the process of digitization in Great Britain and the people who undertake it, the authors conducted interviews with several project leaders. We found that at one university, the University of Oxford, three major digitization projects are in various stages of completion. These projects are well suited as case studies because they have different funding sources, staffing, workflow, methodology, software and hardware specifications, deliverables, and wildly different subject matter. It is these eight aspects of the projects that we explored with project leaders at Oxford, and which will be the subject of the remainder of this paper. These case studies present a tangible picture of how the process of digitization has unfolded in Great Britain.

The first digitization project at Oxford, and indeed anywhere in higher education in Great Britain, was the Bodleian Library/Toyota City Imaging Project. [21] This project was funded by Toyota Tsusho Ltd. of Toyota City, Japan. Work on the project commenced in 1994 and was completed in 1997. In all, 7,000 motoring and transport images covering car advertising, billboards, and ephemera were digitized: 6,000 images of motor car ephemera in the John Johnson Collection, the most important collection of printed ephemera in the United Kingdom, supplemented by 1,000 images of other forms of transport. The images were selected based on their visual appeal, copyright-free status, and the availability of associated text.

The work involved in this project fell to two librarians. The curator of the John Johnson Collection was responsible for selecting and indexing the images, while a media librarian with an information science background did all of the digitization and designed the database that holds and serves these images. The process of indexing required entering subject headings and other descriptive information into twenty-five different fields in a standard PC database package. The subject headings used were both "home-based," meaning designed for the collection, and taken from the Thesaurus for Graphic Materials. The work of selection, cataloging, and indexing took about two years. [22]

As this work progressed, so did the digitization phase of the project. The digitization process required the creation of 35mm photographic transparencies by the Bodleian Library's photographic studio, followed by outsourcing for the conversion of the transparencies to Kodak Photo-CD images. The media librarian then converted these images using a software package called "Image Alchemy" to a file format suitable for delivery on the World Wide Web. The Kodak Photo-CD process produces images at several levels of quality, from easily browsable thumbnails to high-resolution files. This range has practical value for delivery as well as long-term value for storage. Preservation copies of the highest quality files were maintained for archival storage and potential conversion to future file formats as appropriate. [23]

In order to ensure the long-term use of the cataloging and indexing data, the database of records created by the curator was marked up using TEI (Text Encoding Initiative), a metadata standard similar to the Dublin Core. TEI allows the linking of a text to non-text objects including digital images. Accordingly, it is an excellent method for creating a digital image system that is non-proprietary, making it easily transferable to other systems as required.

Each item in the collection was assigned a single TEI record which includes cataloging and indexing information about the item as well as contextual information about the collection from which it was selected. In order to display the database on the World Wide Web, extra code was written into the records that converts TEI to HTML, the markup language in which web pages are written. The existence of the TEI coding allows the researcher to browse the items in the database by thumbnail image or text, to view images in a range of resolutions, and to perform keyword searches. [24]
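The conversion from a TEI record to a web page can be illustrated with a minimal sketch. It is not the project's code: the record below is a hypothetical, heavily simplified TEI-style fragment, and the file names and the `tei_to_html` function are invented for illustration. It only shows the general idea of one record carrying the title, subject terms, and links to a thumbnail and a full-size image, from which an HTML view is generated.

```python
import xml.etree.ElementTree as ET
from html import escape

# Hypothetical, simplified TEI-style record. The layout and file names are
# illustrative assumptions, not the project's actual encoding.
RECORD = """
<TEI>
  <teiHeader>
    <title>Motor car advertisement</title>
    <keywords><term>Automobiles</term><term>Advertising</term></keywords>
  </teiHeader>
  <text>
    <figure>
      <graphic url="images/item0001_thumb.jpg"/>
      <graphic url="images/item0001_full.jpg"/>
    </figure>
  </text>
</TEI>
"""


def tei_to_html(xml_text: str) -> str:
    """Render one record as an HTML fragment: heading, subjects, linked thumbnail."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//title", default="Untitled")
    terms = [t.text or "" for t in root.iter("term")]
    images = [g.get("url", "") for g in root.iter("graphic")]

    parts = [f"<h1>{escape(title)}</h1>"]
    if terms:
        parts.append("<p>Subjects: " + escape(", ".join(terms)) + "</p>")
    if images:
        # Assumption: the first graphic is the thumbnail, the last the largest image.
        thumb, full = images[0], images[-1]
        parts.append(
            f'<a href="{escape(full)}"><img src="{escape(thumb)}" alt="{escape(title)}"></a>'
        )
    return "\n".join(parts)


html_page = tei_to_html(RECORD)
print(html_page)
```

Because the descriptive data lives in the record rather than in the page, the same record could be rendered differently later without re-cataloging the item.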

The Toyota City project was a success and led the way for other digitization projects at Oxford University. One such project, the Early Manuscripts Project, formerly known as the Celtic Manuscripts Project [25], is a large-scale initiative undertaken to preserve fragile illuminated manuscripts in the Bodleian Library and several college libraries at Oxford. The project receives ongoing funding from the higher education funding councils as a part of the Specialised Research Collections in the Humanities initiative. Its goal is the high-resolution digitization of over 80,000 images of ancient papyri, Celtic, and other medieval manuscripts.

The collection of ancient and medieval manuscripts housed at Oxford University is the second largest collection of rare and fragile materials of its kind in the country. Scholarly use of these materials represents a major preservation dilemma for librarians. Because demand for these materials is increasing every year, librarians and preservationists at Oxford have decided that the creation of "surrogates" is an appropriate solution. Digital surrogates were chosen because they are of a higher quality than microfilm or hard copy facsimiles and because they require less storage space and are more error-proof than their analog counterparts. [26]

Because of the rarity and condition of the collection, staff involved in the project include the curator of the library or collection involved, a conservator, and a preservationist in addition to the project manager and the actual digitizer. If the conservation and preservation staff approve of a manuscript selected by the curator, then it is included in the project. Members of the project have made every effort to create as comprehensive a digital archive as possible while ensuring that each manuscript is left the way it was found. In this way, the digital surrogates can satisfy demands of researchers rather than stimulating more demand than already exists. [27]

Using two high specification digital cameras, a specially designed cradle and light source, each manuscript volume is digitized systematically from beginning to end, capturing all the right side pages (rectos) first, turning the volume in the cradle, and then capturing the left side pages (versos). Page numbers are used to identify the files. Foliation and pagination are done by the curator if page numbers are inaccurate or do not exist in the original. [28]
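Capturing all rectos before all versos means the files are produced out of reading order and must be re-sequenced by their page identifiers. The sketch below illustrates that bookkeeping only. The labels such as `1r` and `1v` and both function names are hypothetical, and it assumes the versos are captured in the same front-to-back sequence as the rectos, which the source does not specify.

```python
def capture_order(folios: int) -> list[str]:
    """Labels in the order the camera sees them: every recto, then every verso."""
    rectos = [f"{n}r" for n in range(1, folios + 1)]
    versos = [f"{n}v" for n in range(1, folios + 1)]  # assumed front-to-back
    return rectos + versos


def reading_order(labels: list[str]) -> list[str]:
    """Re-sequence captured labels so each recto is followed by its verso."""
    side_rank = {"r": 0, "v": 1}
    return sorted(labels, key=lambda s: (int(s[:-1]), side_rank[s[-1]]))


captured = capture_order(3)
print(captured)                 # capture sequence
print(reading_order(captured))  # sequence a reader would expect
```

Sorting on the folio number first and the side second is what allows the image files to be presented in page order regardless of when each side was photographed.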

The resolution of the scanned images ranges from 350 dpi (dots per inch) to 600 dpi, resulting in original file sizes of up to 120 megabytes. The uncompressed originals are stored on the university's massive "hierarchical file server" and backed up on two gigabyte cartridges and magnetic tapes. Compressed files are mounted on the World Wide Web in a simple HTML document. The information provided by the web site includes thumbnail images for browsing that are linked to larger images for research. Each image is accompanied by textual information based on print catalogs and the knowledge of the curators. As of May 1998, 1.5 terabytes of digital images alone had been captured. [29]
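The relationship between resolution and file size can be shown with a short calculation. This is a sketch under stated assumptions: the 9 by 12 inch page and the 24-bit color depth are hypothetical values chosen for illustration and are not taken from the project. An uncompressed image needs one sample per pixel per color channel, so its size grows with the square of the resolution.

```python
def uncompressed_bytes(width_in: float, height_in: float, dpi: int,
                       bits_per_pixel: int = 24) -> int:
    """Size in bytes of an uncompressed scan of a page of the given dimensions."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    return width_px * height_px * bits_per_pixel // 8


# Hypothetical 9 x 12 inch page in 24-bit color at the two quoted resolutions.
for dpi in (350, 600):
    size = uncompressed_bytes(9, 12, dpi)
    print(f"{dpi} dpi: {size / 1_000_000:.1f} MB")
# 350 dpi: 39.7 MB
# 600 dpi: 116.6 MB
```

Under these assumptions a 600 dpi capture is close to the upper file size quoted above, and roughly three times the size of a 350 dpi capture of the same page.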

In addition to the software used to capture the images, the digitizer uses a program designed to compensate for the curvature of the gutters, the parts of the page that come together at the spine. The natural shape of a bound volume is not conducive to the creation of flat and evenly lit pages for scanning. The software client, which was developed in part for this project, solves an important problem in terms of readability of the images and preservation of the volume. Because of this innovative program, the volumes did not have to be opened as widely as they otherwise would have been. Indeed, without this software plug-in, many tightly bound volumes would have been excluded from the project altogether.

The Early Manuscripts Project has experienced several problems such as recruiting digitizers with the specific set of skills required, achieving the best lighting and calibration, and overcoming the network congestion caused by the production of such a prodigious amount of data. The successes have outweighed the difficulties though. The project participants are optimistic that a union catalog will be created and that an authentication system will be added to protect copyright holders (the libraries) and facilitate a clearer picture of the demographics of visitors. [30]

Another project well underway at Oxford University is the Broadside Ballads Project. [31] This project was also made possible with funding from the Specialised Research Collections in the Humanities initiative. Its goal is to make both indexes and full-text images of the Bodleian Library's Broadside Ballad Collection available to the research community.

Broadside ballads are cheap song prints sold by hawkers. They were a popular form of entertainment and news that influenced popular culture from the 16th to the 20th centuries. As such they are vital to the study of music, art, social and literary history. Broadside ballads have been under-utilized because they are ephemeral publications and because they have been poorly described. The Bodleian Library has over 30,000 examples in several major collections. [32]

Until this project began, there was only partial bibliographic control over the broadsides collections. Handwritten entries existed in two separate indexes, reflecting the fact that many ballads occur in variants on different broadsides, and many broadsides contain more than one ballad. Further, computerized index entries existed for only one third of the titles index. Although many of the broadsides contain illustrations, no index of these existed, and their subjects and history in general are poorly known. The Broadside Ballads Project was inspired by a need to preserve these materials and to create a better system for access to the originals and their digital surrogates.

The project goals are: to complete the computerization of the titles index, to computerize the shelf list index, to microfilm the entire collection and scan the microfilms, to create digitized images, to index the illustrations on the broadsides, to integrate the various indexes and link them to the digitized images, and to create a user-friendly public-access interface to the whole. The project began in 1996 and envisages completion in 1999. [33]

The project team consists of a project supervisor who is also a full-time administrator for the Bodleian Library, a cataloger/indexer, and an analyst/programmer. The cataloger/indexer is using the ICONCLASS classification scheme that assigns alphanumeric codes to subject headings in four categories: theme, people, items of dress, and setting. This scheme will allow Windows and/or web-based browsing of both the classification strings themselves and the associated keywords. After a broadside is indexed, a standard catalog record is created for it.

The analyst/programmer performs the microfilm scanning in batches, the quality control, the cleaning of images with image manipulation software, and the creation of a web interface for the digital archive. This interface will allow for viewing thumbnails and higher resolution images, searching the image database by keyword, and viewing cataloging information alongside images. The project team hopes to incorporate sound, transcriptions of the ballads, and commentary as well as multi-media information about tune and woodcut families if funding can be secured for the creation of these value-added features. [34]

Practitioners respond

All of the project leaders interviewed had ideas and opinions about the impact of digitization and the future of the practice. One called for the development of a centralized archive for all data produced as a result of British digitization projects. Another suggested that electronic reference services should be developed that can fully take advantage of these digital archives. The consensus of opinion seemed to be that because of a lack of resources to promote and expand these projects, programs sponsored by the JISC have, in the words of one interviewee, "not really changed the face of UK academic libraries. There are too few projects to do that, and once they're over most seem to have quietly disappeared into the ether." [35]

It is clear that in spite of the challenges past, present, and future of digitization in Great Britain, there have been benefits to practitioners as well as researchers. According to Chris Rusbridge, the Project Director of eLib, the Electronic Libraries Programme, digitization projects and other information technology initiatives "have directly employed several hundred people, and indirectly several hundred more. Many of these people now work in useful positions in museums, universities, colleges, publishers and other organizations, with the skills, contacts and knowledge gained through their [project] days. Many universities are more aware of electronic copyright issues and laws (or at least of the associated pitfalls). And many universities have come to, if not fear, then at least have a little more (sometimes grudging) respect for their 'Library and Information Services.'" [36]

Rusbridge and other leaders and practitioners believe that the success of digitization and other IT projects depends on the transformation of these projects into services: working standards and extensible, adaptable tools for future use. In order to produce high-profile, well-used services there must be dedicated funding, sponsorship, and education grants. Simply put, financial support will lead to sustained staffing, and people are the key to the success of information technology. Therefore, the work of programs like those sponsored by the JISC is far from complete. The force of technological change and paradigm shifting "in the way universities deal with scholarly information and the flows of teaching and research material, require concentrated and organized development work for several years to come." [37]

VI. Conclusion

Digitization represents the convergence of advanced technology and the traditional skills and concerns of the librarian. Creating a digital archive requires the same set of skills as building a traditional archive: selection, preservation, arrangement, description, and provision of access. Advanced technology offers the opportunity to expand the scope of these skills to enter a new paradigm, the global electronic network. The creation of digital archives exemplifies the use of advanced technology by librarians to meet the needs of historical researchers.

British special collections librarians have joined their counterparts in the U.S. as creators of this global electronic network by using the relatively small size of the country and small numbers in their ranks to their advantage. Through national projects, despite a lack of funds, practitioners in Great Britain have been able to focus their resources and achieve broader and more useful digital archives than the piecemeal and ineffective projects that would probably result if the institutions were acting independently. This pooling of resources has also proven an effective way of keeping abreast of global copyright changes, innovations in information technologies, and other crucial issues related to digitization of primary source materials. The effectiveness of this collaborative approach is an indication that regional and even trans-national projects merit a greater emphasis by librarians in the U.S.

It is clear that the use of digitization can greatly expand the potential patron base for special collections and help to justify the high costs of processing collections in both countries. The most important reason for making special collections available in digital formats is that these are the formats in which information is increasingly being created in this century and will be recorded into the future. Future generations of scholars and students will not only be accustomed to electronic media but will come to expect it. This expectation coupled with the inevitability that special collections libraries will be accessioning more and more electronic materials into their collections means that research will depend on these materials. It is the mission of librarians to provide access to this information. In order to fulfill this mission, mastering the art and science of digitization is of primary importance.

Appendix A

Digitization Survey*

Columns, left to right:
Universities in Great Britain (name of institution)
Special Collections department? (existence of special collections as a department, center, or other university-sponsored entity)
Special Collections WWW presence? (existence of a World Wide Web page describing or devoted to SC)
Digitization of any holdings? (existence of at least one digitized image of holdings on the SC World Wide Web site)
Digital Exhibits? (existence of collections or thematic exhibits of digital images on the SC World Wide Web site)

University of Aberdeen Yes Yes Yes Yes (2)
Abertay University No No No No
Anglia Polytechnic University Yes Yes No No
Aston University No No No No
The University of Bath Yes Yes No No
Birmingham University Yes Yes No No
Bournemouth University Yes No No No
Bradford University No No No No
University of Brighton No No No No
Bristol University Yes Yes Yes No
Brunel University No No No No
University of Buckingham No No No No
University of Cambridge Yes Yes Yes No
University of Central England No No No No
University of Central Lancashire Yes Yes Yes Yes (1)
City University, Central London No No No No
Coventry University No No No No
Cranfield University No No No No
De Montfort University No No No No
Derby University No No No No
University of Dundee Yes Yes Yes No
Durham University Yes Yes No No
University of East Anglia Yes Yes Yes No
University of East London No No No No
Edinburgh University Yes Yes No No
University of Essex Yes Yes No No
Exeter University Yes Yes No No
Glasgow Caledonian University Yes Yes No No
University of Glamorgan No No No No
Glasgow University Yes Yes Yes Yes (2)
University of Greenwich Yes Yes Yes Yes (1)
London Guildhall University Yes Yes Yes Yes (2)
Heriot Watt University No No No No
University of Hertfordshire No No No No
Huddersfield University Yes Yes No No
University of Hull Yes Yes Yes Yes (2)
Keele University Yes Yes No No
University of Kent Yes Yes Yes Yes (3)
Kingston University No No No No
Lancaster University Yes Yes Yes Yes (1)
University of Leeds Yes Yes No No
Leeds Metropolitan University No No No No
University of Leicester No No No No
University of Lincolnshire and Humberside No No No No
Liverpool University Yes Yes Yes Yes (4)
Liverpool John Moores University No No No No
University of London Birkbeck College No No No No
University of London Goldsmiths College No No No No
University of London Heythrop College Yes No No No
University of London Imperial College Yes Yes No No
University of London King's College Yes Yes No No
University of London Queen Mary and Westfield College Yes No No No
University of London Royal Holloway Yes Yes Yes Yes (1)
University of London University College Yes Yes Yes Yes (1)
University of London Wye College No No No No
Loughborough University No No No No
University of Luton No No No No
Manchester Metropolitan University Yes Yes Yes No
University of Manchester Yes Yes Yes Yes (10)
Middlesex University Yes No No No
Napier University No No No No
Newcastle University Yes Yes Yes Yes (2)
University of Northumbria No No No No
University of North London No No No No
Nottingham University Yes Yes Yes No
Nottingham Trent University No No No No
Oxford University Yes Yes Yes Yes (4)
Oxford Brookes University Yes Yes No No
Paisley University No No No No
Plymouth University No No No No
The University of Portsmouth No No No No
Reading University Yes Yes Yes Yes (2)
Robert Gordon University Yes Yes No No
St. Andrews University Yes Yes No No
University of Salford No No No No
The University of Sheffield Yes Yes No No
Sheffield Hallam University No No No No
University of Southampton Yes Yes No No
South Bank University No No No No
Staffordshire University Yes No No No
Stirling University Yes Yes No No
The University of Strathclyde Yes Yes No No
Sunderland University No No No No
University of Surrey Yes Yes Yes No
Sussex University Yes Yes No No
University of Teesside No No No No
Thames Valley University No No No No
University of Wales Aberystwyth No No No No
University of Wales Bangor Yes Yes No No
University of Wales Cardiff Yes Yes No No
University of Wales Lampeter Yes Yes Yes Yes (1)
University of Wales College Newport No No No No
University of Wales Swansea Yes Yes Yes Yes (1)
University of Wales Institute No No No No
University of Warwick Yes Yes No No
University of the West of England No No No No
University of Westminster No No No No
University of Wolverhampton No No No No
University of York Yes Yes No No
Total number of institutions: 99
Special Collections department? Yes: 54 (54.5%); No: 45 (45.5%)
Special Collections WWW presence? Yes: 49 (49.5%); No: 50 (50.5%)
Digitization of any holdings? Yes: 24 (24.2%); No: 75 (75.8%)
Digital Exhibits? Yes: 17 (17.2%); No: 82 (82.8%)

Total number of digital collections or exhibits: 35

Average number of projects per institution involved in digitization: 2

*Data gathered 2/99 and again in 9/99 with no changes in the four parameters considered
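The percentages in the survey totals follow directly from the "Yes" counts and the 99 institutions surveyed. The sketch below reproduces them. It assumes that the four sets of totals correspond, in order, to the four survey questions; the names `TOTAL`, `YES_COUNTS`, and `shares` are introduced here for illustration only.

```python
TOTAL = 99  # institutions surveyed

# "Yes" counts per survey question, in column order (assumed mapping).
YES_COUNTS = {
    "Special Collections department?": 54,
    "Special Collections WWW presence?": 49,
    "Digitization of any holdings?": 24,
    "Digital Exhibits?": 17,
}


def shares(yes: int, total: int) -> tuple[float, float]:
    """Return (% yes, % no), each rounded to one decimal place."""
    no = total - yes
    return round(100 * yes / total, 1), round(100 * no / total, 1)


for question, yes in YES_COUNTS.items():
    pct_yes, pct_no = shares(yes, TOTAL)
    print(f"{question} Yes {pct_yes}%, No {pct_no}%")
```

Read this way, fewer than one institution in four had digitized any holdings at the time of the survey, and fewer than one in five offered a digital exhibit.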

Notes

1. Higher Education Funding Council for England: Finance, <http://www.hefce.ac.uk/Finance/default.asp>

2. UK Research Councils, <http://www.nerc.ac.uk/research-councils/>

3. The British Council USA: School and Vocational Education Reform, <http://www.britishcouncil-usa.org/usarefm1.htm>

4. Interview with Deborah Eaton, Librarian, St Edmund Hall, University of Oxford, March, 1999.

5. The Joint Information Systems Committee, Networks and Innovative Services for Higher Education, <http://www.jisc.ac.uk/>

6. The Joint Information Systems Committee, JISC - Constitution and Funding Arrangements, <http://www.jisc.ac.uk/jisc/const.html>

7. JANET, The UK Academic and Research Network, <http://www.ja.net/>

8. General Information about JANET & SuperJANET, <http://www.ja.net/general/>

9. Distributed National Electronic Resource, DNER, <http://www.jisc.ac.uk/cei/dner_colpol.html>

10. Interview with Richard Gartner, Pearson New Media Librarian, Bodleian Library, University of Oxford, April, 1999.

11. Special Collections and Archives, University of Aberdeen, Aberdeen, Scotland, <http://www.abdn.ac.uk/library/introduction.html>

12. "Archives at the Millennium, Library Association Response," Library Association, The Association for Librarians and Information Managers, March, 1998, <http://www.la-hq.org.uk/directory/prof_issues/aatm.html>

13. The Society of Archivists, The Society of Archivists Panels, <http://www.archives.org.uk/panels.html>

14. The Patent Office, Copyright, <http://www.patent.gov.uk/dpolicy/index.html>

15. JISC Senior Management Briefing Paper 5: Copyright, <http://www.jisc.ac.uk/pub98/sm05_copyright.html>

16. Joint Information Systems Committee, "Copyright Clearance and Digitisation in UK Higher Education," <http://www.ukoln.ac.uk/services/elib/papers/pa/clearance>

17. Oppenheim, Charles. "Moral Rights and the Electronic Library," Ariadne, The Web Version, Issue 4, July, 1996. <http://www.ariadne.ac.uk/issue4/copyright/intro.html>

18. National Preservation Office (NPO), "Digital Remit," <http://www.bl.uk/services/preservation/digital.html>

19. The CEDARS Project, CURL Exemplars in Digital Archives, <http://www.leeds.ac.uk/cedars/>

20. "Preserving Digital Information," Report of the Task Force on Archiving of Digital Information commissioned by The Commission on Preservation and Access and The Research Libraries Group, 1996.

21. "Bodleian Library/Toyota City Imaging Project," <http://www.bodley.ox.ac.uk/toyota/>

22. Interview, Julie Anne Lambert, Supervisor of the John Johnson Collection, Bodleian Library, Oxford, May, 1998.

23. Interview with Richard Gartner, Pearson New Media Librarian, Bodleian Library, Oxford, May, 1998.

24. Gartner, Richard. 1997. "Conservation by the Numbers Three Years On: An Update on Digital Imaging at Oxford," Microform & Imaging Review, 26 (4): 147-151.

25. "Early Manuscripts at Oxford University," <http://image.ox.ac.uk/>

26. Interview with David Cooper, Project Officer, Early Manuscripts Project, Libraries Automation Service, Oxford University, May, 1998.

27. Ibid

28. Ibid

29. Williams, Jane, "Celtic Manuscripts Project, University of Oxford," unpublished report to the Institute for Learning and Research Technology, May, 1998.

30. Ibid

31. "Bodleian Library, University of Oxford Broadside Ballads Project," <http://www.bodley.ox.ac.uk/mh/ballads/>

32. Interview with Michael Heaney, Project Supervisor, Broadside Ballads Project, Oxford University, May, 1998.

33. Ibid

34. Ibid

35. Interview with a British special collections librarian involved in digitization, April, 1999.

36. From an interview with Chris Rusbridge by John MacColl, "Of Arms and the Man We Sing," Ariadne, The Web Version, Issue 18, December, 1998. <http://www.ariadne.ac.uk/issue18/rusbridge/>

37. Ibid

Laura Micham, Senior Archivist for Research Services Special Collections, Robert W. Woodruff Library, Emory University <lmicham@emory.edu> and

David Faulds, Catalog Librarian, Beinecke Library, Yale University <david.faulds@yale.edu>