Economics and Usage of Digital Libraries: Byting the Bullet
20. Economics and Usage of a Corporate Digital Library
This paper analyses the usage of the journals and books available at the BT Digital Library, looking in detail at the usage by the 3,500 people at BT's development site at Adastral Park, near Ipswich, and the impact of this usage on purchase decisions in the library.
British Telecommunications (BT) is a leading provider of telecommunications services. Its main products and services are providing local, long distance, and international calls; providing telephone lines, equipment, and private circuits for homes and businesses; providing and managing private networks; and supplying mobile communications services.
BT's library is organisationally located in a unit called Advanced Communications Engineering (ACE). ACE's 3,500 people mainly work in software and telecoms engineering. ACE develops advanced communication technologies for the companies across the BT Group and for selected other businesses.
ACE is headquartered at Adastral Park, Martlesham, in the East of England. It is the centre of technical expertise for the BT Group and works with the company's businesses worldwide to help them deliver new products and services to their customers and build infrastructure for their future. Its reputation was established with BT's pioneering work in the field of optical communications.
While the majority of ACE's people are based at Adastral Park, significant numbers are located in offices worldwide, including locations in Asia, continental Europe, and North America. They include many who are in the forefront of their specialist fields, leading the development of standards and new technologies in areas including multimedia, IP and data networks, mobile communications, network design and management, and business applications.
Like other companies in high-tech sectors, BT experiences major pressures on costs and products, requiring a challenging combination of cost-cutting and innovation to maintain competitiveness. These pressures have led to a change of direction in BT's research focus, moving from research in the optics area towards software and internet engineering.
20.2 The BT Library
Until the mid-1990s, BT's library had a large collection to meet the needs of these researchers, with more than 800 titles, 23 staff, and accommodation totalling 450 square meters. The library provided a full range of services including document delivery from its own collection, inter-library lending, and carrying out online searches for its users.
The pressures on BT have fed through to the library. Costs were increasing at a time when pressure to reduce overheads was insurmountable. The library's user community was changing as BT shifted its research efforts away from pure science into more directed research. As they moved into these new areas, users became much less inclined to visit the physical library, preferring the information they needed to be delivered to them.
In 1994, the library's management team realised it could no longer pursue the path of looking for incremental budget cuts and savings. Radical change was needed. Benefiting from an extensive study of the usage of library material and building on increasing confidence in the in-house enhancements to library automation systems, the library chose to fundamentally rethink the way it provided services to its users.
The library collection was cut to a core 250 journals that were heavily used by people visiting the library in person. Accommodation and staff were trimmed by 67%, with library staff being redeployed elsewhere in the company. Attention was focussed on establishing a digital library that provided users with access to content over the network through online journals backed up by commercial document delivery.
The provision of loans and photocopies was outsourced with the development of the BLADES system. BLADES accepts and validates user requests, tracks the progress of requests, and provides status reports for users and input for billing systems for those users who pay for requests (Broadmeadow, 1997). User requests are transmitted semi-automatically to the British Library for fulfilment, with the BL delivering photocopies and loans directly to the user. The substantial savings in journal subscriptions more than offset the cost of commercial document delivery. Part of the savings was due to a communications campaign that highlighted the real cost to BT of requesting a photocopy, thus reducing demand.
The BT Library provides over 800 online journals to its users, either loading them onto its own server or linking through to the publishers' or aggregators' servers. The Inspec and ABI/Inform databases act as gateways to these journals, using software developed by BT's knowledge management research team for searching, current awareness, and collaboration features. The databases are used to provide end-user searching and browsing, with the provision for users to save searches for selective dissemination of information (SDI) as the databases are updated each week. A table of contents service is also provided.
In the physical library, the librarian could readily see users when they were floundering in their search for information and discreetly offer assistance. In the digital library, users are relatively invisible to the librarian. The Digital Library is developing methods for understanding its users' behavior more effectively. Studies of user behavior are intended to highlight the server's problem areas, which can then be redesigned to make them easier to use, and to develop ways of automatically profiling users' interests and work areas. The unspoken purpose of this analysis is also to develop a compelling case showing how effectively the library supports BT's business processes.
Although no detailed study of the usage of the Digital Library has been undertaken, it is important to understand how users access the collection, what problems they find there, and what their usage patterns are. Data about where users are in the organisation are important because of the need to allocate costs and charges.
The Digital Library studies the library server's log files and uses a number of channels to encourage user feedback. The difficulties of log file analysis are well documented (Wright, 1999). The log files are huge and processing them can be time-consuming. Unless user authentication is required, logs record only machine addresses and not personal identifiers. Each server transaction is logged, so a user retrieving a page with five graphics is recorded in six lines in the log file. Users share machines at cybercafes, operate behind proxies, or use dynamic IP addresses, so that an IP address cannot readily be tied to a single user. Although Wright is describing the problems of log file analysis for servers on the internet, the Digital Library faces the same challenges. BT's intranet is large, and proxies and firewalls are installed between different portions of the network. Users share machines in public spaces or borrow colleagues' PCs. Dynamic allocation of IP addresses is used within the intranet to increase flexibility.
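The "six lines for one page" problem can be illustrated with a minimal sketch. It assumes the server writes the NCSA Common Log Format, which the chapter does not state, and the list of ignored file extensions is a hypothetical choice made for illustration. It is not the library's own code.

```python
import re
from typing import Iterable, Iterator, NamedTuple, Optional

# Assumption: NCSA Common Log Format. The chapter does not say which
# format BT's server actually used.
LOG_LINE = re.compile(
    r'^(?P<host>\S+) \S+ (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+)[^"]*" (?P<status>\d{3}) (?P<size>\S+)'
)

# Hypothetical list of "page furniture" that does not count as reading.
IGNORED_SUFFIXES = (".gif", ".jpg", ".jpeg", ".png", ".css", ".ico")


class Hit(NamedTuple):
    host: str            # machine address, not a person
    user: Optional[str]  # only present when authentication was required
    path: str
    status: int


def parse_line(line: str) -> Optional[Hit]:
    """Parse one log line, returning None if it does not match the format."""
    m = LOG_LINE.match(line)
    if m is None:
        return None
    user = m.group("user")
    return Hit(
        host=m.group("host"),
        user=None if user == "-" else user,
        path=m.group("path"),
        status=int(m.group("status")),
    )


def meaningful_hits(lines: Iterable[str]) -> Iterator[Hit]:
    """Yield successful requests for pages and documents, skipping graphics."""
    for line in lines:
        hit = parse_line(line)
        if hit is None or hit.status != 200:
            continue
        # Drop any query string before checking the file extension.
        path = hit.path.split("?", 1)[0].lower()
        if path.endswith(IGNORED_SUFFIXES):
            continue
        yield hit
```

Feeding this the six log lines produced by one page and its five graphics yields a single hit for the page itself. The `host` field remains only a machine address, so the caveats about proxies, shared machines, and dynamic IP addresses still apply.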
A number of packages are available to assist in log file analysis (Busch, 1997). These packages report statistics such as the total hits on the server; the number of Not Modifieds (304s), Redirects (302s), Not Founds (404s), and Server Errors (500s); the number of unique URLs served; the number of unique client hosts accessing the server; the total kilobytes transferred; the top one-second, one-minute, and one-hour periods; the most commonly accessed URLs; and the top five client hosts accessing the server. These deliver a higher level of management information than the librarian needs and are not used in the BT Library.
Wright describes techniques for grouping unidentified readers into "constituencies", based on their usage patterns (Wright, 1999). These constituencies, such as robot checkers, users checking the What's New page, new users, or demonstrators, are identified by analysing the server's log files and can then be used to observe navigation of the site and spot usability problems.
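A rule-based grouping of this kind might be sketched as follows. The rules, the page paths (`/whatsnew.html`, `/help/`), and the constituency labels are hypothetical stand-ins chosen for illustration. They are neither Wright's published criteria nor the rules used at BT.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical paths; a real site would substitute its own.
WHATS_NEW = "/whatsnew.html"
HELP_PREFIX = "/help/"


def group_paths_by_host(hits: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Collect the requested paths for each client host, in log order."""
    paths: Dict[str, List[str]] = defaultdict(list)
    for host, path in hits:
        paths[host].append(path)
    return dict(paths)


def classify(paths: List[str]) -> str:
    """Assign one host's requests to an illustrative constituency."""
    if "/robots.txt" in paths:
        return "robot checker"
    if paths and all(p == WHATS_NEW for p in paths):
        return "what's new checker"
    if any(p.startswith(HELP_PREFIX) for p in paths):
        return "new user"
    return "unclassified"


def constituencies(hits: Iterable[Tuple[str, str]]) -> Dict[str, str]:
    """Map each client host to its constituency label."""
    return {host: classify(p) for host, p in group_paths_by_host(hits).items()}
```

Once hosts are labelled, the navigation paths of a single constituency can be examined on their own, for example to see where hosts labelled as new users abandon a search.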
The BT Digital Library adapted Wright's ideas in its own analysis of its log files. The purpose of the analysis is:
to understand who is using the Digital Library,
to track individual usage of the library, to enable personalisation and collaborative filtering,
to understand which library resources are being used and the extent of that usage to inform renewal decisions,
to track which material is being requested through the document delivery system to suggest additions to the collection, and
to ensure that usage of material is within the licenses agreed upon with publishers.
Perl scripts are used to extract meaningful usage data from the server's log, concentrating on the HTML pages and PDF files read and the server's CGI scripts run, and ignoring less meaningful traces of usage. Accesses from robot checkers and from the library's own staff are excluded from the analysis. Weekly reports are prepared detailing:
the number of distinct IP addresses accessing the server (as a proxy for the number of individual users),
the number of users who logged into the server,
the number of searches done in each of the library databases,
the number of individuals searching each of the library databases,
the total number of online journal articles read from the library server,
the number of readers of the library's collection of online books,
the number of articles read from each of the online journals purchased through individual subscriptions,
the number of users accessing their SDI pages,
the number of users who subscribe to journals' tables of contents and the number of journals which have subscriptions to their tables of contents, and
the number of users annotating database records and the number of annotations made.
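A few of these weekly figures can be derived as in the sketch below. It is written in Python rather than the Perl the library used, and the `Event` record, its field names, and the `"search"` and `"article"` labels are assumptions made for illustration. It takes usage events already extracted from the log and covers only three of the report lines.

```python
from collections import defaultdict
from datetime import date
from typing import Dict, FrozenSet, Iterable, NamedTuple, Tuple


class Event(NamedTuple):
    day: date
    ip: str
    kind: str      # hypothetical labels: "search" or "article"
    resource: str  # database or journal name


def week_key(day: date) -> Tuple[int, int]:
    """Return the (ISO year, ISO week number) containing the given day."""
    iso = day.isocalendar()
    return (iso[0], iso[1])


def weekly_report(
    events: Iterable[Event],
    excluded_ips: FrozenSet[str] = frozenset(),
) -> Dict[Tuple[int, int], dict]:
    """Summarise usage per week, skipping robot and library staff addresses."""
    distinct_ips = defaultdict(set)
    searches = defaultdict(lambda: defaultdict(int))
    articles = defaultdict(int)
    for event in events:
        if event.ip in excluded_ips:
            continue
        week = week_key(event.day)
        distinct_ips[week].add(event.ip)
        if event.kind == "search":
            searches[week][event.resource] += 1
        elif event.kind == "article":
            articles[week] += 1
    return {
        week: {
            "distinct_ips": len(ips),            # proxy for individual users
            "searches": dict(searches[week]),    # searches per database
            "articles_read": articles[week],     # total online articles read
        }
        for week, ips in distinct_ips.items()
    }
```

Counting distinct addresses in a set, rather than counting log lines, is what makes the first figure usable as a proxy for the number of individual users.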
20.4 Qualitative feedback
The Digital Library strenuously encourages user feedback, although it has not yet carried out formal user surveys. Less formal methods of obtaining feedback are used, such as user meetings and publicity events, e-mailing users when they have had their password reset, and user feedback links on the server.
20.5 Usage of the Digital Library
The BT Library offers 800 online journals to its user community, within the limits of the publishers' agreements. The collection of online journals is supplemented by a disintermediated document delivery service providing what might be called near-online journals. Articles not available online on the Library's server can be requested in a straightforward way and are usually delivered in two to three days.
A noticeable impact of the move from the physical to the digital library is in the distribution of the user base. In 1994 the library served the research community almost solely. In spite of current awareness bulletins, which were distributed throughout the company, and a document delivery service to supplement this, approximately 90% of the library's usage was from the Adastral Park site. By the end of 1998, when ACM, IEE, and IEEE journals became available on the server in addition to the in-house journals and a selection of titles from Elsevier, this figure had gone down to 61%. In 1999 the Library's collection was enhanced with the addition of material from ABI/Inform. Since then, the balance has shifted so that only 40% of users come from BT's Adastral Park site.
As part of this study, the usage of the 3,500 potential users at Adastral Park was examined. These users are readily traceable, because they use relatively static IP addressing, allowing a more detailed study of individual usage.
In 1999, 1,091 users from BT ACE read 9,108 journal articles from the digital library 12,919 times. (These figures exclude journals from the ACM Digital Library and from the selection of other journals available only on the publishers' sites, where usage data is not available.) In comparison, the library had 1,500 users registered for access to the physical library and lent fewer than 8,000 documents in the same period.
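The 1999 figures imply roughly 12 reads per user and about 1.4 reads per distinct article. A short calculation, using only the numbers quoted above, makes the arithmetic explicit. The variable names are illustrative.

```python
# Figures quoted in the text for BT ACE users in 1999.
users = 1_091
distinct_articles = 9_108
total_reads = 12_919

# Average number of article reads per user over the year.
reads_per_user = total_reads / users

# Average number of times each distinct article was read.
reads_per_article = total_reads / distinct_articles

print(f"reads per user: {reads_per_user:.1f}")
print(f"reads per distinct article: {reads_per_article:.2f}")
```

The low reads-per-article ratio suggests that most articles were read only once or twice, so usage is spread thinly over many articles rather than concentrated on a few.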
The IEEE's IEL product offers IEE and IEEE journals and conference proceedings in Adobe Acrobat format. These are received monthly and loaded onto the library server, so that data on usage are available for study. In the BT implementation, user registration is not necessary to access these publications, so user analysis is limited to studying access by IP address. Data on usage are available from November 1998, which is too short a period to do more than speculate on seasonal variations beyond noticing the obvious drop in readership at the Christmas/New Year period.
A mean of 39 users read IEL papers each week, with a minimum of 7 and a maximum of 56. These users read a mean of 11 papers each, with a minimum of 4 and a maximum of 38.
The IEL collection offers a typical usage pattern, with 80% of journal usage concentrated on 21% of the titles in the package.
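A concentration figure of this kind can be computed from per-title read counts as in the following sketch. The function name and the sample counts in the usage note are made up for illustration and are not the library's data.

```python
from typing import Mapping


def title_share_for_usage(
    reads_per_title: Mapping[str, int],
    usage_share: float = 0.8,
) -> float:
    """Return the fraction of titles, taken from most to least read,
    needed to account for at least `usage_share` of all reads."""
    counts = sorted(reads_per_title.values(), reverse=True)
    total = sum(counts)
    if not counts or total == 0:
        raise ValueError("no usage recorded")
    running = 0
    for n, count in enumerate(counts, start=1):
        running += count
        if running >= usage_share * total:
            return n / len(counts)
    return 1.0
```

For example, with five hypothetical titles read 60, 25, 5, 5, and 5 times, the two most-read titles account for 85 of 100 reads, so the function returns 0.4: 80% of usage falls on 40% of the titles.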
ABI/Inform offers a wide range of management and trade journals. Some of these, such as Harvard Business Review or trade journals in the telecoms area, are of key interest to the library's user base. ABI/Inform has been fully available through the Digital Library for two months, which restricts the possibility for usage analysis.
During this limited period, a mean of 97 users a week have read ABI/Inform papers, with a minimum of 25 and a maximum of 214. These people have read a mean of 3 papers each, with a range from 2 to 6 papers per user.
The Digital Library holds a collection of more than 20 Elsevier titles online. Unlike ACM, IEL, and ABI/Inform, which offer a package of publications, the Elsevier collection is based on the set of journals the library took in paper form. These were selected as being core journals, based on library usage and on the library's understanding of BT's research interests. Because access to these journals can be recorded more easily than access to those in paper format, usage patterns can be tracked more easily, advising the librarian which titles are no longer appropriate to hold in the library's collection. In addition, the library's document delivery system records the publishers of documents requested, allowing the librarian to monitor new titles for inclusion. Three titles are heavily used, but even these are not accessed at all in 30% of the weeks. BT's recent decision to stop research in the speech processing area is reflected by a sharp drop in accesses to journals in that area. In between these two extremes are the majority of journals, which are used sporadically. In spite of the additional data on how frequently these journals are used, collection management is still difficult because the masses of data now available make it challenging to extract meaningful information.
The Library has made a limited start to providing online books to its user base. Twenty-four computing books from O'Reilly on Perl, Java, Unix, and networking were made available in 1999. On average, 87 users a week access one of the O'Reilly books online, with the number of users ranging from 8 to 179 in any one week. These books, serving as reference books for problem-solving as well as textbooks, are ideal for online publishing. They have certainly produced the most positive unsolicited feedback.
The corporate librarian is pressed to demonstrate his or her value to the parent organisation, leading to efforts to reduce costs and improve library usage. In BT's case, these pressures resulted in outsourcing labour-intensive activities, such as document delivery, and in replacing paper-based publications with online versions. Susan Rosenblatt is reported as commenting that "available information drives patterns of usage" (Odlyzko, 1997a). BT's experience bears this out. Making more information available on the intranet increases library usage both by local users choosing online access in preference to using the library in person and by remote users who previously had no practical means of access.