Electronic Theses and Dissertations: Digitizing Scholarship for Its Own Sake
For most scholars, the graduate thesis or dissertation is the first major work of scholarship they produce. To make those works more readily available to other scholars, as well as to save money, many universities and libraries are now making digitized (or electronic) versions available. Following the lead of Virginia Tech, some institutions are beginning to require that all graduate theses and dissertations be submitted and published electronically; some go so far as to eliminate printed copies completely.
Electronic theses and dissertations, or ETDs, are defined as those theses and dissertations submitted, archived, or accessed primarily in electronic formats. That includes traditional word-processed (or typewritten and scanned) documents made available in Portable Document Format (PDF), as well as less-traditional hypertext and multimedia formats published electronically on CD-ROM or on the World Wide Web. In this paper we briefly look at the move toward making theses and dissertations available in electronic formats and discuss some of the proposals that have been advanced for dealing with problems of production, storage, and dissemination of those works.
The Move to ETDs
Many libraries are now digitizing information in an effort to preserve it and to make it more widely available. The Library of Congress's National Digital Library Project plans to digitize five million items by 2000, and many university, public, and private libraries worldwide are digitizing their collections as well. The Networked Digital Library of Theses and Dissertations (NDLTD), which began in 1996 at Virginia Tech and is funded by a grant from the U.S. Department of Education, is a collection focused specifically on digitized versions of theses, dissertations, and technical papers. The NDLTD reports that more than twenty universities around the world have become official contributing members of the Initiative within just the past year, and nearly twice that number have expressed interest or are taking steps to participate.
Traditional methods of archiving and storing theses and dissertations are inefficient and unwieldy. Many theses and dissertations lie moldering in library basements, with no efficient way for researchers to locate the information that may be contained in them. Further, the time and costs involved in procuring copies of those works may often be prohibitive. One effort to make those works more readily available has been to register them with University Microfilms International (UMI), a commercial service that, according to its Web site (1997), "publishes and archives dissertations and theses; sells copies on demand; and maintains the definitive bibliographic record for over 1.4 million doctoral dissertations and master's theses." Today, however, most theses and dissertations are available from UMI only on paper, microfiche, or microfilm, at prices ranging from $29.50 to $69.50, depending on the format. Copies ordered through UMI's Web site using a credit card arrive in no fewer than four days. Archiving theses and dissertations electronically can help to alleviate some of the problems involved in storage, and making full-text versions available either on the Web or as e-mail attachments would make access almost immediate. Electronic versions on disk, CD-ROM, or other digital electronic media could be cheaper as well. Universities may opt to publish theses and dissertations produced at their own institutions on the World Wide Web, or individual scholars might publish their own works on the Web, thus allowing free access to full texts. Even if libraries decide not to offer the text of the ETD free, they could help scholars by allowing full-text searching. That would allow researchers to be sure that the documents they order or download actually contain the information they seek.
"Many theses and dissertations lie mouldering in library basements, with no efficient way for researchers to locate the information that may be contained in them."
Writing with new technologies has already become more than just plain text; many writers are beginning to take advantage of the flexibility offered by new technology to include multimedia elements such as hypertext links, video and audio, and interactive elements in their electronic publications. And many students want the freedom to experiment with these new forms. One particularly innovative example of a work that would have been impossible without new electronic technologies is the University of Arkansas Online Writing Center, designed by Paula Puffer in fulfillment of her master's thesis requirement in English. That writing center allows students worldwide to submit writing assignments electronically to tutors who will evaluate and critique them online.
Although some schools have allowed students to produce theses and dissertations in non-traditional forms, most still require that students submit them on paper as well, thus either precluding the use of multimedia or hypertextual elements altogether, or else forcing students to, in effect, produce two entirely different works for two entirely different media. As Daniel Eisenberg, assistant to the dean for information technology at Northern Arizona University, noted in a listserv discussion on ETDs,
I had a student who wanted to do a digital edition of a text for a thesis but the university insisted on a paper copy of the digital edition. . . . Almost every dissertation now, in the U.S., is done in some type of digital format. These disks are erased and recycled, or sit in the hands of the new Ph.D. This is a waste of a resource that future generations may well take us to task for. . . . (Eisenberg 1997)
Some scholars fear that as new definitions of "writing" are developed, writers may be seduced into using new technologies to produce documents that offer more sensory appeal than substance. Most ETDs, however, are still produced for print; multimedia applications are used only to enhance or supplement the arguments. One reason is that graduate departments and committees remain fairly conservative, so even those scholars who take advantage of electronic formats, including multimedia, hypertext (or hypermedia), and virtual reality components, produce theses and dissertations that will be approved by their committees and graduate departments. Simply allowing or requiring that theses and dissertations be submitted electronically, therefore, is unlikely to effect radical change in the substance or the form of those works in the near future.
As electronic technologies develop, however, reading and writing may also change in as-yet-unforeseeable ways. We don't know how much new technologies will change our conception of scholarship. Only by allowing graduate students and their committees the flexibility to experiment with new forms, and by developing guidelines that can sustain change, will we find out.
More Problems to Consider
There are, of course, problems as well as opportunities in allowing or requiring electronic submission, distribution, and archiving of theses and dissertations, including problems of access and distribution, archiving and storage, and copyright and publication issues.
Access and Distribution
In order to access ETDs, people need a computer. And, since reading lengthy works online is still a formidable task unless you have very good eyes, access to high-speed printers (or plenty of time) may also be necessary. Printing large documents, moreover, can still be a costly venture. The cost of paper and toner can add up, and paying a commercial print shop to print out an ETD file can be as costly as (or perhaps more costly than) ordering a printed copy through UMI. Further, ETDs that make heavy use of audio and video may require faster processing speeds and expensive software or hardware. To serve those without Internet access, it may be necessary for librarians or archivists to produce copies on disk or CD-ROM, which could require purchase of high-speed drives to facilitate duplication. Mailing disks or CD-ROM copies, too, takes time, the same amount of time now required to send paper copies. And interactive ETDs (such as the Online Writing Center) cannot easily be converted to paper without subverting the very nature of the author's intentions.
"The move from paper to electronic versions of theses and dissertations has been possible only through expenditures of time and/or money on the part of library and information sciences programs."
Currently, in order to access the full text of a print thesis or dissertation, researchers need to procure it from the library of the university where it was produced, either in person or through interlibrary loan. Some schools, however, do not participate in interlibrary loan, forcing some researchers to travel great distances to access those scholarly works. Where dissertations and theses are archived by UMI, researchers can buy them in print, microfiche, or microfilm formats for a fee.
Electronic publication of theses and dissertations can make access and distribution faster and less expensive for most scholars. NDLTD, for example, makes theses and dissertations available free on the Web, and many libraries and universities offer computer access to the World Wide Web. Most universities also provide printing. As projects like the National Digital Library Project make more information available online, it is likely that the trend toward providing faster, cheaper, and easier public access to ETDs will continue.
ETDs can help to make information more readily available to scholars and researchers by allowing quicker and more thorough search capabilities. For example, the University of Virginia has begun testing and adapting a distributed storage and retrieval system [formerly http://www.ncstrl.org/Dienst/htdocs/Info/ncstrl.html] developed by Jim Davis of Xerox and Carl Lagoze of Cornell University that will allow researchers to use the Web to browse the entire Networked Digital Library of Theses and Dissertations by author, subject, keyword, department, or year of publication. That system will also allow users to search and retrieve chapters or sections of a thesis or dissertation to home in on specific sections that are of interest to them. The NDLTD continues to add to its current base of both participating schools and documents, with more than five hundred selections now available at Virginia Tech alone. Many scholars are already taking advantage of the service: During the 1996-97 academic year at Virginia Tech, ETDs were accessed "almost two orders of magnitude more than the number of circulations of the library copy" (Fox et al. 1997).
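To make the idea of browsing by field and retrieving single sections concrete, here is a minimal Python sketch. It is not the Davis and Lagoze system; the `ETDRecord` class, its field names, and the sample records are all hypothetical and exist only to illustrate the kind of lookup described above.

```python
from dataclasses import dataclass, field


@dataclass
class ETDRecord:
    """Hypothetical catalog entry; the field names are illustrative only."""
    author: str
    title: str
    department: str
    year: int
    keywords: list = field(default_factory=list)
    # Section title -> section text, so one chapter can be fetched on its own.
    sections: dict = field(default_factory=dict)


def browse(records, **criteria):
    """Return the records that match every given criterion.

    A 'keyword' criterion matches case-insensitively against the record's
    keyword list; any other criterion is compared with the attribute of
    the same name.
    """
    matches = []
    for record in records:
        ok = True
        for name, wanted in criteria.items():
            if name == "keyword":
                ok = wanted.lower() in [k.lower() for k in record.keywords]
            else:
                ok = getattr(record, name) == wanted
            if not ok:
                break
        if ok:
            matches.append(record)
    return matches


def get_section(record, section_title):
    """Retrieve a single chapter or section instead of the whole document."""
    return record.sections.get(section_title)


# Invented sample data.
catalog = [
    ETDRecord("A. Author", "Sample Thesis One", "English", 1996,
              ["hypertext", "composition"],
              {"Chapter 1": "Introduction text ...",
               "Chapter 2": "Methods text ..."}),
    ETDRecord("B. Author", "Sample Thesis Two", "Physics", 1997,
              ["optics"], {"Chapter 1": "Background text ..."}),
]

hits = browse(catalog, department="English", keyword="Hypertext")
print([r.title for r in hits])
print(get_section(hits[0], "Chapter 2"))
```

A real distributed system would also have to locate records held at many institutions; the sketch covers only the lookup on a single list.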
Archiving and Storage
In all of the universities participating in the ETDs Initiative, libraries are responsible for maintaining accessibility by ensuring that files produced by outmoded or obsolete applications are translated into newer media as necessary. That has resulted in the creation of new positions or added responsibilities for many library staff members and administrators. The move from paper to electronic versions of theses and dissertations has been possible only through expenditures of time and money on the part of library and information sciences programs. That need for more resources, however, is not unique to ETDs. Libraries are devoting resources to digitizing all kinds of information, not only ETDs, but traditionally published works as well. While some fear that all the work being done to digitize information will be lost with the next major change in technology, in fact, software publishers in recent years have been careful to assure that newer versions of software usually accommodate files produced by older versions. Thus it is not likely that changes in technology will make ETDs inaccessible.
ETDs are easily backed up, so the risk of losing information is minimal. ETDs stored electronically are less likely to be damaged than their print counterparts, since they have no physical form to yellow and decay with age, and since loaning out a copy does not include relinquishing the original. And advances in technology have made possible increases in electronic storage capacity (such as advances in file compression technology and the availability of larger hard drives) that substantially lower costs. The storage potential of libraries may increase exponentially. IBM recently donated a server with four terabytes of hierarchical storage (about 4,000 gigabytes) to the Virginia Tech pilot project on ETDs, "enough for about 40 million average-sized ETDs" (Fox et al. 1997). That one server could accommodate all existing theses and dissertations worldwide in just a fraction of its storage capacity.
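A quick check of the figures in this paragraph shows what the quoted estimate implies. The short Python calculation below assumes decimal units (a terabyte as 10^12 bytes and a gigabyte as 10^9 bytes); that convention is an assumption, not something stated in the source.

```python
# Figures taken from the paragraph above; decimal units are an assumption.
TERABYTE = 10**12
GIGABYTE = 10**9

capacity_bytes = 4 * TERABYTE        # "four terabytes of hierarchical storage"
quoted_etd_count = 40_000_000        # "about 40 million average-sized ETDs"

capacity_gb = capacity_bytes // GIGABYTE
implied_avg_bytes = capacity_bytes // quoted_etd_count

print(f"Capacity: {capacity_gb:,} GB")                                # Capacity: 4,000 GB
print(f"Implied average ETD size: {implied_avg_bytes // 1000} KB")    # 100 KB
```

On these assumptions, the quoted estimate corresponds to an average ETD of roughly 100 kilobytes, a figure that fits compressed text far better than documents with heavy audio or video content.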
Copyright and Publication
Publishers are concerned about the relationship of ETDs to other forms of publication. Often a dissertation becomes the basis for a scholar's first book. While most of those works are considerably revised for publication, some are published with relatively few changes. Even though paper theses and dissertations are available, most academic presses are not greatly concerned that they represent prior publication, probably because of the barriers of time, distance, and cost. However, the prospect of having full texts available on the World Wide Web, given that the market for scholarly books is very small, may worry some publishers. On the other hand, greater access might be seen as a way to induce readers to preview a book. According to a recent issue of The Chronicle of Higher Education (Winkler 1997), some academic publishers consider online publication to be "great advertising": "For each of our electronic books, we've approximately doubled our sales," says Marney Smyth, electronic-productions editor of the MIT Press. "The plain fact is that no one is going to sit there and read a whole book on line. And it costs money and time to download it."
The National Academy Press has already put nearly 2,000 of its books online, and has found that the electronic publication of some books has boosted sales of paper copies often by as much as two to three times previous levels.
Universities in other countries are also looking at copyright as they move to put theses and dissertations in electronic formats and publish them online. Kerstin Olofsson, head of the Teacher Education Library at Umeå universitetsbibliotek in Sweden, writes,
[T]he copyright issues are the most complicated part of the project. I guess you have the same problem in the States, that the author of the thesis or dissertation also sells the rights to it to a commercial publisher as well. So you would have to negotiate with every publisher for each commercially published thesis. Then you have the problem with the other type of dissertation, mostly in science and medicine, which usually are made up of articles already given to scholarly journals. (Olofsson 1997)
To address potential conflicts over copyright, Virginia Tech has established a system where access to an ETD can be delayed temporarily to allow an article or book to be the only source of the author's material. Holding back electronic publication of an ETD can allow the paper publication to come out first. In addition, access to an ETD from outside the author's university can be blocked, ensuring the economic incentives required by many publishers (Fox et al. 1997). Those solutions, while not entirely foolproof, nonetheless offer some protection for both authors and publishers concerned with the risks of electronic access to ETDs. Another concern is the use of copyrighted material in an ETD. Scholars have sometimes included graphics and other copyrighted material in their theses and dissertations without acquiring permissions unless the work was accepted for commercial publication. If ETDs are published on the Web, however, authors will need to ensure compliance with copyright law and fair-use guidelines. That may include acquiring permission to use copyrighted material, which can sometimes be costly. Although UMI and other services have long made theses and dissertations available to the public, the access was limited enough that inclusion of copyrighted materials does not seem to have been an issue in most cases. However, copyright issues and fair-use guidelines are being debated hotly in light of the explosion of electronic publishing. (See, for example, the list of pending copyright legislation at http://www.copyright.gov/legislation.) ETD authors must consider the impact of that debate on their ability to use copyrighted materials.
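The two access controls attributed to Virginia Tech above, a temporary delay and a block on off-campus readers, can be sketched as a simple policy check. This Python sketch is illustrative only: `AccessPolicy`, `can_access`, and the dates are hypothetical names and values, not Virginia Tech's actual implementation, and it assumes that a delayed ETD is withheld from all readers until its release date.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class AccessPolicy:
    """Hypothetical per-ETD policy; not Virginia Tech's actual data model."""
    release_date: Optional[date] = None  # None means access is not delayed
    campus_only: bool = False            # True blocks readers outside the university


def can_access(policy, today, on_campus):
    """Return True if a reader may open the ETD under the given policy."""
    # Assumption: during the delay, nobody can read the ETD.
    if policy.release_date is not None and today < policy.release_date:
        return False
    if policy.campus_only and not on_campus:
        return False
    return True


# Invented example: a one-year delay followed by campus-only access.
policy = AccessPolicy(release_date=date(1998, 9, 1), campus_only=True)

print(can_access(policy, date(1998, 1, 15), on_campus=True))    # False: still delayed
print(can_access(policy, date(1998, 10, 1), on_campus=False))   # False: off campus
print(can_access(policy, date(1998, 10, 1), on_campus=True))    # True
```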
"Most students already prepare their theses or dissertations electronically, using computers and word-processing software."
Formats for ETDs
Most students already prepare their theses or dissertations electronically, using computers and word-processing software. Formats proposed to make ETDs easily viewable through different platforms include the use of PDF (or Portable Document Format) files created with Adobe Distiller. That software creates an exact, digitized picture of a document, page by page, including any graphics and fonts. The file can then be downloaded and viewed using the Adobe Reader, available free on the Web. PDF documents retain all formatting and graphics and also allow the author to include links to other sites on the Web or annotations within the article. In addition, PDF files can be indexed easily and searched by keywords specifically chosen by the author or indexer. PDF documents available on the Web may also be searchable using words or phrases found anywhere within the document, thus greatly facilitating a researcher's task. For larger and more complex documents, Virginia Tech encourages submission of ETDs in LaTeX or TeX format. LaTeX and TeX are device-independent document formatting systems that can use PostScript fonts. They are particularly useful for formatting complex mathematical equations in electronic documents. Those files are then converted to PDF format.
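The full-text searching mentioned above can be illustrated with a small inverted index, which maps each word to the documents that contain it. The Python sketch below is only an illustration, not how Adobe's tools or any library catalog actually work; it assumes the text of each document has already been extracted, and the document identifiers and sample sentences are invented.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Split text into lowercase words made of letters and digits."""
    return re.findall(r"[a-z0-9]+", text.lower())


def build_index(documents):
    """Map each word to the set of document ids whose text contains it."""
    index = defaultdict(set)
    for doc_id, text in documents.items():
        for word in tokenize(text):
            index[word].add(doc_id)
    return index


def search(index, query):
    """Return the ids of documents containing every word in the query."""
    words = tokenize(query)
    if not words:
        return set()
    result = set(index.get(words[0], set()))
    for word in words[1:]:
        result &= index.get(word, set())
    return result


# Invented sample texts standing in for extracted ETD content.
documents = {
    "etd-001": "Hypertext and composition in the writing center",
    "etd-002": "Laser optics and the composition of thin films",
}
index = build_index(documents)

print(sorted(search(index, "composition")))            # ['etd-001', 'etd-002']
print(sorted(search(index, "hypertext composition")))  # ['etd-001']
```

This version matches documents that contain all of the query words anywhere in the text; matching an exact phrase would also require storing word positions.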
ETD-ML, or Electronic Thesis and Dissertation Markup Language, is another format used by Virginia Tech for submission of ETDs. ETD-ML is actually an application of SGML, or Standard Generalized Markup Language, which uses tags to mark the structural elements of a document. Hypertext Markup Language (HTML), the language of Web documents, is another application of SGML that uses similar specialized tags. SGML allows for the "exchange of information at any level of complexity among software, hardware, storage and presentation systems (including database management and publishing applications) without regard to the manufacturer's name on the label" (SoftQuad). That makes it portable across platforms. The strength of SGML for publishers is its use of document-type definitions, or DTDs, which allow a publisher to specify exactly how a document is structured by defining the tags permitted in that particular document type. The Electronic Text Center at the University of Virginia already makes available thousands of text files marked up with SGML, which are automatically converted to HTML when accessed by the user so they can be read online with a Web browser such as Netscape.
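The on-the-fly conversion from structural markup to HTML described above can be sketched in a few lines of Python. Several assumptions apply: the element names (`thesis`, `chapter`, `para`, and so on) are hypothetical and are not the actual ETD-ML tag set, the tag-to-HTML mapping is invented, and the sample is written as well-formed XML, a simplified profile of SGML, so that Python's standard library can parse it.

```python
import xml.etree.ElementTree as ET
from html import escape

# Invented sample document using hypothetical structural tags.
SAMPLE = """<thesis>
  <title>Sample Thesis</title>
  <author>A. Author</author>
  <chapter>
    <heading>Introduction</heading>
    <para>Markup describes structure &amp; leaves display to the browser.</para>
  </chapter>
</thesis>"""

# Hypothetical mapping from structural tags to HTML tags.
TAG_MAP = {"title": "h1", "author": "address", "heading": "h2", "para": "p"}


def to_html(xml_text):
    """Convert the mapped elements to HTML lines, in document order."""
    root = ET.fromstring(xml_text)
    lines = []
    for element in root.iter():
        html_tag = TAG_MAP.get(element.tag)
        if html_tag is not None:
            # Re-escape characters such as & that the parser decoded.
            text = escape((element.text or "").strip(), quote=False)
            lines.append(f"<{html_tag}>{text}</{html_tag}>")
    return "\n".join(lines)


html_out = to_html(SAMPLE)
print(html_out)
```

Because the source file records only what each element is, the same file could be mapped to a different presentation simply by changing the table, which is the portability that the SoftQuad quotation describes.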
ETDs are still placed with UMI, giving researchers the option of accessing them in print or microform. While interactive ETDs cannot easily be reproduced with print and microform, those options ensure access to many important works by a wide variety of researchers. UMI now makes electronic versions of journals and other publications available online and via other electronic means, and it is likely that the company will one day provide full-text copies of ETDs as well.
Like print, electronic publication has limitations:
- PDF files can be created and read only with Adobe software. If the Adobe Reader becomes the standard for publication of ETDs, there is no guarantee that Adobe will continue to offer it free. That could mean either that scholars will again face prohibitive costs or that libraries will have to pursue other means of making ETDs freely available.
- Learning to use markup languages such as SGML adds layers of complexity to the already complex task of producing scholarship.
- Access to technology is still limited, which restricts the availability of ETDs to some scholars.
- The costs of gearing up, including the costs of training scholars, researchers, and staff to implement the ETD initiative, are substantial.
- Theses such as Paula Puffer's are interactive and cannot be accessed without a computer, modem, and Internet connection. Many scholars may want to include interactive components, CGI-scripting elements (such as HyperNews forums), external links, virtual-reality components (e.g., live MOO conversations), and other elements that are becoming common on Web sites, elements that are not easily archived because they may change with each "reading."
- Access to hardware and software, access to telephone connections, and knowledge of protocols can limit access to important information if it is available only online.
While electronic publication can make works more accessible to students, researchers, and others who lack the time, search capabilities, finances, or other resources to locate them in traditional print formats, the system works only if they have access to the necessary computer resources and know-how.
As we move beyond thinking of scholarship as print-based, however, we need to consider how we make our works of scholarship available. We need to consider how we can foster scholarship that is innovative as well as substantive.
Print forms also have limitations: they cannot include multimedia elements, they cannot include interactive elements, and accessing them through interlibrary loans or repositories such as UMI can be time consuming, expensive, and limiting. Just as the invention of the printing press wrought changes in how scholarship was produced and disseminated, technological innovations are having an impact on our conceptions of reading, writing, research, and publication. Electronic theses and dissertations are only one small part of the move to make information available through electronic means to as wide an audience as possible, and to allow scholars to continue to do what they have always done: participate in the creation of knowledge.
Christian Weisser teaches professional writing, computer-assisted composition, and computer-assisted technical writing at the University of South Florida (USF). He is currently co-editing a book on Electronic Theses and Dissertations. He is a member of the USF Task Force on ETDs, and has delivered presentations on computers and writing at several international conferences. Christian has authored or co-authored several publications in the field of Rhetoric and Composition.
Janice R. Walker is the Coordinator for the Computers and Writing Program at the University of South Florida where she teaches courses in Composition, Professional Writing, Technical Writing, and Expository Writing in the multimedia computer classroom. She has authored several articles on electronic research and documentation, scholarly publishing, and copyright issues. Her book, The Columbia Guide to Online Style (co-authored with Todd Taylor), is scheduled for release in 1998. She is currently working on an article on copyright issues for an upcoming special issue of Computers and Composition Journal.
Eisenberg, Daniel. 1997. "Re: Electronic Dissertations." Digital Libraries Research Mailing List. DIGLIB@INFOSERV.NLC-BNC.CA. (23 January). Also at [formerly http://www.cas.usf.edu/english/walker/papers/etds/etdsmail.html] (1 November).
"ETD Initiative Homepage." 1997. http://etd.vt.edu/ (10 October).
Fox, Edward A., et al. 1996. "National Digital Library of Theses and Dissertations: A Scalable and Sustainable Approach to Unlock University Resources." D-Lib Magazine (September). Also at http://www.dlib.org/dlib/september96/theses/09fox.html (18 July 1997).
Fox, Edward A., et al. 1997. "Networked Digital Library of Theses and Dissertations: An International Effort Unlocking University Resources." D-Lib Magazine (September). Also at http://www.dlib.org/dlib/september97/theses/09fox.html (10 October).
Kipp, Neill A. 1997. "Document Type Definition for Electronic Theses and Dissertations." http://etd.vt.edu/etd/etd-ml/dtdetds.htm (1 November).
Library of Congress. 1997. "National Digital Library Program." http://rs6.loc.gov/ammem/dli2/html/lcndlp.html (25 October).
National Digital Library of Theses and Dissertations. 1997. "NDLTD Project." http://www.ndltd.org/index.htm (1 November).
Networked Computer Science Technical Reports Library. 1997. "A Brief Description of NCSTRL." [formerly http://www.ncstrl.org/Dienst/htdocs/Info/ncstrl.html] (25 October).
Olofsson, Kerstin. 1997. "Citing (off-list)." (24 January). http://www.cas.usf.edu/english/walker/papers/etds/etdsmail.html (27 October).
Puffer, Paula. 1997. "Electronic Thesis." http://www.cas.usf.edu/english/walker/papers/etds/etdsmail.html (27 October).
Savage, William. 1997. "1986-96 Publishing Stats." Personal e-mail. (21 October).
SoftQuad, Inc. 1995. "The SGML Primer." http://www.sq.com/sgmlinfo/primbody.html (9 November 1997). [Editor's note: Link removed August 2001 because article no longer exists.]
University Microfilms International. 1997. "UMI's Online Dissertation Services." http://www.umi.com/hp/Products/Dissertations.html (9 November).
Winkler, Karen J. 1997. "Academic Presses Look to the Internet to Save Scholarly Monographs." The Chronicle of Higher Education (12 September): A18.