|Author:||Deborah Lines Andersen|
|Title:||Benchmarks: Testing the Persistence of URLs|
|Publication Info:||Ann Arbor, MI: MPublishing, University of Michigan Library
This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact email@example.com for more information.
Benchmarks: Testing the Persistence of URLs
Deborah Lines Andersen
vol. 10, no. 1, February 2007
Benchmark: a standard by which something can be measured or judged. 
Before moving into the body of this column I would like to acknowledge a change in our editorial staff. With this issue Scott Merriman steps down as the editor of the electronic resources column. I would like to thank Scott for his six years of working on the journal as a column editor. Happily, Scott will continue in a different capacity as a member of the peer-review editorial team of the journal.
With Scott's stepping down I am pleased to welcome Jeremy Boggs as the new editor of the electronic resources column. Jeremy is at George Mason University, where he is a PhD student in U.S. History. He is a web developer for the Center for History and New Media there. His column this issue focuses on weblogs and carnivals in history.
A Research Question for the Journal
The subject of this "Benchmarks" surfaced as I was reviewing the papers and columns for this issue. I was particularly taken with the number of links to web sites that appeared in the papers. Luc Guay's article, "Les TIC transforment les pratiques pédagogiques," contains 9 URLs in both its French and English forms. Jessica Lacher-Feldman's article, "Publishers' Bindings Online, 1815-1930," contains 30. Links are no surprise in an online journal, but a question surfaced for me about how up-to-date and viable the links are throughout the entire run of journal issues starting in 1998. This turns out to be an important topic in both history and computing. Historically, we want our readers to be able to follow the original links that our authors included in their papers. The links can be followed only because individuals all over the world have taken the time to migrate their materials to new computer platforms while also preserving the original URLs for access. This is a computing issue.
Shakers and the World Wide Web
In order to explore this question I chose a paper I wrote for the journal in August 1999 entitled, "Heuristics for Educational Use and Evaluation of Electronic Information: A Case of Searching for Shaker History on the World Wide Web."  I selected this article first because I know its material, second because it is now seven years old, and finally because it contained 41 links to other materials on the World Wide Web. My research question was two-fold. First, how many of the links would still be functional? Second, if there were now dead links, would it be possible to find the same information through a keyword search of Google?
The larger question is the historical one: one of access. How well is the journal doing in this regard?
The 41 links can be divided into various application types. Table 1 presents these categories. Most astounding when looking at this table is that nearly 50 percent of the links are dead on location. This means that they either are no longer formatted as links (black rather than blue, with no hyperlink function) or are blue hyperlinks that point to a "page not found" statement. This does not necessarily mean that the sites no longer exist. A case in point is the University at Albany library site, listed as http://www.albany.edu/library/ in the paper. The Albany library is now located at http://library.albany.edu/.
A second case in point is the http://www.ziplink.net/~pcb/and/nat00121.htm link in the paper, which presents in black. If one copies this URL and pastes it into a browser, the Priscilla C. Butler paper, Origins of the Shakers: The Heresy of Mother Ann Lee (Vassar College Department of Religion, Senior Thesis, Fall Semester, 1982), is still there, but one has to jump through a few digital hoops to get to it.
A third case is exemplified by http://www.convergemag.com, in which the magazine Converge Online continues to exist but the paper in question (Jamie Murphy, 1999. "Technology Training for Faculty." Converge 2(3): 30-31) is no longer accessible on the website.
Next is the situation in which the link is extant (it points to the same title as the original URL) but the materials have been updated and are not the same as the ones that were originally referenced. An example of this problem exists at http://www.vuw.ac.nz/~agsmith/evaln/evaln.htm, Alastair Smith's "Evaluation of Information Resources" (The World Wide Web Virtual Library, updated 19 October 2006). The material is similar to that referenced in August 1999, but because of the update it does not list exactly the same sources.
There is also the case of a URL that exists but points to something entirely different than the original link. http://www.namss.org.uk/evaluate.htm used to be an evaluation site but as of 15 January 2007 was a site promoting distance education in the United Kingdom.
Finally, there are links that continue to return the content that was there in 1999. The Norwegian site, "Bombs and Babies: Childhood Memories from World War 2" is an example of a site that is extant and has not changed since originally referenced.
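The link states described above can be sketched as a small classification routine. This is only an illustration of the categories, not the procedure used for this column, which was carried out by hand. The `Observation` record, its field names, and the state labels are all hypothetical, and the `review` field stands in for a reader's judgment that no program can supply.

```python
from dataclasses import dataclass
from typing import Optional
from urllib.parse import urlsplit


@dataclass
class Observation:
    """One check of one cited URL. All field names are hypothetical."""
    cited_url: str
    status: Optional[int]            # HTTP status code; None if no server answered
    final_url: Optional[str] = None  # address reached after any redirects
    review: Optional[str] = None     # reader's judgment: "same", "updated", "different"


def host(url: str) -> str:
    """Return the lower-cased host name of a URL."""
    return urlsplit(url).netloc.lower()


def classify(obs: Observation) -> str:
    """Assign one of the link states discussed in the column."""
    # No answer, or an error such as 404 "page not found".
    if obs.status is None or obs.status >= 400:
        return "dead link"
    # The server answered, but from a different host than the one cited.
    if obs.final_url and host(obs.final_url) != host(obs.cited_url):
        return "moved to a new address"
    # The address still works; what it serves is a human judgment.
    if obs.review == "updated":
        return "extant, content updated"
    if obs.review == "different":
        return "extant, points to different material"
    if obs.review == "same":
        return "extant, unchanged"
    return "extant, not yet reviewed"


# Placeholder addresses, not results from the survey reported here.
print(classify(Observation("http://www.example.org/a.htm", 404)))
print(classify(Observation("http://www.example.org/b.htm", 200,
                           "http://library.example.org/b.htm")))
print(classify(Observation("http://www.example.org/c.htm", 200,
                           "http://www.example.org/c.htm", review="updated")))
```

Only the first two states can be detected automatically. Deciding whether a live page still holds the material an author cited requires someone to read it.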
|Table 1: Links Presented in "Heuristics..." article, JAHC 2(2): August 1999|
|Category||Number||Live Links||Dead Links|
|Magazines and journals||3||1||2|
|.Com, .Net and .Org sites||21||16||5|
*Items accessed on 15 January 2007
It is important to take a look at the major categories in the above list. Of the U.S. university library links, none continues to be available to our readers. This is surprising and disappointing. Universities do upgrade their servers on a regular basis. Students and faculty move on and take their materials offline. Researchers update their materials and create new URLs for them. Nonetheless, the function of references in a journal article is to make those references available to readers. The six university library links are no longer available at the addresses cited.
The other major category, .Com, .Net, and .Org sites, presents a rosier picture for our readers. About 76 percent (16 of the 21 links) remain live seven years after their original citations. Perhaps museums keep their old servers longer, or add to rather than change materials.
For the curious, the following text lists all 41 of the web sites that were referenced in the "Heuristics..." paper and gives a comment in brackets about the availability of each site. They are organized by the categories in Table 1.
All library sites cited in the online paper presented as dead links. There are two issues here. One is whether the university library still exists; it does in every case. The second is whether or not the cited materials continue to exist on the web site.
- http://www.albany.edu/library/ [dead link; link name has changed]
- http://www.albany.edu/library/internet/#search [dead link; item not found on site]
- http://www.nypl.org/research/chss/grd/resguides/shaker.html [dead on paper site but a pointer to http://www.nypl.org/research/chss/grd/resguides/shaker/ presents the materials, "Shakers and Shakerism: A Guide to the Collections of the New York Public Library"]
- http://www.dayton.lib.oh.us/~ads_elli/shakers.htm [dead link; item not found on site]
- http://thorplus.lib.purdue.edu/~techman/evaluate.htm [dead link; server unavailable]
- http://thorplus.lib.purdue.edu/rese...lasses/gs175/3gs175/evaluation.htm [dead link]
The single item here is a live link but has been updated since the 1999 citation in the original paper.
- http://www.vuw.ac.nz/~agsmith/evaln/evaln.htm [extant, updated 19 Oct 2006]
All of these items are dead links. It is possible that they exist, but one would have to do a web search in order to check their availability. As in the case of the University at Albany libraries, the information may still exist under a changed domain name.
- http://www.hudson.edu/hms.comp/evalweb/ [dead link]
- http://weber.u.washington.edu/~lib560/NETEVAL/index.html [dead link]
- http://wwwsc.library.unh.edu/specoll/exhibits/chrchill [dead link]
- http://www.albany.edu/sisp/student.html [dead link; individual student website no longer extant]
This is an extant link. The information is dated 1997.
- www.hist.uib.no/bomb/ [active site; same materials on oral history and bibliography]
AltaVista used but no link in paper [www.altavista.com exists on the web].
- http://www.electricmonk.com/ [now http://electricmonk.seeq.com - SEEQ]
- http://www.directhit.com/ [dead link; a Google search shows that this is now an engine that "measures what people are clicking on from the results at major search services from across the web. It can also determine how long people are spending at the sites they visit."]
Magazines and Journals
Of the two journals, the JAHC article is available. Converge Online is also available, but the 1999 articles do not appear when using its internal search engine. A Google search for the articles gives their authors and titles but no digital copies of the articles.
- http://www.convergemag.com [two instances; journal exists but cannot find article with "search."]
- http://mcel.pacificu.edu/jahc/jahcI1/Anderson/Anderson.HTML [live]
.Com, .Net, and .Org sites
Perhaps not surprisingly, only seven of the 21 commercial sites are no longer available to our readers. The museums and businesses referenced in 1999 continue to provide links to materials at the same site, under the same URL.
- http://www.shakerworkshops.com/19th_sm.htm [live: "I don't want to be remembered as a chair"]
- http://www.shakerworkshops.com/dirindex.htm [live: Shaker Workshops Online Catalog]
- http://www.hancockshakervillage.org/ [live: Hancock Shaker Village website]
- http://www.hancockshakervillage.org/old/shakers.html [live: "About the Shakers"]
- http://www.hancockshakervillage.org/brdsidlg.html [live: "the American Shakers" broadside]
- http://www.shakervillageky.org/ [live: Shaker Village of Pleasant Hill, KY]
- http://www.logantele.com/~shakmus/index.htm [live but with pointer to new address of www.shakermuseum.com for Shaker Museum at South Union, KY]
- http://www.logantele.com/~shakmus/othersites.htm [live although main URL has changed (see above); the other sites are still referenced on this old page]
- http://www.useit.com/papers/webwriting/writing.html [live article: "How to Write for the Web," by Morkes and Nielsen]
- http://www.crisny.org/not-for-profit/shakerwv/ [live: Shaker Heritage Society]
- http://www.llrx.com/columns/quality.htm [live: "Publishers Wanted, No Experience Necessary: Information Quality on the Web" by Genie Tyburski]
- http://www.shaker.lib.me.us/ [live; Sabbathday Lake Shaker Village]
- http://www.passtheword.org/SHAKER-MANUSCRIPTS/ [live: "Shaker Manuscripts On-line" last updated November 26, 2006]
- http://www.passtheword.org/SHAKER-MANUSCRIPTS/Shakers-Compendium/compndm.htm [live: "Shakers: Compendium," 1859]
- http://www.namss.org.uk/evaluate.htm [now points to a distance education site]
- http://www.regionnet.com/colberk.shakervillage.html [dead link]
- http://www.shakers.org/index.shtml [dead link]
- http://www.valley.net/~esm/ [dead link to a site with search engine]
- http://www.shakers.org/history.html [dead link]
- http://www.logantele.com/~shakmus/journal5a.htm [dead link; item not on site]
- http://www.logantele.com/~shakmus/journal5.htm [dead link; item not on site]
These two sites have URLs that are hard to decipher. Nonetheless, both are still live and point to the materials that were referenced in 1999. This again underlines the point that large institutions tend to reuse and update their sites while individuals seem more likely to maintain original content.
- http://www.ziplink.net/~pcb/and/nat00121.htm [dead on journal site but paper exists: Butler, "Origins of the Shakers"]
- http://www.enter.net/~schunsbe/shaker.html [live link to "A Brief History of the Shaker Oval Box"]
Thoughts and Lessons for the Journal
There are enormous policy and editorial issues that surface as a result of these findings. First, it is apparent that some URLs in the journal lose their status as live links over time, perhaps because of migrating the journal to new servers. If the editorial board were willing to spend time and effort, it might be possible to re-edit and correct these omissions on a regular basis.
Nonetheless, there are many other links that no longer point at the materials intended by the authors. What does the journal do with these? We could require that only links to online, peer-reviewed articles be included on our site, assuming that these articles will have long lives and be available to our readers. This is one end of the editorial policy spectrum. The other end of the spectrum would acknowledge that the World Wide Web is an extremely fluid environment, with information changing on a regular basis. We might want a disclaimer at the beginning of each issue that states that links will come and go. We might do a "link survey" on a regular basis (and on a grander scale than this one-article example) that would see how we are doing.
At present it is business as usual. Authors will use the best sources they can find, print or digital, and we will have to beg the forgiveness of our readers if these materials drift out of our range.
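A regular "link survey" of the kind suggested above could be partly automated. The following is a minimal sketch, assuming Python and its standard `urllib` library. The function names `fetch_status` and `survey` are hypothetical, and the choice to count any response below 400 as "live" is an assumption of this sketch, not a rule the journal has adopted.

```python
import urllib.error
import urllib.request
from collections import Counter
from typing import Callable, Dict, Iterable, Optional, Tuple


def fetch_status(url: str, timeout: float = 10.0) -> Optional[int]:
    """Return the HTTP status for url, or None if no response was received."""
    try:
        # HEAD asks for headers only; some servers refuse it, which this
        # sketch would (wrongly) count against the link.
        request = urllib.request.Request(url, method="HEAD")
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code   # the server answered with an error such as 404
    except (urllib.error.URLError, OSError, ValueError):
        return None       # unreachable server, timeout, or malformed URL


def survey(links: Iterable[Tuple[str, str]],
           fetch: Callable[[str], Optional[int]] = fetch_status
           ) -> Dict[str, Counter]:
    """Tally live and dead links per category, in the shape of Table 1."""
    table: Dict[str, Counter] = {}
    for category, url in links:
        status = fetch(url)
        state = "live" if status is not None and status < 400 else "dead"
        table.setdefault(category, Counter())[state] += 1
    return table


def report(table: Dict[str, Counter]) -> None:
    """Print one line per category: number, live links, dead links."""
    for category, counts in table.items():
        total = counts["live"] + counts["dead"]
        print(f"{category}: {total} links, "
              f"{counts['live']} live, {counts['dead']} dead")
```

Calling `report(survey(links))` with a list of `(category, url)` pairs would print a tally in the shape of Table 1. Such a tally records only whether an address answers. A link that now points to entirely different material would still be counted as live, so a reader's review of the results remains necessary.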
1. "Benchmark," American Heritage Dictionary, 4th ed., 2000.
2. Deborah Lines Andersen. "Heuristics for Educational Use and Evaluation of Electronic Information: A Case of Searching for Shaker History on the World Wide Web." Journal of the Association for History and Computing 2(2); August 1999. http://www.mcel.pacificu.edu/jahc/jahcii2/articlesii2/anderson2/anderson2.html