    17.4 Selection Criteria

    Since it is generally accepted that it will not be possible to digitize all journals that have ever been published, an important question for any digitization project is how to select the retrospective content to be made available electronically. In JSTOR a variety of factors are taken into consideration in the selection process, including surveys of faculty and library professionals in the field in question, library subscription levels, citation impact factor measures, and length of the run, among other things.

    Looking at JSTOR usage at the article level, it is evident that citations should not be used as the sole factor in determining what content should be digitized. To test whether citation frequency correlates with database usage, we conducted a preliminary analysis of the use of particular articles in JSTOR. First, we identified the ten most frequently used articles for each of the 117 journals in the database. We then looked up their citation data using ISI Social Science Citations. We found that usage and citation data were not correlated. To illustrate the point, Table 17.3 displays an abbreviated version of the data we collected. It shows the top three articles in terms of JSTOR use since 1997 (through March 20, 2000) for three Economics titles. The number of citations to each article in the period from 1997 to 1999 is displayed,[7] as is the average number of citations per year to each article for the period from 1972 through 1999.

    Table 17.3: JSTOR Usage — Economics Cluster
    Journal Title                      Number of Times Cited   Average cites/year   JSTOR views   Year of Publication
    American Economic Review
      Article 1                                           79                 24.1         1,670                  1968
      Article 2                                           77                 15.7         1,232                  1945
      Article 3                                          181                 35.9         1,316                  1981
    Quarterly Journal of Economics
      Article 1                                          175                 32.4         2,426                  1970
      Article 2                                          104                 26.6         2,400                  1992
      Article 3                                          216                 50.9         1,583                  1991
    Journal of Political Economy
      Article 1                                            4                  0.5         1,895                  1973
      Article 2                                            8                 21.1         1,480                  1990
      Article 3                                           93                 17.2         1,258                  1983
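
    The comparison described above can be sketched in a few lines using the nine rows of Table 17.3. This is only an illustration: the text does not say which correlation measure was used in the analysis, so the Spearman rank correlation here is an assumption, as are the names `TABLE_17_3`, `average_ranks`, `pearson`, and `spearman`. The full analysis covered the ten most used articles in each of 117 journals, so nine rows cannot reproduce its result.

```python
from statistics import mean

# Rows of Table 17.3: (journal, citations 1997-1999, JSTOR views).
TABLE_17_3 = [
    ("American Economic Review", 79, 1670),
    ("American Economic Review", 77, 1232),
    ("American Economic Review", 181, 1316),
    ("Quarterly Journal of Economics", 175, 2426),
    ("Quarterly Journal of Economics", 104, 2400),
    ("Quarterly Journal of Economics", 216, 1583),
    ("Journal of Political Economy", 4, 1895),
    ("Journal of Political Economy", 8, 1480),
    ("Journal of Political Economy", 93, 1258),
]


def average_ranks(values):
    """Rank values from 1 (smallest); tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


def spearman(xs, ys):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(xs), average_ranks(ys))


citations = [row[1] for row in TABLE_17_3]
views = [row[2] for row in TABLE_17_3]
rho = spearman(citations, views)
print(f"Spearman rank correlation, citations vs. views: {rho:.2f}")
```

    On these nine rows the coefficient comes out at roughly 0.12, a weak association that is in line with the finding reported above, although a sample this small is illustrative rather than conclusive.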

    Citations do not appear to provide anything like a complete picture of the potential usefulness of a journal article. The most notable example is the number one article for the Journal of Political Economy. This 1973 article has rarely been cited: only 4 times between 1997 and 1999, and an average of just 0.5 times per year between 1972 and 1999. Nevertheless, it has emerged as the most frequently used article from that journal, having been viewed 1,895 times and printed 1,402 times during the period that it has been accessible in JSTOR. This example reveals not only that citation data may not be the most useful measure for determining what should be digitized, but also that citations capture what might be called the "reference" or "documentation" value of an article, not its usefulness defined more broadly. Articles with only four citations may end up, for a variety of reasons, being the most used. Conversely, highly cited articles may not be used very often at all. This is a factor to keep in mind when selecting content for digitization initiatives.