    2.7 Growth in usage of electronic information

    It is hard to measure online activity accurately. The earliest and still widely used measure is that of "hits," or requests for a file. Unfortunately, with the growth of complicated pages, each of which can trigger many file requests, that measure is increasingly hard to interpret. When possible, I prefer to look at full article downloads. Finally, as a conservative measure, one can look at the number of hosts (unique IP addresses) that requested information from a server. Even then, there are considerable uncertainties. The same person may send requests from several hosts. On the other hand, the widespread use of proxies and caches means that many people may hide behind a single host address, and a single download may lead to multiple users obtaining copies (as also happens when papers are forwarded via email).
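    To make the three measures concrete, here is a minimal sketch that counts hits, article downloads, and distinct hosts in a server log. It assumes lines in something like the Common Log Format, with the requesting host as the first field and the request in double quotes. The sample lines, the function name summarize, and the rule that a path ending in .pdf or .ps counts as a full article download are all illustrative assumptions, not a description of how any of the servers discussed here are configured.

```python
# Hypothetical log lines (documentation-only IP addresses).
SAMPLE_LOG = [
    '192.0.2.1 - - [01/Jan/2001:00:00:01 +0000] "GET /index.html HTTP/1.0" 200 2048',
    '192.0.2.1 - - [01/Jan/2001:00:00:02 +0000] "GET /logo.gif HTTP/1.0" 200 512',
    '192.0.2.1 - - [01/Jan/2001:00:00:09 +0000] "GET /papers/a.pdf HTTP/1.0" 200 90210',
    '192.0.2.7 - - [01/Jan/2001:00:03:00 +0000] "GET /papers/a.pdf HTTP/1.0" 200 90210',
    '198.51.100.3 - - [01/Jan/2001:00:05:00 +0000] "GET /index.html HTTP/1.0" 200 2048',
]


def summarize(lines, article_suffixes=(".pdf", ".ps")):
    """Return (hits, article_downloads, distinct_hosts) for the given log lines."""
    hits = 0
    downloads = 0
    hosts = set()
    for line in lines:
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        hits += 1                # every logged request is one "hit"
        hosts.add(fields[0])     # assumption: the host is the first field
        quoted = line.split('"')
        if len(quoted) >= 2:
            request = quoted[1].split()  # e.g. ['GET', '/papers/a.pdf', 'HTTP/1.0']
            # Assumption: a requested path with an article suffix is a full download.
            if len(request) >= 2 and request[1].lower().endswith(article_suffixes):
                downloads += 1
    return hits, downloads, len(hosts)


hits, downloads, hosts = summarize(SAMPLE_LOG)
print(f"hits={hits} downloads={downloads} hosts={hosts}")
```

    In this toy log, one visitor viewing a page with an embedded image produces two hits before downloading anything, which is why hits overstate activity on complicated pages. The host count cannot tell whether one address is a single reader or a proxy serving many.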

    In addition to the uncertainties in interpreting the activity seen at a server, it is hard to compare data from different servers. Logs are configured to record different things, and some Web pages are much more complicated than others that have the same or equivalent content. Thus comparing different measures of online activity is of necessity like comparing apples, oranges, pears, bananas, and onions. Some of the difficulties of such comparisons can be avoided by concentrating on rates of growth. If online information access is growing much faster than usage of print material, it will eventually dominate.
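    The last point can be made quantitative with a short calculation. If online usage is currently a fraction f of print usage and the two grow at constant annual rates g_online and g_print, then online usage catches up after t years, where t = ln(1/f) / ln((1 + g_online) / (1 + g_print)). The sketch below evaluates this formula. The function name years_to_parity and the input values are hypothetical illustrations, and the assumption of constant growth rates is a simplification.

```python
import math


def years_to_parity(online_share, online_growth, print_growth=0.0):
    """Years until online usage equals print usage under constant growth.

    online_share: current online usage as a fraction of print usage.
    online_growth, print_growth: annual growth rates (1.0 means 100% per year).
    All inputs are hypothetical; none are measurements from the text.
    """
    if online_share <= 0:
        raise ValueError("online_share must be positive")
    if print_growth <= -1.0:
        raise ValueError("print_growth must be greater than -100%")
    if online_share >= 1.0:
        return 0.0  # online usage already matches or exceeds print
    if online_growth <= print_growth:
        raise ValueError("online usage never catches up at these rates")
    ratio = (1.0 + online_growth) / (1.0 + print_growth)
    return math.log(1.0 / online_share) / math.log(ratio)


# Hypothetical scenario: online usage at 1% of print usage, print usage flat.
print(round(years_to_parity(0.01, 1.0), 1))  # 100% annual growth online
print(round(years_to_parity(0.01, 0.5), 1))  # 50% annual growth online
```

    Under these assumed inputs, parity arrives in under seven years at 100% annual growth and in a little over eleven years at 50%. The starting ratio matters far less than the difference in growth rates.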

    In spite of the problems inherent in measuring online activity, it is obvious by most measures that the Internet is growing rapidly. Typical growth rates, whether of bytes of traffic on backbones or of hosts, are on the order of 100% per year (Odlyzko, 2000; Coffman and Odlyzko, 1998). When one looks at usage of scholarly information online, typical growth rates are in the 50 to 100% range. For example, Table 2.1 shows the utilization of the online resources of the Library of Congress. Growth in terms of bytes transmitted was over 100% per year for three years before decreasing to 90% in 1998, and then decreasing further in 1999, to 38%. It then increased to 62% in 2000. Table 2.2 shows downloads from the AT&T Labs - Research Web site, at http://www.research.att.com/, which contains a variety of papers, software, data, and other technical information. The growth rate there in the number of requests has been around 50% per year for several years, but between 2000 and 2001, it jumped to over 120%.

    Table 2.1: Library of Congress electronic resource usage statistics.
    month           GB   requests (millions)
    Feb. 1995     14.0    1.1
    Feb. 1996     31.2    3.9
    Feb. 1997    109.4   15.1
    Feb. 1998    282.0   36.0
    Feb. 1999    535.0   48.6
    Feb. 2000    741.1   61.3
    Feb. 2001   1202.6   86.7
    NOTE: For each month, shows total volume of material sent out that month, in gigabytes, and the number of requests.

    Table 2.2: AT&T Labs - Research external Web server statistics.
    month        requests     hosts
    Jan. 1997     542,644    17,866*
    Jan. 1998     754,477    35,943
    Jan. 1999   1,204,664    67,191
    Jan. 2000   1,843,319   100,077
    Jan. 2001   4,190,362   178,923
    NOTE: Excludes most crawler activity.
    *Number of hosts for Jan. 1997 is an estimate.
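
    The growth rates quoted in the text can be recomputed from the tables. The following sketch does the arithmetic for the gigabyte column of Table 2.1 and the requests column of Table 2.2, treating each pair of consecutive yearly samples as one year of growth. The function name annual_growth is illustrative, and the calculation assumes that the single month sampled each year is representative of that year.

```python
def annual_growth(values):
    """Percentage growth between each pair of consecutive yearly samples."""
    return [100.0 * (later / earlier - 1.0)
            for earlier, later in zip(values, values[1:])]


# Table 2.1: gigabytes sent each February, 1995 through 2001.
LOC_GB = [14.0, 31.2, 109.4, 282.0, 535.0, 741.1, 1202.6]

# Table 2.2: requests each January, 1997 through 2001.
ATT_REQUESTS = [542644, 754477, 1204664, 1843319, 4190362]

for label, first_year, series in [
    ("Library of Congress GB", 1995, LOC_GB),
    ("AT&T Labs requests", 1997, ATT_REQUESTS),
]:
    for offset, rate in enumerate(annual_growth(series)):
        start = first_year + offset
        print(f"{label}, {start} to {start + 1}: {rate:.1f}%")
```

    For the Library of Congress, the last three intervals come out at roughly 90%, 38.5%, and 62%, matching the figures in the text. For AT&T Labs, the rates are roughly 39%, 60%, and 53%, followed by about 127% between January 2000 and January 2001.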

    Some measures of electronic information usage are showing signs of stability, or even decreasing growth. For example, Table 2.3 shows utilization of Leslie Lamport's page devoted to material about a logic for specifying and reasoning about concurrent and reactive systems.[12] Usage was fairly stable from 1996 through 1998. When I corresponded with him about this in 1999, he thought usage had reached a steady state, with the entire community interested in this esoteric technical subject already accessing the page as much as they would ever need to. However, the final counts for 1999 and 2000 showed substantial increases.

    Table 2.3: Visits to Leslie Lamport's Temporal Logic of Actions Web page.
    year   visits   hosts
    1996   18,800   5,300
    1997   19,000   5,600
    1998   18,400   5,300
    1999   31,100   8,000
    2000   33,500   8,000
    NOTE: Counts are approximate.

    The next few sections discuss data about several online information sources that are freely available on the Internet.