    Towards an Internet-based scholarly dissemination system

    The Internet is a cost-effective means for scholarly dissemination. Many economics researchers and their institutions have established web sites. However, they are not alone in offering pages on the Web. The Web has grown to an extent that the standard Internet search engines only cover a fraction of the Web, and that fraction is decreasing over time (Lawrence and Giles, 1999). Since much of economics research uses common terms such as "growth", "investment" or "money", a subject search on the entire Web is likely to yield an enormous number of hits. There is no practical way to find which pages contain economics research. Due to this low signal-to-noise ratio, the Web per se does not provide an efficient mechanism for scholarly dissemination. An additional classifying scheme is required to segregate references to materials of interest to the economics profession.

    The most important type of material relevant to scholarly dissemination is the research paper. One way to organize this type of material has been demonstrated by the arXiv.org preprint archive, founded in 1991 by Paul Ginsparg of the Los Alamos National Laboratory, with an initial subject area in high energy physics. Authors upload papers to the archive, which stores them. ArXiv.org has now assembled over 150,000 papers, covering a broad subject range of mathematics, physics and computer science, but concentrating on the original subject area. An attempt has been made to emulate the arXiv.org system in economics with the "Economics Working Paper Archive" (EconWPA) based at Washington University in St. Louis, but success has been limited. There are a number of potential reasons:

    • Economists do not issue preprints as individuals; rather, economics departments and research organizations issue working papers.

    • Economists use a wider variety of document formatting tools than physicists. This reduces the functionality of online archiving and makes it more difficult to construct a good archive.

    • Generally, economists are not known for sophisticated practices in computer literacy and are more likely to encounter significant problems with uploading procedures.

    • There is considerable confusion as to the implications of networked pre-publication on a centralized, high-visibility system for the publication in journals.

    • Economics research is not confined to university departments and research institutes. There are a number of government bodies—central banks, statistical institutes, and others—which contribute a significant amount of research in the field. These bodies, by virtue of their size, have more rigid organizational structures. This makes the coordination required for the central dissemination of research more difficult.

    An ideal system should combine the decentralized nature of the Web, the centralized nature of the arXiv.org archive, and a zero price to end users. I discuss these three requirements in turn.

    The system must have decentralized storage of documents. To illustrate, consider the alternative: a scenario in which all documents within a certain scope, say within a discipline, are held on one centralized system. Such a system would not be ideal for two reasons. First, authors who are rejected by that system would have no alternative publication venue. Since economics is a contested discipline, this is a serious drawback. Second, the storage and description of documents are costly. The centralized system may levy a charge on contributors to cover its costs. However, since it enjoys a monopoly, it is likely to use this position to extract rent from authors.

    On the other hand, we need access points to the documents, both for their usage by end users and for the monitoring of this usage. These activities are best conducted when centralized document storage is available, such as that which arXiv.org affords. Otherwise the economics papers become lost in the complete contents of the Web, and their usage is recorded in the logs of many web servers. Such usage logs are private to the management of those servers, so they cannot be used to monitor usage across the collection.

    To explain why the end-user access to the dissemination system should be free, it is useful to refer to Harnad's distinction between trade authors and esoteric authors (1995a). Authors of academic documents are esoteric authors rather than trade authors. They do not expect payments for the written work; instead, they are chiefly interested in reaching an audience of other esoteric authors, and to a lesser extent, the public at large. Therefore the authors are interested in wide dissemination. If a tollgate to the dissemination system is established, then the system will fall short of ideal.

    Having established the three criteria for an ideal system, let me turn to the problem of implementing it. The first and third objectives could be accomplished if departments and research centers allow public access to their documents on the Internet. But for the second, we need a library to hold an organized catalog. The library would collect what is known as "metadata": data about documents that are available using Internet protocols. There is no incentive for any single institution to bear the cost of establishing a comprehensive metadata collection, without external subsidy. However, since every institution will benefit from participation in such an effort, we may solve this incentive problem by creating a virtual collection via a network of linked metadata archives. This network is open in the sense that persons and organizations can join by contributing data about their work. It is also open in the sense that user services can be created from it. This double openness promotes a positive feedback effect. The larger the collection's usage, the more effective it is as a dissemination tool, thus encouraging more authors and their institutions to join, as participation is open. The larger the collection, the more useful it becomes for researchers, which leads to even more usage.
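
    The idea of a virtual collection built from linked metadata archives can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a description of any existing system: the record fields, the handle scheme, the function names and the example URLs are all hypothetical. The point it shows is that each institution keeps its own documents and publishes only metadata, while a user service merges those records into a single searchable catalog.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PaperRecord:
    """Metadata about one paper; the full text stays on the issuer's server."""
    handle: str      # hypothetical identifier, assumed unique across archives
    title: str
    authors: tuple
    url: str         # where the issuing institution stores the document


def build_virtual_collection(archives):
    """Merge records from many independently maintained archives.

    `archives` maps an archive name to its list of PaperRecord objects.
    Returns one catalog keyed by handle; the first record seen for a
    handle is kept, and archives are visited in sorted name order.
    """
    catalog = {}
    for archive_name in sorted(archives):
        for record in archives[archive_name]:
            catalog.setdefault(record.handle, record)
    return catalog


def search_titles(catalog, term):
    """Return records whose title contains `term` (case-insensitive)."""
    needle = term.lower()
    return sorted(
        (r for r in catalog.values() if needle in r.title.lower()),
        key=lambda r: r.handle,
    )


# Two hypothetical archives, each run by a different institution.
archives = {
    "central-bank": [
        PaperRecord("cb:wp-1", "Money Demand and Inflation", ("A. Author",),
                    "http://bank.example/wp1.pdf"),
    ],
    "university": [
        PaperRecord("uni:wp-7", "Growth and Investment", ("B. Writer",),
                    "http://uni.example/wp7.pdf"),
        PaperRecord("uni:wp-8", "Money and Growth", ("C. Scholar",),
                    "http://uni.example/wp8.pdf"),
    ],
}

catalog = build_virtual_collection(archives)
hits = search_titles(catalog, "growth")
```

    Because the catalog holds only economics papers, a search for a common term such as "growth" returns research papers rather than arbitrary web pages, and any number of user services could be built on the same merged records.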

    Bringing a system to such a scale is a difficult challenge. Change in the area of scholarly communication has been slow, because academic careers are directly dependent on its results. Change is most likely to be driven from within. Therefore, a scholarly dissemination system on the Internet is more likely to succeed if it enhances current practice without threatening to replace it. In the past, the distribution of informal research papers has been based on institutions issuing working papers, which are circulated through exchange arrangements. RePEc is a way to organize this process on the Internet.