    13.3 The ReDIF metadata

    From the material that we have covered in the previous section, we can draw a simple organizational model of RePEc as:

    Many archives ⇒ One dataset ⇒ Many services

    Let us turn from the organization of RePEc to its contents. RePEc is about more than the description of resources. It is probably best to say that RePEc is a relational database about economics as a discipline.

    One possible interpretation of the term "discipline" is given by Karlsson and Krichel (1999). They have come up with a model of a discipline as consisting of four elements arranged in a table:

    resource    collection
    person      institution

    A few words may help in understanding that table. A "resource" is any output of academic activity: a research document, a dataset, a computer program, or anything else that an academic person would claim authorship for. A "collection" is a logical grouping of resources. For example, one collection might consist of all articles that have undergone the peer review process. A "person" is a physical person; in the context of RePEc, a person may also be a corporate body acting as a physical person.

    These data collectively form a relational database describing not only the papers, but also the authors who write them, the institutions where the authors work, and so on. All this data is encoded in the ReDIF metadata format, as illustrated in the following examples.

    A closer look at the contents

    To understand the basics of ReDIF it is best to start with an example. Here is a piece of ReDIF data at http://www.econ.surrey.ac.uk/discussion_papers/RePEC/sur/surrec/surrec9601.pdf:[2]

    Template-Type: ReDIF-Paper 1.0
    Title: Dynamic Aspect of Growth and Fiscal Policy
    Author-Name: Thomas Krichel
    Author-Person: RePEc:per:1965-06-05:thomas_krichel
    Author-Email: T.Krichel@surrey.ac.uk
    Author-Name: Paul Levine
    Author-Email: P.Levine@surrey.ac.uk
    Author-WorkPlace-Name: University of Surrey
    Classification-JEL: C61; E21; E23; E62; O41
    File-URL: ftp://www.econ.surrey.ac.uk/pub/RePEc/sur/surrec/surrec9601.pdf
    File-Format: application/pdf
    Creation-Date: 199603
    Revision-Date: 199711
    Handle: RePEc:sur:surrec:9601

    When we look at this record, the ReDIF data resembles a standard bibliographical format, with authors, title, etc. The only thing that appears a bit mysterious here is the "Author-Person" field. This field quotes a handle that is known to RePEc. This handle leads to a record maintained at a RePEc handle server.[3]

    Template-Type: ReDIF-Person 1.0
    Name-Full: KRICHEL, THOMAS
    Name-First: THOMAS
    Name-Last: KRICHEL
    Postal: 1 Martyr Court
    10 Martyr Road
    Guildford GU1 4LF
    England
    Email: t.krichel@surrey.ac.uk
    Homepage: http://openlib.org/home/krichel
    Workplace-Institution: RePEc:edi:desuruk
    Author-Paper: RePEc:sur:surrec:9801
    Author-Paper: RePEc:sur:surrec:9702
    Author-Paper: RePEc:sur:surrec:9601
    Author-Paper: RePEc:rpc:rdfdoc:concepts
    Author-Paper: RePEc:rpc:rdfdoc:ReDIF
    Handle: RePEc:per:1965-06-05:THOMAS_KRICHEL

    In this record, we have the handles of documents that the person has written. This record allows user services to list all the papers by a given author, which is obviously useful when we want to find papers that one particular author has written. It is also useful to have a central record of the person's contact details. This eliminates the need to update the relevant data elements on every document record. In fact, the data in the paper template may be considered the historical record, valid at the time when the paper was written, whereas the address in the person template is the one that is currently valid.
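    As a rough illustration of how a user service might read such a record, the following Python sketch parses a ReDIF template into an ordered list of fields and pulls out the repeated "Author-Paper" handles. The function names parse_redif and values are hypothetical, and the parsing rules are an assumption inferred from the examples shown here (a field starts with a name followed by a colon, and any other line continues the previous value); the actual ReDIF specification may differ in details.

```python
import re

# Assumption: a field line starts with a name made of letters, digits and
# hyphens, immediately followed by a colon.
FIELD_RE = re.compile(r"^([A-Za-z][A-Za-z0-9-]*):\s*(.*)$")


def parse_redif(text):
    """Parse one template into an ordered list of (field, value) pairs.

    Repeated fields are kept in order. A line that does not start with a
    field name is appended to the previous value (as in a multi-line
    Postal field).
    """
    fields = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        match = FIELD_RE.match(line)
        if match:
            fields.append((match.group(1), match.group(2)))
        elif fields:
            name, value = fields[-1]
            fields[-1] = (name, value + "\n" + line)
    return fields


def values(fields, name):
    """Return every value recorded for the given field name, in order."""
    return [value for key, value in fields if key == name]


# Abridged copy of the person template shown above.
PERSON = """\
Template-Type: ReDIF-Person 1.0
Name-Full: KRICHEL, THOMAS
Postal: 1 Martyr Court
10 Martyr Road
Guildford GU1 4LF
England
Email: t.krichel@surrey.ac.uk
Author-Paper: RePEc:sur:surrec:9801
Author-Paper: RePEc:sur:surrec:9702
Author-Paper: RePEc:sur:surrec:9601
Handle: RePEc:per:1965-06-05:THOMAS_KRICHEL
"""

person = parse_redif(PERSON)
papers = values(person, "Author-Paper")   # handles of the person's papers
postal = values(person, "Postal")[0]      # current, multi-line address
print(papers)
print(postal.splitlines())
```

    One limitation of this sketch: a continuation line that itself begins with a word and a colon would be misread as a new field.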

    In the person template, we find another RePEc identifier in the "Workplace-Institution" field. This points to a record that describes the institution, stored at another RePEc handle server.

    Template-Type: ReDIF-Institution 1.0
    Primary-Name: University of Surrey
    Primary-Location: Guildford
    Secondary-Name: Department of Economics
    Secondary-Phone: (01483) 259380
    Secondary-Email: economics@surrey.ac.uk
    Secondary-Fax: (01483) 259548
    Secondary-Postal: Guildford, Surrey GU2 5XH
    Secondary-Homepage: http://www.econ.surrey.ac.uk/
    Handle: RePEc:edi:desuruk

    The information in this record is self-explanatory. Less apparent is the origin of these records.
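    The relational character of the three templates can be made concrete with a small sketch that follows the handles from a paper to its registered author and on to the author's institution. The HandleIndex class and the author_workplaces function are hypothetical names introduced only for illustration. Note that the person handle appears in lower case in the paper template and in upper case in the person template; comparing handles case-insensitively, as done here, is an assumption and not something the examples confirm.

```python
def normalize(handle):
    # Assumption: handles that differ only in letter case name the same record.
    return handle.strip().lower()


class HandleIndex:
    """In-memory stand-in for a handle lookup; not the real handle server."""

    def __init__(self):
        self._records = {}

    def add(self, record):
        self._records[normalize(record["Handle"][0])] = record

    def resolve(self, handle):
        return self._records.get(normalize(handle))


# Abridged versions of the three records shown above, as field -> values maps.
paper = {
    "Template-Type": ["ReDIF-Paper 1.0"],
    "Title": ["Dynamic Aspect of Growth and Fiscal Policy"],
    "Author-Person": ["RePEc:per:1965-06-05:thomas_krichel"],
    "Handle": ["RePEc:sur:surrec:9601"],
}
person = {
    "Template-Type": ["ReDIF-Person 1.0"],
    "Name-Full": ["KRICHEL, THOMAS"],
    "Workplace-Institution": ["RePEc:edi:desuruk"],
    "Handle": ["RePEc:per:1965-06-05:THOMAS_KRICHEL"],
}
institution = {
    "Template-Type": ["ReDIF-Institution 1.0"],
    "Primary-Name": ["University of Surrey"],
    "Secondary-Name": ["Department of Economics"],
    "Handle": ["RePEc:edi:desuruk"],
}

index = HandleIndex()
for record in (paper, person, institution):
    index.add(record)


def author_workplaces(index, paper_handle):
    """Return (person name, institution name) pairs for registered authors."""
    found = []
    paper_record = index.resolve(paper_handle)
    if paper_record is None:
        return found
    for person_handle in paper_record.get("Author-Person", []):
        person_record = index.resolve(person_handle)
        if person_record is None:
            continue  # author is not registered, or the record is missing
        for inst_handle in person_record.get("Workplace-Institution", []):
            inst_record = index.resolve(inst_handle)
            if inst_record is not None:
                found.append(
                    (person_record["Name-Full"][0], inst_record["Primary-Name"][0])
                )
    return found


print(author_workplaces(index, "RePEc:sur:surrec:9601"))
```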

    Institutional registration

    The registration of institutions is accomplished through the Economics Departments, Institutions and Research Centers (EDIRC) project, compiled by Christian Zimmermann, an Associate Professor of Economics at Université du Québec à Montréal, on his own account, as a public service to the economics profession. The initial intention was to compile a directory of all economics departments that have a web presence. Many departments have a web presence now; about 5,000 of them are registered at the time of this writing. All these records are included in RePEc. For each institution, data on its homepage is available, as well as postal and telephone information. For some, there is even data on the main area of work. Thus it is possible to find a list of institutions where, for example, a lot of work in labor economics is being done. At the moment, EDIRC is mainly linked to the rest of the RePEc data through the HoPEc[4] personal registration service. Other links are possible, but are rarely used.

    Personal registration

    HoPEc has a different organization from EDIRC. It is impossible for a single academic to register all persons who are active in economics. One possible approach would be to ask archives to register people who work at the related institution. This would make archive maintainers' work more complicated, but the overall maintenance effort would be smaller once all current authors are registered. However, authors move between archives, and many have work that appears in different archives. To date, there is no satisfactory way to deal with moving authors. For this reason, the author registration is carried out using a centralized system.

    A person who is registered with HoPEc is identified by a string that is usually close to the person's name and by a date that is significant to the registrant. HoPEc suggests the birth date but any other date will do as long as the person can remember it. When registrants work with the service, they first supply such personal information as the name, the URL of the registrant's homepage, and the email address. Registrants are free to enter data about their academic interests—using the Journal of Economic Literature Classification Scheme—and the EDIRC handle of their primary affiliation.
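    The identifier described above can be sketched in a few lines of Python. The layout "RePEc:per:" followed by the date and a name string is taken from the example handle shown earlier; the exact rule for deriving the name string (lower case, underscores, accents dropped) is an assumption made for this sketch, and make_person_handle is a hypothetical helper, not part of HoPEc.

```python
import re
import unicodedata
from datetime import date


def make_person_handle(name, significant_date):
    """Build a person handle from a name and a date the registrant remembers.

    Assumed rule: strip accents, lower-case the name, and replace every run
    of other characters with a single underscore.
    """
    decomposed = unicodedata.normalize("NFKD", name)
    ascii_name = decomposed.encode("ascii", "ignore").decode("ascii")
    slug = re.sub(r"[^a-z0-9]+", "_", ascii_name.lower()).strip("_")
    if not slug:
        raise ValueError("name must contain at least one letter or digit")
    return f"RePEc:per:{significant_date.isoformat()}:{slug}"


handle = make_person_handle("Thomas Krichel", date(1965, 6, 5))
print(handle)
```

    Because the date only needs to be memorable to the registrant, it acts here as a disambiguator between people with similar names rather than as verified personal data.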

    When the registrant has entered this data, the second step is to create associations between the record of the registrant and the document data that is contained in RePEc. The most common association is the authorship of a paper; however, other associations are possible, for example the editorship of a series. The registration service then looks up the name of the registrant in the RePEc document database. The registrant can then decide which potential associations are relevant. Because authentication methods are weak, HoPEc relies on honesty.
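    The two-step association process can be sketched as follows: first look up candidate papers by name, then keep only those that the registrant confirms. The matching rule used here (compare names as sets of lower-case tokens, so that "KRICHEL, THOMAS" matches "Thomas Krichel") is an assumption chosen for simplicity; the text does not describe how HoPEc actually matches names. Apart from RePEc:sur:surrec:9601, the records and their "RePEc:xxx:" handles are invented for illustration.

```python
import re


def name_key(name):
    """Reduce a name to a sorted tuple of lower-case alphanumeric tokens."""
    return tuple(sorted(re.findall(r"[a-z0-9]+", name.lower())))


def candidate_papers(registrant_name, papers):
    """Step 1: handles of papers with an author name matching the registrant."""
    key = name_key(registrant_name)
    return [
        paper["handle"]
        for paper in papers
        if any(name_key(author) == key for author in paper["authors"])
    ]


def confirm(candidates, accepted):
    """Step 2: keep only the candidates the registrant has accepted."""
    accepted = set(accepted)
    return [handle for handle in candidates if handle in accepted]


papers = [
    {"handle": "RePEc:sur:surrec:9601", "authors": ["Thomas Krichel", "Paul Levine"]},
    # The remaining records are hypothetical.
    {"handle": "RePEc:xxx:wpaper:0001", "authors": ["KRICHEL, THOMAS"]},
    {"handle": "RePEc:xxx:wpaper:0002", "authors": ["T. Krichel"]},
    {"handle": "RePEc:xxx:wpaper:0003", "authors": ["Paul Levine"]},
]

candidates = candidate_papers("Thomas Krichel", papers)
confirmed = confirm(candidates, {"RePEc:sur:surrec:9601"})
print(candidates)
print(confirmed)
```

    The record with the abbreviated name "T. Krichel" is not proposed by this simple rule, which shows why the final decision is left to the registrant and why a looser matching rule would produce more candidates to review.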

    There are several significant problems that a service like HoPEc faces. First, since there is no historical precedent for such a service, it is not easy to communicate the raison d'être of the service to a potential registrant. Second, some people think that they need to register in order to use RePEc services. While this delivers data about who is interested in using RePEc services (and to whom we have been unsuccessful in communicating that these services are free), it clutters the database with records of limited usefulness. Last but by no means least, there are all kinds of privacy issues involved in the composition of such a dataset.

    To summarize, HoPEc provides information about a person's identity, affiliation and research interests, and links these data with resource descriptions in RePEc. This allows the identification of a person and the maintenance of related metadata in a timely and cost-efficient way. These data could fruitfully be employed for other purposes, such as maintaining membership data for scholarly societies or lists of conference participants.