
    4.4 Economic and experimental design

    The PEAK system built upon existing digital library infrastructure and information retrieval mechanisms described above, but its primary purpose was to serve as a testbed for economic experiments in pricing and bundling information goods. Central to the PEAK experiment were the opportunities that electronic access creates for unbundling and re-bundling scholarly literature. A print-on-paper journal is, in itself, a bundle of issues. Each issue contains a bundle of articles, each of which is again a bundle of bibliographic information, an abstract, references, text, figures and many other elements.[4] In addition, the electronic environment makes possible other new dimensions of product variation. For example, access can be granted for a limited period of time (e.g., a day, a month, or a year), and new services such as hyperlinks can be incorporated as part of the content. Permutations and combinations are almost limitless.[5]

    Choosing what to offer from the different bundling alternatives is not an easy task. In the PEAK experiment, we were constrained by the demands of the experiment and the needs of the customers. Given the limited number of participants, bundle alternatives had to be limited in order to obtain enough experimental variation to support statistical analysis. The products had to be familiar enough to potential users to generate participation and reduce confounding learning effects. The economic models were designed by the University of Michigan research team, then reviewed and approved by a joint advisory board composed of two senior Elsevier staff members (Karen Hunter and Roland Dietz), the University of Michigan Library Associate Director for Digital Library Initiatives (Wendy Pradt Lougee), and the head of the research team (Professor Jeffrey MacKie-Mason).

    After balancing the different alternatives and constraints, we selected three different bundle types as the products for the experiment. We refer to the product types as traditional subscriptions, generalized subscriptions and single articles (sometimes called the "per-article" model).[6] We now describe these three product offerings.

    Traditional subscription: A user or a library could purchase unlimited access to a set of articles designated as a journal by the publisher for $4 per issue if the library already held a print subscription. The value-based logic supporting this model is that the value of the content is already paid for in the print subscription price, so the customer is charged only for an incremental electronic delivery cost. If the institution did not already subscribe to the paper version, the cost of the traditional subscription was $4 per issue plus 10% of the paper subscription price. In this case, the customer is charged for the electronic delivery cost plus a percentage of the paper subscription price to reflect the value of the content. The electronic journals corresponded to the Elsevier print-on-paper journal titles. Access to subscribed content continued throughout the project. The traditional subscription is a "seller-chooses" bundle, in that the seller, through the editorial process, determines which articles are delivered to subscribed users.

    Generalized subscription: An institution (typically with the library acting as agent) could pre-purchase unlimited access to a set of any 120 articles selected by users. These pre-purchases cost $548 per bundle, which averages about $4.50 per article selected. This is a "user-chooses" bundle. With a generalized subscription, users selected which articles to access, from across all Elsevier titles, after the subscription was purchased. Once purchased, an article was available to anyone in that user community.

    Per-article: A user could purchase unlimited access to a specific article for $7 per article. This option was designed to closely mimic a traditional document delivery or interlibrary loan (ILL) product. With ILL the individual usually receives a printed copy of the article that can be retained indefinitely. This is different from the "per-use" pricing model often applied to electronic data sources. The article remained on the PEAK server rather than being delivered as a copy, but the user could access a paid-for article as often as desired. This is a "buyer-chooses" scheme, in that the buyer selected the articles purchased.
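
    The three pricing rules just described can be summarized concretely. The short sketch below restates them in code; the dollar figures are those given in the text, while the function names and structure are only illustrative.

        # Illustrative restatement of the three PEAK pricing rules described above.
        # Dollar figures come from the text; names and structure are ours.

        def traditional_price(issues_per_year, paper_price, holds_print_subscription):
            """Unlimited access to the electronic issues of one journal title."""
            delivery_fee = 4.0 * issues_per_year        # $4 per electronic issue
            if holds_print_subscription:
                return delivery_fee                     # content already paid for in print
            return delivery_fee + 0.10 * paper_price    # plus 10% of the paper price

        def generalized_price(bundles):
            """Pre-purchased bundles of 120 user-selected articles ("tokens")."""
            return 548.0 * bundles                      # about $4.57 per article selected

        def per_article_price(articles):
            """Unlimited access to individually purchased articles, $7 each."""
            return 7.0 * articles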

    The per-article and generalized subscription options allowed users to capture value from the entire corpus of articles without having to subscribe to all of the journal titles. Once the content is created and added to the server database and the distribution system is constructed, the incremental cost of delivery is approximately zero. Therefore, to create maximal value from the content, it is important that as many users as possible have access. The design of the price and bundling schemes affected both how much value was delivered from the content (the number of readers), and how that value was shared between the users and the publisher.

    Institutional generalized subscriptions may be thought of as a way to pre-pay for individual document delivery requests. One advantage of generalized subscription purchases for both libraries and individuals is that the "tokens" cost substantially less per article than the per-article license price. By predicting in advance how many tokens would be used (and thus bearing some risk), the library could essentially pre-pay for document delivery at a reduced rate. However, unlike commercial document delivery or an interlibrary loan, all users within the community have ongoing unlimited access to the articles obtained with generalized subscription tokens. Thus, for the user community, a generalized subscription combines features of both document delivery (individual article purchase on demand) and traditional subscriptions (ongoing shared access). One advantage to a publisher is that generalized subscriptions represent a committed flow of revenue at the beginning of each year, and thus shift some of the risk for usage (and revenue) variation onto the users. Another is that they give all users access to the entire body of content and, by thus increasing user value from the content, provide an opportunity to obtain greater returns from the publication of that content.
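
    A back-of-the-envelope comparison (our illustration, using the prices above) makes the prepayment trade-off concrete: a 120-token bundle at $548 is cheaper than buying the same articles individually at $7 each once roughly 79 of the tokens are expected to be used, and the community also retains ongoing access to those articles.

        # How many of a generalized subscription's 120 tokens must actually be used
        # before prepaying beats buying the same articles individually at $7 each?
        import math

        BUNDLE_PRICE = 548.0   # one generalized subscription (120 tokens)
        PER_ARTICLE = 7.0      # single-article price

        break_even = math.ceil(BUNDLE_PRICE / PER_ARTICLE)
        print(break_even)      # 79: if roughly 79 or more tokens are expected to be
                               # used, the prepaid bundle is the cheaper option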

    A simplified example illustrates why a library might spend more purchasing generalized subscriptions than traditional subscriptions. Consider a traditional subscription with 120 seller-specified articles, selling for $120, and a generalized subscription that allows readers to select 120 articles from the larger corpus, also for $120. Suppose that in the traditional subscription, users get zero value from half of the articles, but something positive from the other half. The library is then essentially paying $2 per article for which users have positive value. In this example, a cost-benefit oriented library would purchase traditional subscriptions only as long as the average value of the articles users want is at least $2 each. In a generalized subscription, however, users select articles that they actually value (they are not burdened with unwanted articles), so the average cost is $1 per article the user actually wants to read. The library then might justify a budget that continues buying generalized subscriptions to obtain the additional articles that are worth more than $1 but less than $2 to users. The result is more articles and more publisher revenue than with traditional subscriptions. Of course, the library decision process is more complicated than this, but the basic principle is that users get more value per dollar spent when they, not the sellers, select the articles to include; since additional spending creates more user value with the generalized subscription, over time the library might spend more.
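
    The arithmetic behind this stylized example can be written out explicitly (illustrative numbers only, not data from the experiment):

        # Stylized version of the example above; both bundles cost $120 and
        # contain 120 articles.
        bundle_price = 120.0
        articles_per_bundle = 120

        # Traditional: the seller picks the articles; suppose only half are wanted.
        wanted_traditional = articles_per_bundle // 2
        cost_per_wanted_traditional = bundle_price / wanted_traditional      # $2.00

        # Generalized: users pick the articles, so all 120 are wanted.
        wanted_generalized = articles_per_bundle
        cost_per_wanted_generalized = bundle_price / wanted_generalized      # $1.00

        # Articles worth between $1 and $2 to users are worth buying only under
        # the generalized subscription, so the library may rationally buy more.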

    The twelve institutions participating in PEAK were assigned randomly to one of three groups, representing three different experimental treatments, which we labeled the Green, Blue and Red groups. Users in every group could purchase articles on a per-article basis. In the Green group they could also purchase institutional generalized subscriptions; in the Blue group they could purchase traditional subscriptions; in the Red group they could purchase both traditional and generalized subscriptions in addition to individual articles.
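
    In summary form, the product availability by treatment group was as follows (a simple restatement of the design, with labels of our choosing):

        # Products available to each experimental treatment group.
        PRODUCTS_BY_GROUP = {
            "Green": {"per_article", "generalized_subscription"},
            "Blue":  {"per_article", "traditional_subscription"},
            "Red":   {"per_article", "traditional_subscription", "generalized_subscription"},
        }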

    Regardless of treatment group, access was further determined by whether the user had logged in with a PEAK password. Use of a password allowed access from any computer (rather than only those with authorized IP addresses) and, when appropriate, allowed the user to utilize a generalized subscription token. The password authorization protected institutions from excessive usage of pre-paid generalized subscription tokens by walk-in users at public workstations. The password was also required to purchase articles on a per-article basis (to secure the financial transaction) and to view previously purchased articles (to provide some protection to the publisher against widespread network access by users outside the institution).
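
    The access rules tied to the password can be sketched as a simple decision function (our reading of the description above; the actual PEAK implementation is not reproduced here, and the names are hypothetical):

        # Sketch of the password-dependent access rules described above.
        def may_access(action, has_password, authorized_ip):
            """Return True if the described rules would permit the action."""
            if action == "read_subscribed_content":
                # reachable from IP-authorized machines, or from anywhere with a login
                return authorized_ip or has_password
            if action in ("spend_generalized_token",
                          "buy_per_article",
                          "view_previously_purchased"):
                # these always required the individual PEAK password
                return has_password
            return False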

    The password mechanism was also useful for collecting research data. Usage sessions were of two types: those authenticated with a PEAK password and those not. For unauthenticated sessions, the system recorded which IP addresses accessed which articles from which journal titles. When users employed their PEAK passwords, the system recorded which specific users accessed which articles. Uses that resulted in full-article accesses were also classified according to which product type was used to pay for access. Because we recorded all interface communications, we were able to measure complex "transactions". For example, if a user requested an article (i.e., clicked on its link) and was informed that the article was not available in the (traditional or generalized) subscription base, we could distinguish whether the user chose to pay for the article on a per-article basis or decided to forgo access.
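
    The kinds of information these usage records combined can be illustrated with a hypothetical record layout (ours, not the actual PEAK schema):

        # Hypothetical layout of a single usage record; field names are ours.
        from dataclasses import dataclass
        from typing import Optional

        @dataclass
        class AccessRecord:
            session_authenticated: bool      # PEAK password session or not
            ip_address: str                  # recorded for every session
            user_id: Optional[str]           # known only for password sessions
            journal_title: str
            article_id: str
            payment_product: Optional[str]   # "traditional", "generalized", or
                                             # "per_article" for full-article access
            offered_purchase: bool = False   # article was outside the subscription base
            purchase_declined: bool = False  # user chose to forgo access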

    An important consideration for both the design and analysis of the experiment was the possibility of learning effects during the life of the project. If present, learning makes it more difficult to generalize results. We expected significant learning about the availability and use of the system due to the novelty of the offered products and the lack of user experience with electronic access to scholarly journals. To decrease the impact of learning effects, we worked with participating institutions to actively educate users about the products and pricing prior to implementing the service. Data were also collected for almost two years, which enabled us to isolate some of the learning effects.

    To test institutional choices (whether to purchase traditional or generalized subscriptions, and how many), we needed participation from several institutions. Further, to explore the extent to which electronic access and various price systems would increase content usage, we needed a diverse group of potential (individual) users. Therefore, we marketed the project to a diverse group of institutions: four-year colleges, research universities, specialized research schools, and corporate research libraries. A large user community clearly improved the breadth of the data, but also introduced other complications. For example, user behavior might be conditioned by the characteristics of the participating institutions (such as characteristics of the institution's library system, availability of other electronic access products, institutional willingness to reimburse per-article purchases, etc.).