6.6 Improving library budgeting with usage information
Librarians are in an unenviable position when they select subscriptions to scholarly journals. They must determine which journals best match the needs and interests of their community subject to two important constraints. The first is budgetary, and it has become increasingly binding because renewal costs have risen faster than serials budgets (Haar, 1999). The second constraint is that libraries have incomplete information about community needs. A traditional print subscription forces libraries to purchase publisher-selected bundles of information (the journal), while users are interested primarily in the articles therein. Users read only a small fraction of articles, and the library generally lacks information about which articles the community values. Further compounding the problem, a library makes an ex ante (before publication) decision about the value of a bundle, while the actual value is realized ex post.
The PEAK electronic access products relaxed these constraints. First, users had low-cost access to articles in journals to which the institution did not subscribe. This appeared to be important: at institutions that purchased traditional subscriptions, 37% of the most accessed articles in 1998 were outside the institution's traditional subscription base. This figure was 50% in 1999. Second, the transaction logs that are feasible for electronic access allowed us to provide libraries with monthly reports not only on which journals their community valued, but also which articles. Detailed usage reporting should enable libraries to provide additional value to their communities. They can better allocate their serials budgets to the most valued journal titles or to other access products.
In this section we present analyses of the extent to which improved information available from an electronic usage system could lead to reduced expenditures and better service.
Improved budgeting with improved usage forecasts
We first estimate an upper bound on how much the libraries could benefit from better usage data. We analyze each institution's accesses to determine what would have been its optimal bundle if it had been able to perfectly forecast which material would be accessed. We then calculate how much this bundle would have cost the institution, and compare this perfect foresight cost with the institution's actual expenditures. Obviously even with extensive historical data, libraries would not be able to perfectly forecast future usage, so the realized efficiencies from better usage data would be less. Below we analyze how the libraries used the information from 1998 to change their purchasing decisions in 1999.
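The perfect foresight calculation can be illustrated with a minimal sketch. It is not the procedure used to produce Table 6.10: the function names, the prices, and the simplification that every uncovered access requires one token or one per-article purchase are assumptions made for illustration. The search is exhaustive over sets of titles, so it is practical only for a small number of titles.

```python
from itertools import combinations

def residual_cost(n_articles, bundle_price, bundle_size, article_price):
    """Cheapest mix of token bundles and per-article purchases for n_articles."""
    max_bundles = -(-n_articles // bundle_size)  # ceiling division
    return min(
        k * bundle_price + max(0, n_articles - k * bundle_size) * article_price
        for k in range(max_bundles + 1)
    )

def perfect_foresight_bundle(accesses, sub_price, bundle_price, bundle_size, article_price):
    """Return (cost, subscribed_titles) minimizing cost for known accesses.

    accesses:  dict title -> number of articles accessed (known ex post)
    sub_price: dict title -> traditional subscription price (hypothetical)
    """
    titles = list(accesses)
    best_cost, best_set = None, None
    for r in range(len(titles) + 1):
        for subscribed in combinations(titles, r):
            covered = set(subscribed)
            # Accesses to unsubscribed titles must be paid for article by article.
            remaining = sum(n for t, n in accesses.items() if t not in covered)
            cost = sum(sub_price[t] for t in covered) + residual_cost(
                remaining, bundle_price, bundle_size, article_price
            )
            if best_cost is None or cost < best_cost:
                best_cost, best_set = cost, covered
    return best_cost, best_set

def savings(actual_cost, optimal_cost):
    """Dollar and percent savings; percent is taken relative to actual cost (an assumption)."""
    dollars = actual_cost - optimal_cost
    return dollars, 100.0 * dollars / actual_cost
```

For example, with hypothetical accesses of 40, 3, and 0 articles to three titles priced at 100 each, token bundles of 10 articles at 60, and a per-article price of 7, the sketch subscribes only to the heavily used title and buys the three remaining articles individually.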
We present these results by access product in Table 6.10. We found that actual expenditures were markedly higher than optimal purchases in 1998. In particular, institutions in the Red and Blue groups purchased far more traditional subscriptions than would be justified if they had perfect foresight. Most institutions purchased more generalized subscriptions than would have been optimal with perfect foresight. We believe that much of the budgeting "error" can be explained by a few factors:
First, institutions overestimated demand for access, particularly for journals for which they purchased traditional subscriptions. 
Second, institutional practices, such as "use it or lose it" budgeting and a preference for fixed, predictable expenditures, might have affected decisions. A preference for predictable expenditures would induce a library to rely more heavily on traditional and generalized subscriptions, and less on reimbursed individual article purchases or interlibrary loan. However, Kantor et al. (2001, this volume) report the opposite: that libraries dislike bundles because they perceive them as forcing expenditures for low-value items.
Third, because demand foresight is necessarily imperfect, libraries might want to "over-purchase" to provide insurance against higher-than-expected usage demand. Of course, per-article purchases (possibly reimbursed to users) provide insurance (as does an interlibrary loan agreement), but at a higher cost per article than pre-purchased generalized subscription tokens, or than traditional subscriptions.
|Year||Instid||Actual||Optimal||Actual||Optimal||Actual||Optimal||Actual||Optimal||$ Savings||% Savings|
|Change in expenditure 1998-99|
We also analyzed changes in purchasing behavior from the first to the second year of the project. The PEAK team provided participating institutions with regular reports detailing usage. We hypothesized that librarian decisions about purchasing access products for the second year (1999) might be consistent with a simple learning dynamic: increase expenditures on products under-purchased in 1998 and decrease expenditures on products they over-purchased in 1998. For each institution we compared the direction of 1998-99 expenditure change for each access product to the change we hypothesized.  We present the results in Table 6.11.
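The comparison of hypothesized and observed directions of change can be sketched as follows. The function names are hypothetical, and treating "no change" as consistent only when the 1998 purchase already equaled the perfect foresight level is our assumption about how ties are scored.

```python
def sign(x):
    """Return -1, 0, or +1 according to the sign of x."""
    return (x > 0) - (x < 0)

def consistent_with_learning(actual_1998, optimal_1998, actual_1999):
    """True if the 1998-99 change has the direction the learning hypothesis predicts.

    Under-purchasing in 1998 (actual < optimal) predicts an increase;
    over-purchasing (actual > optimal) predicts a decrease.
    """
    hypothesized = sign(optimal_1998 - actual_1998)
    observed = sign(actual_1999 - actual_1998)
    return observed == hypothesized
```

An institution that bought 10 units of a product in 1998 when 6 would have been optimal is scored as consistent if it bought fewer than 10 in 1999, and as inconsistent if it bought 10 or more.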
Six of the nine institutions adjusted the number of generalized subscriptions in a manner consistent with our hypothesis. Fewer adjusted traditional subscriptions in the predicted direction. Only two of the seven institutions that purchased more traditional subscriptions in 1998 than was ex post optimal then decreased the number purchased in 1999. Indeed, only three of the eight institutions made any changes at all to their traditional subscription lineup. This suggests an inertia that cannot be explained solely by direct costs to the institution. Perhaps libraries see a greater insurance value in having certain titles freely available through traditional subscriptions than in having generalized subscription tokens that can be used on articles from any title. Generalized subscription tokens are also more expensive per article than traditional subscriptions, so by favoring traditional subscriptions the libraries are purchasing more potential usage with their budgets. Another explanation might be that libraries were more cautious about purchasing generalized subscriptions because they were a less familiar product.
|Independent variable||Coefficient (standard error)|
We performed a regression analysis to assess the differences between apparent over-purchasing in 1998 and 1999. Our dependent variable was the difference between the perfect forecast expenditure and actual expenditure, which we call the "forecast error". In Table 6.12 we report the effects of learning (the change in the error for 1999) and the average differences across experimental groups. The perfect foresight overspending over the life of the project averaged between 53% (Red) and 86% (Blue). However, the overspending was on average 36 percentage points lower in 1999. This represents a reduction of about one-half in perfect foresight overspending. 
We also considered other control variables, such as the institution's level of expenditures, fraction of the year participating in the experiment and number of potential users, but their contribution to explaining the forecast error was not statistically significant. The between-group variation and the 1999 improvement account for about 85% of the variation, as measured by the R² statistic.
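A regression of this kind can be written compactly. The following is a sketch in our own notation, not necessarily the exact specification estimated: $e_{it}$ is the forecast error of institution $i$ in year $t$, $G_{ig}$ indicates membership in experimental group $g$, and $D^{1999}_{t}$ indicates the second year.

```latex
e_{it} = \sum_{g} \beta_{g}\, G_{ig} + \delta\, D^{1999}_{t} + \varepsilon_{it}
```

Under this reading, each $\beta_{g}$ captures the average forecast error for group $g$, and $\delta$ captures the learning effect, that is, the average change in the error in 1999.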
Decisions about specific titles
In addition to comparing the total number of subscriptions for an institution with the optimal number, we can also assess the optimality of each particular title subscribed. We calculate, based on observed usage and prices, which titles an institution with perfect foresight should have obtained through traditional subscriptions, and call this the optimal set. Then we calculate two measures of actual behavior. First, we determine which titles in the optimal set an institution actually purchased. Second, we determine which traditional subscription titles the institution would have been better off forgoing because actual access would have been less expensive using other available access products.
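Given an optimal set, the title-level measures reduce to simple set operations. The sketch below is illustrative only: the function name and the dictionary keys are hypothetical, and the optimal set is taken as an input rather than derived from usage and prices.

```python
def title_level_measures(subscribed, optimal, accessed):
    """Compare the titles an institution subscribed to with its optimal set.

    subscribed: titles held through traditional subscriptions
    optimal:    titles a perfect-foresight institution would have subscribed to
    accessed:   titles with at least one recorded access
    All three arguments are iterables of title identifiers; subscribed and
    optimal are assumed to be non-empty.
    """
    subscribed, optimal, accessed = set(subscribed), set(optimal), set(accessed)
    return {
        # Share of purchased subscriptions that belong to the optimal set.
        "pct_subscribed_in_optimal": 100.0 * len(subscribed & optimal) / len(subscribed),
        # Share of the optimal set that the institution failed to subscribe to.
        "pct_optimal_not_subscribed": 100.0 * len(optimal - subscribed) / len(optimal),
        # Share of purchased subscriptions that were used at all.
        "pct_subscribed_accessed": 100.0 * len(subscribed & accessed) / len(subscribed),
        # Titles the institution would have been better off forgoing.
        "should_have_forgone": subscribed - optimal,
    }
```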
In Table 6.13 we present our analysis of the traditional subscription titles selected by institutions. There is wide variation both in the percent of purchased subscriptions that are in the optimal set, and in the percent of journals in the optimal set to which the institution did not subscribe. Overall, there is substantial opportunity for improvement. This is not a criticism of institutional decisions. Rather, it indicates the opportunity for improved purchasing decisions if libraries obtain the type of detailed usage information PEAK provided.
We do generally see better decisions in 1999. However, in both years a rather large percentage of subscribed journals were not accessed at all.
|Institution||Year||Total subscriptions||Percent subscribed that are in optimal set||Percent of optimal set that were not subscribed||Percent of subscriptions accessed at least once|
Dynamic optimal choice
Access product purchasing decisions made by institutions have a profound impact on the costs faced by users, and thus on the realized demand for access. Therefore, in deciding what access products, electronic or otherwise, to purchase, an institution must consider not only the demand realized at a particular level of user cost, but also what would be demanded at differing levels of user cost. Likewise, in our determination of the optimal bundle of access products, we should not take the observed set of accesses as fixed and exogenous. As a simple example, let us assume that a subscription to a given journal requires 25 accesses in order to pay for itself. Now assume that the institution in question did not subscribe to that journal, and that 20 tokens were used to access its articles in the time period. At first look, it appears as though the institution did the optimal thing. Let us assume, however, that we know that accesses would increase by 50%, to 30, when no password is required. It now appears as though the institution should have subscribed, since the reduced user costs would stimulate sufficient demand to justify the cost of the subscription.
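The arithmetic of this example can be written as a short sketch. The function name and the `demand_multiplier` parameter are hypothetical; the multiplier stands for the estimated ratio of usage without user costs to observed usage.

```python
def should_subscribe(observed_accesses, breakeven_accesses, demand_multiplier=1.0):
    """True if projected accesses reach the level at which a subscription pays for itself.

    demand_multiplier scales observed accesses to the usage expected once
    user costs (such as a password requirement) are removed.
    """
    projected_accesses = observed_accesses * demand_multiplier
    return projected_accesses >= breakeven_accesses
```

Taking the 20 observed accesses as fixed, `should_subscribe(20, 25)` is false. Allowing for the 50% increase in usage, `should_subscribe(20, 25, demand_multiplier=1.5)` is true, because the projected 30 accesses exceed the break-even level of 25.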
|Institution||Year||Trad. Subscriptions||Addit. Articles||Increase||Total|
|Actual Optimal||Rescaled Optimal||Actual Optimal||Rescaled Optimal||Optimal Cost||Access Increase|
In Table 6.4 we reported results that allow us to estimate how much usage would increase if no passwords or other user costs were incurred. We now calculate the product purchases that would have optimally matched the usage demand that we estimate would have occurred had the library removed or absorbed all user costs. We report the results in Table 6.14.  For most institutions, the optimal number of journal subscriptions increases, because greater usage makes the subscription more valuable. In general, the estimated institution cost of the optimal bundle would not increase greatly to accommodate the usage increase that would follow from eliminating user costs. Although we cannot quantify a dollar value for the eliminated user costs (because they include nonpecuniary costs such as those from requiring a password), we show in the last two columns that the modest institutional cost increase would be accompanied by comparable or larger increases in usage. The greatest cost increase (48%) occurs for the institutions (14 and 15) at which generalized subscription tokens were not available and the institution did not directly subsidize the per-article fee, i.e. at those institutions where users faced the highest user costs. Thus, the higher institutional costs should be weighed against high savings in user costs (including money spent on per-article purchases).