Do developing countries profit from free books? Discovery and online usage in developed and developing countries compared
This work is licensed under a Creative Commons Attribution-NonCommercial 3.0 United States License. Please contact firstname.lastname@example.org to use this work in a way not covered by the license.
For years, Open Access has been seen as a way to remove barriers to research in developing countries. In order to test this, an experiment was conducted to measure whether publishing academic books in Open Access has a positive effect on developing countries. During a period of nine months the usage data of 180 books was recorded. Of those, a set of 43 titles was used as control group with restricted access. The rest was made fully accessible.
The data shows the digital divide between developing countries and developed countries: 70 percent of the discovery data and 73 percent of online usage data come from developed countries. Using statistical analysis, the experiment confirms that Open Access publishing enhances discovery and online usage in developing countries. This strengthens the claims of the advocates of Open Access: researchers from the developing countries do benefit from free academic books.
The discussion on Open Access (OA) has many aspects; one of them is the digital divide between developed and developing countries. The digital divide is defined as the inequality in access to the Internet, both in a technical sense (a less than optimal infrastructure) and in a lack of the knowledge needed to make the best use of the available online resources. OA is seen as a way to lower financial barriers for scientists and other readers. Recently, this was discussed by Swan and Hall, who conclude that while putting the idea into practice is not simple, the growth of OA is not only inevitable but also desirable.
Before them, several others also saw opportunities in freely accessible scientific publications. While each author discusses the inequalities from a different angle, the possibilities of OA publishing, combined with changes in institutional and political structures, offer a chance for improvement. Ahmed discusses the digital divide in Africa in great detail and identifies the required policy changes to amend it. Salager-Meyer discusses the inequalities that exist in academic publishing between the developed and the developing countries. Her focus is on journal publishing. Christian also discusses the inequalities in funding, IT related infrastructure, and possible misconceptions about OA. Likewise, Papin-Ramcharan and Dawe report on the difficulties that arise when funding is not adequate for publishing in OA journals that charge an author’s fee. In the chapter “Development” Willinsky and Parry describe the difficulties that university libraries in developing countries face and propose developing an OA publication model as a possible remedy. Armstrong and Ford focus on intellectual property rights by discussing the effects of WIPO treaties in contrast to licenses based on Creative Commons. Chan and Costa review several programs such as HINARI, AGORA, eIFL.net, and PERI and compare them to directly publishing in OA journals and “green” OA. And Ghosh and Kumar Das conclude in their extensive overview that India is leading the OA movement among the developing countries and, by doing so, is making the developed countries aware of the qualities of scholars and scientists from the developing countries.
Very little research has been published on the effects of OA publishing on developing countries, and most of it concerns the citation impact of freely accessible articles. Calver and Bradley investigated citations of OA and non-OA papers in six journals and four books published since 2000, in the field of conservation biology. They did find an OA citation advantage for book chapters, but the number of citations that papers or chapters received from authors in developing countries did not increase. Norris, Oppenheim, and Rowland, however, did see a larger percentage of citations from developing countries given to OA articles in the field of mathematics than is the case for citations from developed countries. Walker describes the growth of Bioline International, which enables OA publishing of journals from a wide range of developing countries. Apart from usage data, she describes the citation advantage enjoyed by OA articles.
No research was found on usage of articles or on academic books.
Open Access monographs and the digital divide
This article tries to answer the question of whether OA publishing does actually help to lessen the digital divide between developed and developing countries. As will be described in more detail below, usage data of an earlier experiment with OA monographs was combined with geographical data: from which country does the traffic originate? All countries were divided into two groups: developing countries and developed countries. In order to find whether OA does have a positive effect on developing countries, a group of titles with restricted access was compared to another group of fully accessible monographs. Using statistical analysis, the percentages of book discovery and usage were compared. If the percentages of the group of fully accessible titles are significantly higher, the claim that OA does benefit developing countries may be supported.
The collection of monographs used for this experiment was published by Amsterdam University Press (AUP), an academic publisher mainly of books in the field of humanities and social sciences. AUP is owned by the University of Amsterdam and works on a not-for-profit basis.  AUP publishes around 200 books per year, combined with several journals, some of which are published both on paper and online, some as an OA e-journal. AUP has coordinated the OAPEN (Open Access Publishing in European Networks) project where several academic publishers worked together to develop an OA business model for monographs in humanities and social sciences, combined with the creation of an OA library.  From spring 2011, OAPEN continued as a separate organization with AUP as one of its shareholders.
In 2009, an experiment was conducted at AUP to measure the impact of OA publishing of academic books.  During a period of nine months three sets of 100 books were disseminated through an institutional repository, the Google Book Search program, or both channels. A fourth set of 100 books was used as control group. As one of the research questions concerned the role of dissemination channels, this division was used.
One of the findings was that OA publishing enhances discovery and online usage of academic books regardless of the dissemination channel used. Therefore, in this article the titles will be divided into a group of freely accessible titles—without taking into account the dissemination channel—and a closed access group. From April 2009 until December 2009, access to the 400 publications was strictly controlled. Since then, access to several titles has changed, which strongly impacts the discovery and online usage.
While the experiment confirmed that books in OA were found more and were used more, it was not known who was using them. The Google Book Search program enabled publishers to monitor geographic information: how many times are books opened from which country? Therefore, in the first months of 2011 this data was gathered and combined with the existing data to answer the research question: does a change in accessibility of academic books have an effect on developing countries?
From this question, two hypotheses were derived:
Hypothesis 1: The discovery of fully accessible titles in developing countries is significantly higher, compared to titles that are not fully accessible. Discovery is measured as the number of “Book visits” a title receives in the Google Book Search program. Book visits are defined as each time that a unique user views a book. 
Hypothesis 2: The online usage (i.e., pages read) of fully accessible titles in developing countries is significantly higher, compared to titles that are not fully accessible. Online usage is measured as the number of page views a title receives in the Google Book Search program. Page views are defined as the number of unique pages a user views within a 24-hour period. Regardless of the number of times that a unique user views a page, it is only registered once. 
Setup of the experiment
The first question to be answered is, of course, which countries are developing countries. Countries differ widely in all aspects, and choosing the factors that determine which country belongs in which group is not easy. For this experiment, all countries listed under “Emerging and Developing Economies” in the World Economic Outlook Database April 2010 are used, with Somalia added to the list. The web statistics revealed traffic coming from 179 different countries. Fewer than a third of those (48 countries) are marked as developed countries, although those countries generate 70 percent of the discovery data and 73 percent of online usage data.
Dividing all countries into two groups is of course a simplification. Doing so enables us to scale down a problem of enormous complexity to a relatively simple question. At this point, quantitative data on the effects of OA on the use of monographs is scarce, especially the use in developing countries.
In order to enable further research, the data for the titles is available.
The experiment consists of creating four equal sets of 100 titles; each title is placed in one of four sets. The different sets are defined using two variables: accessibility and channel. Each set is disseminated using a specified channel and accessibility settings. For a period of nine months, starting in April 2009, the effect on discovery and online usage is measured. Discovery and online usage are measured using the number of views and downloads from the respective channels.
The division of titles can be summarized as follows:
|Accessibility||Set 1||Set 2||Set 3||Set 4|
|Fully accessible in Google Book Search||No||No||Yes||Yes|
|Fully accessible through the AUP repository||No||Yes||Yes||No|
Set 1: Available “as usual.” An electronic version of almost all books by AUP is submitted to the Google Book Search website. By default, AUP allows a user of Google Book Search to see only 10 percent of the book’s contents. The full content of each book is indexed by the Google search engine. The titles in this set are not uploaded into the AUP repository. The accessibility of this set is the lowest.
Set 2: Freely available via the repository; visible for 10 percent in Google Book Search. The titles of this set are uploaded in the AUP repository. For each title, a record is created in the repository database containing metadata and an electronic version of the book. The “visibility settings” of Google Book Search are not changed and remain at 10 percent.
Set 3: Visible for 100 percent in Google Book Search and freely available via the repository. The titles of this set are uploaded in the AUP repository, and the “visibility settings” of Google Book Search are set to 100 percent. The titles in this set are fully accessible through both channels. The accessibility of this set is the highest.
Set 4: Visible for 100 percent in Google Book Search; not available via the repository. For this set, the “visibility settings” of Google Book Search are set to 100 percent. The books are not placed in the AUP repository.
As all titles are accessible through the Google Book Search program, the interest of readers can be measured with the Book Search usage statistics. Here, “Book visits” (the number of times the web page of the book is accessed) and “page views” (the number of pages opened) are the statistics used. Geographical usage data was also available in the Google Book Search program. For each title, a monthly report was downloaded, containing the percentages per country. The statistics per country are calculated by applying the percentages to the absolute number of Book visits and page views.
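The step of applying country percentages to absolute counts can be sketched as follows. This is a minimal illustration, not the procedure actually used in the experiment: the function name `apportion_by_country`, the report layout (a plain mapping from country to percentage), and the sample figures are all assumptions made for the example.

```python
def apportion_by_country(total, percent_by_country):
    """Apply per-country percentages to an absolute count.

    `percent_by_country` maps a country name to its share (0-100) of `total`.
    Returns the estimated absolute count per country as unrounded floats.
    """
    return {country: total * pct / 100.0
            for country, pct in percent_by_country.items()}


# Hypothetical monthly report for one title: 200 Book visits in total.
monthly_shares = {"Netherlands": 40.0, "India": 35.0, "United States": 25.0}
estimated_visits = apportion_by_country(200, monthly_shares)
```

Summing such monthly estimates per country over the nine months would give the per-title totals used in the analysis.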
When the statistics from each country per title are known, the percentage of usage coming from developing countries is calculated. For instance, the title The Making and Unmaking of an Industrial Working Class: Sliding down the Labour Hierarchy in Ahmedabad, India was accessed online from 40 different countries between April and December 2009. This resulted in 418 Book visits and 5,115 page views. The number of Book visits from developing countries was 208, and the number of page views from developing countries was 2,737. So, for this title the percentage measured for Book visits is 49.8 and the percentage for page views is 53.5. It may not come as a surprise that traffic from India explains the high percentages for this particular title. As I will explain later, this percentage is an exception: for most titles there is a large gap between usage by developed countries and developing countries.
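The arithmetic behind these two percentages can be checked with a short sketch. The function name `developing_share` is hypothetical; the input figures are the ones quoted in the text for this title.

```python
def developing_share(developing_count, total_count):
    """Percentage of a title's traffic that comes from developing countries."""
    if total_count <= 0:
        raise ValueError("total_count must be positive")
    return 100.0 * developing_count / total_count


# Figures quoted in the text for the Ahmedabad title (April-December 2009).
book_visit_share = round(developing_share(208, 418), 1)
page_view_share = round(developing_share(2737, 5115), 1)
```

Both values agree with the percentages reported for this title (49.8 and 53.5).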
Furthermore, this paper will not discuss in detail the differences in discovery and online usage from individual countries. For that, several variables that are not part of the data must be examined. One of the criteria is the role of English within the academic communities. While English is widely used, it may not be the preferred language in all communities. The data set does not contain books in French, Spanish, Portuguese, or Mandarin, to name a few languages. Another criterion to examine is the subject of the books. While the examined books cover a wide range of subjects, it may be possible that certain research communities would not be as interested in the provided titles as others. As described below, subject is one of the elements used in the selection of titles.
Selection of titles and removal of bias
In April 2009, 893 titles were available at AUP. Using commercial availability, imprint, publication date, and series this list was reduced to 412 ISBNs. Of this list, 22 titles were published both as a hardback and a paperback book. As this distinction is irrelevant in the digital domain, 11 ISBNs were removed from the list. Then the oldest title—published in 1994—was removed, resulting in a list of 400 titles.
Considerable effort has been put into the removal of bias. This experiment operates using four sets of books; therefore these sets must be as equal as possible. Each of the 400 books is compared using the following criteria: subject, type of work, language, expected sales, and publication date.
In the database of AUP each title is assigned several subject codes describing the content. For the sake of the experiment, all titles from a “subject based series” were assigned the same subject codes. Furthermore, the number of subject codes was reduced in order to create relatively large groups with the same subject. The same principle was applied to the expected sales, measured by the print run. While each individual title may have a different print run—from 0 for Print on Demand titles to 6,500—an amount rounded up to the next 500 was used. This created again relatively large groups, which could be evenly divided over the four sets. Also the publication year and the language of the title were taken into account and were “spread” as evenly as possible.
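One way to picture this balancing step is the sketch below. It is an assumption-laden simplification, not the method actually used at AUP: it stratifies only on subject and rounded print run (the text also mentions type of work, language, and publication date), the field names and the `distribute` function are hypothetical, and titles are dealt out round-robin so that each stratum is spread as evenly as possible over the four sets.

```python
import math
from collections import defaultdict
from itertools import cycle


def round_up_to_500(print_run):
    """Round a print run up to the next multiple of 500 (0 stays 0)."""
    return math.ceil(print_run / 500) * 500


def distribute(titles, n_sets=4):
    """Spread titles over n_sets, stratum by stratum.

    `titles` is a list of dicts with the (hypothetical) keys
    'isbn', 'subject', and 'print_run'.
    """
    strata = defaultdict(list)
    for title in titles:
        key = (title["subject"], round_up_to_500(title["print_run"]))
        strata[key].append(title)

    sets = [[] for _ in range(n_sets)]
    next_set = cycle(range(n_sets))
    # Deal titles out one at a time, keeping each stratum together,
    # so neighbouring titles from the same stratum land in different sets.
    for key in sorted(strata):
        for title in strata[key]:
            sets[next(next_set)].append(title["isbn"])
    return sets


# Eight made-up titles in two strata.
sample_titles = [
    {"isbn": f"isbn-{i}", "subject": "history" if i < 4 else "film",
     "print_run": 1200 if i < 4 else 300}
    for i in range(8)
]
sample_sets = distribute(sample_titles)
```

With this toy input every set receives one title from each stratum, which is the kind of even spread the text describes.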
The 400 titles are written in three languages: Dutch (212 titles), English (180 titles), and German (8 titles). One could argue that using a large percentage of Dutch language titles favors the usage from Belgium and the Netherlands. Furthermore, German is mostly spoken in European countries. For this reason the selection is reduced to the English language titles only. Language is one of the criteria used to create equal sets, so excluding Dutch and German still leads to a balanced distribution: Set 1 contains 43 titles; Set 2 contains 49 titles; Set 3 contains 42 titles; and Set 4 contains 46 titles.
As stated before, previous research found that OA publishing enhances discovery and online usage of academic books, regardless of the dissemination channel used.  Therefore, the statistical analysis will be conducted on the average measurements of the OA channels versus the data from the closed access channel.
Research results and documenting the digital divide
A first analysis of the data clearly shows the digital divide between developed and developing countries. When looking at the discovery of books, only 30 percent of the Internet traffic comes from developing countries. This is in stark contrast with the United States, from where 19 percent of all traffic originates. Table 2 depicts the five highest ranking developed and developing countries, and Figure 1 illustrates this.
|Country||Discovery: Book visits||Percentage|
|Other developed countries||30652||21%|
|Other developing countries||20023||14%|
The same story could be told about online usage. Here, the percentages are almost equal: the developing countries account for 27 percent of the total number of page views, coming from 30 percent of all Book visits. Again, the country with the largest usage percentage is the United States with 18 percent. Also, the second and third largest portions come from the same countries: United Kingdom and the Netherlands. See Table 3 for the five highest ranking developed and developing countries and Figure 2 for more details.
|Country||Online usage: page views||Percentage|
|Other developed countries||396584||27%|
|Other developing countries||244135||16%|
The previous analysis showed the large gap between the developing and the developed countries, but the effect of OA publishing was not taken into account. The earlier experiment run on 400 titles (see Snijder) revealed that discovery and online usage of books were enhanced. When looking at the 180 English language titles, the same pattern emerges: books that are freely accessible online are found more and used more. This effect can be found in both the developed and the developing countries.
Again, the digital divide is also clearly visible. The average number of Book visits is used as measure for the discovery rate of books. When those books are published in closed access, the average rate from developing countries is 127 versus 491 in developed countries; in other words, 20.6 percent of the average Book visits come from developing countries. OA leads to higher average rates: 611 for developing countries versus 1,291 for developed countries; the developing countries are responsible for 32.1 percent.
Online usage is affected in the same way: the average number of page views of books in closed access is 1,526 for developing countries and 4,542 for developed countries; the percentage for developing countries is 25.1 percent. Making books fully accessible leads to an average of 5,357 page views from developing countries, compared to 14,278 page views in developed countries; the percentage for developing countries rises to 27.3 percent. This is illustrated in Figure 3 and Figure 4.
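The percentages in the two preceding paragraphs follow from the reported averages: the developing countries' average is divided by the sum of the two averages. The sketch below recomputes them; the function and variable names are hypothetical, and the input numbers are the averages quoted in the text.

```python
def share_of_developing(avg_developing, avg_developed):
    """Developing countries' share (in percent) of the combined average."""
    return 100.0 * avg_developing / (avg_developing + avg_developed)


# (average for developing countries, average for developed countries)
reported_averages = {
    ("Book visits", "closed access"): (127, 491),
    ("Book visits", "open access"): (611, 1291),
    ("page views", "closed access"): (1526, 4542),
    ("page views", "open access"): (5357, 14278),
}

shares = {
    key: round(share_of_developing(*pair), 1)
    for key, pair in reported_averages.items()
}
```

The recomputed shares match the reported 20.6, 32.1, 25.1, and 27.3 percent.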
The research question “does a change in accessibility of academic books have an effect on developing countries?” was translated into two hypotheses. The experiment’s data was analyzed using ANOVA (analysis of variance) in order to test the hypotheses. The results are summarized in Table 4.
|Hypothesis 1: The discovery of fully accessible titles in developing countries is significantly higher, compared to titles which are not fully accessible.||There was a significant effect of accessibility on discovery in developing countries, F(3,176) = 1.76, p < .05, one-tailed.|
|Hypothesis 2: The online usage (i.e. pages read) of fully accessible titles in developing countries is significantly higher, compared to titles which are not fully accessible.||There was a significant effect of accessibility on online usage in developing countries, F(3,176) = 1.78, p < .05, one-tailed.|
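For readers unfamiliar with the notation, the sketch below shows how a one-way ANOVA F statistic and its degrees of freedom are computed. It is a generic textbook calculation in plain Python, not a reconstruction of the author's analysis: the article does not state which software or exact model was used, and the toy data is invented. The degrees of freedom F(3,176) are consistent with four sets and 180 titles. SciPy's `scipy.stats.f_oneway` offers the same test, including a p-value, where that library is available.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    means = [sum(g) / len(g) for g in groups]

    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, means))
    # Variation of the observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, means) for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Degrees of freedom implied by the four sets of English language titles.
set_sizes = [43, 49, 42, 46]
df_between = len(set_sizes) - 1
df_within = sum(set_sizes) - len(set_sizes)

# Invented toy data, only to show the function at work.
toy_groups = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
toy_f, toy_df_between, toy_df_within = one_way_anova_f(toy_groups)
```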
Discussion of the results
Hypothesis 1 states that the discovery of fully accessible titles in developing countries is significantly higher compared to titles that are not fully accessible. Discovery was measured as the percentage of Book visits emerging from developing countries that a title received in the Google Book Search program during the experimentation period. The results of the experiment confirmed the hypothesis, which strengthens the claims of the advocates of OA: access to researchers from the developing countries is improved.
The results are also in line with predictions from the library and information sciences and the field of e-commerce. In the library and information sciences, accessibility to scientific output is linked to research impact. This is discussed by Harnad et al. When barriers are removed, the output (in this case, academic books) is used to maximum effect. The field of e-commerce uses a concept called search costs, which acts as a barrier to transactions. For a more comprehensive discussion of this concept, see Bakos and Granados, Gupta, and Kauffman. Making the complete content of a publication available should lower the search costs considerably, especially if search engines have complete access as well, which may lead to easier discovery of the book. Here, the transaction is acquiring an academic book. Publishing in OA lowers those barriers, which indeed has positive effects.
Book visits are used as an approximation to discovery: it was not possible to measure if a Book visit occurred by a “new” reader or by a “returning” reader. Therefore we cannot state that 204 Book visits from developing countries are equal to 204 new readers of that title. If we assume that a percentage of those Book visits are made by returning readers, the differences in Book visits between titles published in closed access and titles in OA still convey relevant information on the discovery rate. Further research is needed to measure the percentage of new versus returning readers and whether accessibility influences this.
Hypothesis 2 states that the online usage (i.e., pages read) of fully accessible titles in developing countries is significantly higher, compared to titles that are not fully accessible. The results of the experiment confirmed the hypothesis, which is—again—in line with expectations. Online usage is of course closely linked to the amount of information that is directly available. It should therefore not come as a surprise that making a book fully accessible online leads to more pages read. It is interesting to note, however, that the average number of pages read in developed countries is much higher than in developing countries. Presumably the differences in infrastructure play an important role here.
The two confirmed hypotheses refer to data at a high aggregation level; individual countries are not compared for reasons that were discussed earlier: the role of available languages in the data set and the diversity of subjects. While bias has been removed as much as possible, the data set may be relatively small. One could consider the contents of the OAPEN Library, containing hundreds of titles published by dozens of different publishers. However, this collection does lack a control group, making it harder to draw conclusions based on its performance. On the other hand, research on freely accessible books by Hilton and Wiley was done on 41 titles, of which 7 were nonfiction books. 
In the introduction the technical and cultural barriers to the use of OA were discussed. Online access to information resources does require an infrastructure that supports it. Furthermore, lack of knowledge or cultural biases may impede the usage of OA. Because of the way this experiment was set up, these factors do not play a significant role. The technical requirements for finding and using the titles from both the freely accessible group and the control group were exactly the same: all were available through the same dissemination channel, the Google Book Search program. Also, all titles were available in the same time period. The nontechnical barriers may have played a role, but if that was the case their influence would be the same on all books. As we have seen in the discussion of the selection of the titles, much effort was invested in the removal of bias. Therefore, the group of openly accessible books is balanced with the control group.
Research on the effects of free online accessibility of books is scarce, especially the effects on academic books. As OA is gaining momentum as a dissemination model—see for instance the briefing paper of Knowledge Exchange—there is greater need for knowledge of the effects it has on all stakeholders, both in developing and in developed countries.  The findings of this article reaffirm the notion that removing barriers to access has positive effects on discovery and online usage of academic books. This is beneficial for researchers from both developing and developed countries, and it does indicate that OA makes it possible to “[...] share the learning of the rich with the poor and the poor with the rich, [making] this literature as useful as it can be [...].” 
Furthermore, the data used reflects the situation of 2009. As described by UNESCO, several developing countries are investing heavily in Research and Development.  This will impact the discovery and online usage of academic books, and it will be interesting to see if the digital divide becomes smaller in the next few years.
About Ronald Snijder
Ronald Snijder joined AUP in 2007, where he is responsible for developing digital publications, combined with IT management. He is also technical coordinator at the OAPEN Foundation. Before that, he worked in several profit and not-for-profit organizations as an IT and information management specialist. Follow him on Twitter @ronaldsnijder.
A. Ahmed. “Open Access towards Bridging the Digital Divide—Policies and Strategies for Developing Countries.” Information Technology for Development 13, no. 4 (2007): 337–361. http://dx.doi.org/10.1002/itdj.20067
F. Salager-Meyer. “Scientific Publishing in Developing Countries: Challenges for the Future.” Journal of English for Academic Purposes 7, no. 2 (April 2008): 121–132. http://dx.doi.org/10.1016/j.jeap.2008.03.009
J. Papin-Ramcharan and R. A. Dawe. “The Other Side of the Coin for Open Access Publishing—A Developing Country View.” Libri 56, no. 1 (2006): 16–27. http://dx.doi.org/10.1515/LIBR.2006.16
L. Chan and S. Costa. “Participation in the Global Knowledge Commons: Challenges and Opportunities for Research Dissemination in Developing Countries.” New Library World 106, no. 1210/1211 (2005): 141–163. http://dx.doi.org/10.1108/03074800510587354
S. B. Ghosh and A. Kumar Das. “Open Access and Institutional Repositories—A Developing Country Perspective: A Case Study of India.” IFLA Journal 33, no. 3 (October 2007): 229–250. http://dx.doi.org/10.1177/0340035207083304
S. R. Walker. “Bioline International: A Case Study in Open Access and Its Usage for Enhancement of Research Distribution for Scientific Research from Developing Countries.” OCLC Systems & Services 25, no. 2 (2009): 125–134. http://dx.doi.org/10.1108/10650750910961929
R. Snijder. “The Profits of Free Books: An Experiment to Measure the Impact of Open Access Publishing.” Learned Publishing 23, no. 4 (October 2010): 293–301. http://dx.doi.org/10.1087/20100403
Google Books. “Reports for Previews—Books Help.” Online. Accessed 18 Apr. 2012. Available at http://support.google.com/books/direct/bin/answer.py?hl=en-GB&answer=106172.
S. Harnad et al. “The Access/Impact Problem and the Green and Gold Roads to Open Access.” Serials Review 30, no. 4 (2004): 310–314. http://dx.doi.org/10.1016/j.serrev.2004.09.013
S. Harnad et al. “The Access/Impact Problem and the Green and Gold Roads to Open Access: An Update.” Serials Review 34, no. 1 (2008): 36–40. http://dx.doi.org/10.1016/j.serrev.2007.12.005
J. Y. Bakos. “A Strategic Analysis of Electronic Marketplaces.” Management Information Systems Quarterly 15, no. 3 (September 1991): 295–310. http://dx.doi.org/10.2307/249641
N. F. Granados, A. Gupta, and R. J. Kauffman. “The Impact of IT on Market Information and Transparency: A Unified Theoretical Framework.” Journal of the Association for Information Systems 7, no. 3 (2006): 148–178.