Author: Martine Cocaud
Title: From the Text to the Data Base: the Libriciel as a Tool for Producing and Diffusing Scientific Texts. A Case Study of Breton Hagiography
Publication Info: Ann Arbor, MI: MPublishing, University of Michigan Library
April 2001

This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact for more information.

Issue: The editorial chain that runs from the writing of a text to its distribution and reading has been significantly affected by the development of electronic communication. The "libriciel" (a combination of book and software), presented here as applied to Breton hagiography, is a new technique based on the automatic transformation of the "tapuscrit" (word-processed document) into hypertext and into a database that can be distributed on the Internet. The aim is to preserve the traditional methods of writing texts while gaining the advantages of electronic distribution.

.01. Introduction

Hagiography remains a subject of lasting interest in Brittany, both for researchers and for a wide public curious about Breton culture. Researchers who specialize in the cultural history of Brittany are often asked about the origin of a Breton first name or of a place name noticed during a walk. Many works on the saints of Brittany exist, because the topic attracted much attention during the nineteenth century, but they are difficult to find and not always reliable. None of these editions gives all the information about the cults of the saints: icons, inscriptions, toponymic evidence, etc.

Recognizing both the curiosity of the public and the difficulty of giving good answers, the Cultural Institute of Brittany (Institut Culturel de Bretagne, ICB) encouraged the creation of a database about Breton saints. The project requires research in various sources: works of the scholars of the nineteenth and twentieth centuries, books and traditional customs, dictionaries of religious history, iconographic collections, and so on. In the end, more than 500 saints should be covered. Many specialists in the different periods of Breton cultural history will be involved in this project.

Bernard Merdrignac, professor of medieval history at Rennes II University and member of the CRHISCO, is in charge of the project. He supervises the many students who carry out the research and verify the data. First, the students have to create a file on each saint (30,000 to 40,000 characters per file). At present we have more than a hundred files. This is good training for the students.

However, the aim of this project is not to write a new book on an already well-covered topic but to give people scientific data that must be updated frequently (mise à jour). This requirement is hardly compatible with a paper edition. We have therefore developed a new concept, "le libriciel," which is at the same time a book and a software program. It can be consulted on the web.

Here is the process of "le libriciel":

  • We use a word processor (WORD) to enter data. The formatting instructions are extremely precise; we use WORD styles.
  • The text is "marked" as it would be to create index entries. The indexed terms are used to retrieve information from the database through the web.
  • Programs translate the WORD text into hypertext and into a documentary database (WAIS database). Both are loaded on the web server.

Users can query the database on several topics, and the matching texts from the files are displayed. This method provides two benefits: users can consult the database from anywhere, and the data are easy to update (if we change the WORD file, we just have to reload the database). In this paper we explain how the database was created.

.02. How to Produce the Data

1) Standardization:

First, we have to standardize the files with precise rules, because many people are involved in producing the database. The data-entry instructions cover:

  • The list of authorized abbreviations.
  • The spelling of place names (we use the INSEE nomenclature). We have chosen the following order: name of the "commune," then name and registration number of the "regional department." For foreign countries we use: district, region, country.
  • Presentation of dates.
  • Presentation of the bibliography.
  • Presentation of quotations.

We also specify the various headings of the files. The result is a WORD template, which is given to the students at the beginning of their training. The students fill in the template and save it in RTF format. This is what they send through the web.

2) Defining topics:

The criteria used for queries must be defined before the index is created: the text must be "marked" and associated with the criteria (cf. appendix). They are difficult to define because this database is intended both for advanced researchers and for the general public: too many criteria make the database difficult to use, and too few limit its scope. Sixteen criteria have been defined for the moment. Later we will define a "quick search" mode and an "in-depth search" mode for more advanced researchers.

.03. The "Libriciel"

As its name indicates, it is at the same time a book and a software program. The text is essential. After verification, the student's document is loaded in full on the server. Any user can consult it in its entirety, using the query tools to obtain the information.

After writing a document on a particular saint, the student indexes it using the WORD indexing technique (cf. example 1). It is the longest step in the process, because each term must be indexed separately. Then a macro (a small program made up of WORD instructions) automatically selects the indexed terms and cuts the text into sections, each section corresponding to a particular saint. The macro calculates the address where the section is located on the server. The address and the indexed terms are saved in an index file (see example 2). The saint file is converted to HTML format by the RTFTOHTML program.
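
The macro step described above (cut the text into sections, collect the indexed terms, compute an address, write an index file) can be sketched in a few lines. This is only a minimal Python illustration under stated assumptions, not the WORD macro used in the project: the `## ` section delimiter, the inline `{{field=term}}` marker syntax, the way the address is built, and all function names are hypothetical. Only the `field=term1;term2` output shape follows the index entries shown in the text.

```python
import re

# Hypothetical conventions, not taken from the project:
# a section starts with a line "## <saint name>", and an indexed
# term is written inline as {{field=term}}.
SECTION_RE = re.compile(r"^## (.+)$", re.MULTILINE)
MARK_RE = re.compile(r"\{\{(\w+)=([^}]+)\}\}")


def split_sections(text):
    """Cut the text into (saint_name, body) pairs, one per saint."""
    matches = list(SECTION_RE.finditer(text))
    sections = []
    for i, m in enumerate(matches):
        end = matches[i + 1].start() if i + 1 < len(matches) else len(text)
        sections.append((m.group(1).strip(), text[m.end():end]))
    return sections


def build_index(text, base_url):
    """Return one record per section: its address plus its indexed terms."""
    records = []
    for name, body in split_sections(text):
        fields = {}
        for field, term in MARK_RE.findall(body):
            fields.setdefault(field, []).append(term.strip())
        # Assumed addressing scheme: page URL plus a fragment per saint.
        address = f"{base_url}#{name.lower()}"
        records.append({"address": address, "fields": fields})
    return records


def format_record(record):
    """Render a record as an address line followed by field=term1;term2 lines."""
    lines = [record["address"]]
    for field, terms in record["fields"].items():
        lines.append(f"{field}={';'.join(terms)}")
    return "\n".join(lines)
```

Calling `build_index(document_text, "http://example.org/saints.html")` on a marked document and passing each record to `format_record` would yield one block of index lines per saint; the URL here is a placeholder.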



fête=6 décembre

qualité=Moine missionnaire;archevêque




m_lit=chronique de Dol;Bréviaire de Dol de 1519;Supplementum missalis ad normam;psautier du XVè siècle de l'abbaye de Saint-Jacut-de-l'Isle;Missel du vice-chancelier Ynisan;Cartulaire de Quimperlé;Missel de Vannes;Bréviaire imprimé de Léon de 1516;Bréviaire de Saint-Brieuc de 1548

patron=pêcheurs d'épaves;pilleurs de côtes

reli=Saint-Barthélémy de Paris;Brech;Plourin

typ_ico=statues;sculptures;vitraux;fresques et peintures;orfèvrerie

Note that the address (URL) of the Budoc section is followed by the index entry "noms," which gathers all the variants of the first name BUDOC; next comes the entry "fête," and so on. All the italicized terms are index entries. They are used to consult the database.
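
The index entries shown above follow a simple pattern: a field name, an equals sign, and one or more values separated by semicolons. As an illustration, such lines could be read back into a structure as sketched below. The function name is hypothetical, and the sketch assumes one entry per line; it is not the project's Perl code.

```python
def parse_index_entries(text):
    """Parse lines of the form field=value1;value2 into a dict of lists."""
    entries = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or "=" not in line:
            continue  # skip blank lines and anything that is not an entry
        field, _, values = line.partition("=")
        entries[field.strip()] = [v.strip() for v in values.split(";") if v.strip()]
    return entries


# Sample lines copied from the index example in the text.
sample = """fête=6 décembre

qualité=Moine missionnaire;archevêque
patron=pêcheurs d'épaves;pilleurs de côtes
"""
entries = parse_index_entries(sample)
```

Here `entries["qualité"]` holds the two values `Moine missionnaire` and `archevêque`, each of which can then serve as a search term.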

.04. Construction of a WAIS Database

The WAIS database is created from the index file constructed previously. The freeWAIS-sf software, developed at the University of Dortmund, is a client/server tool for indexing and searching information. It allows the creation of textual databases. Users can later search this database by "fields," because the software produces "inverted files" (fichiers inverses). The entire data set (notes and index files) becomes a WAIS database. From the whole of the data (the HTML notes and the index file), programs written in Perl generate the intermediate files that freeWAIS-sf requires. Moreover, the Perl programs generate the search interface, taking into account the index extracted from the WORD text.
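
The idea behind an inverted file can be shown with a small sketch: instead of listing the terms found in each document, it lists, for each term of each field, the documents that contain it, which is what makes searching by field fast. The Python below is only an illustration of that principle. It does not reproduce the file formats or the behaviour of the WAIS software, and the addresses and function name are hypothetical.

```python
from collections import defaultdict


def build_inverted_file(records):
    """Map each (field, term) pair to the sorted addresses whose record contains it.

    `records` maps an address to a dict of field -> list of terms.
    Terms are lowercased so that lookups are case-insensitive.
    """
    inverted = defaultdict(set)
    for address, fields in records.items():
        for field, terms in fields.items():
            for term in terms:
                inverted[(field, term.lower())].add(address)
    return {key: sorted(addresses) for key, addresses in inverted.items()}


# Hypothetical addresses; the field names and terms come from the
# index example in the text. The second record is invented for contrast.
records = {
    "saints.html#budoc": {"reli": ["Brech", "Plourin"], "typ_ico": ["statues", "vitraux"]},
    "saints.html#other": {"typ_ico": ["statues"]},
}
inverted = build_inverted_file(records)
```

Looking up `inverted[("typ_ico", "statues")]` returns both addresses, while `inverted[("reli", "plourin")]` returns only the Budoc section.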

.05. Tools for Consulting the Database

The last element of the application is a module for querying the database, which can be consulted via the Internet. To consult it, the user can type the name of a saint, a place, or a specialty. It is also possible to choose from a list: a list of saints' names, for example, or a list of locations, or a list of saints' specialties, etc. The choice is then selected with a click of the mouse. In answer to a query, the user always obtains the text written by the students, in the form of hypertext. Authorized persons can also update the database via the Internet.
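
The two ways of querying described above, typing a term or choosing from a list, can be sketched as follows. This is a hedged Python illustration, not the project's query module: the record layout, the addresses, and the function names `search` and `choices` are assumptions, and matching is simplified to case-insensitive equality.

```python
def search(records, term, field=None):
    """Return the addresses of records in which `term` matches an indexed term.

    Matching is case-insensitive; if `field` is None, every field is searched.
    """
    wanted = term.strip().lower()
    hits = []
    for address, fields in records.items():
        for name, terms in fields.items():
            if field is not None and name != field:
                continue
            if any(wanted == t.lower() for t in terms):
                hits.append(address)
                break  # one hit per record is enough
    return hits


def choices(records, field):
    """List the distinct terms of one field, as a selection list would show them."""
    return sorted({t for fields in records.values() for t in fields.get(field, [])})


# Hypothetical addresses; field names and terms are taken from the text.
records = {
    "saints.html#budoc": {"reli": ["Brech", "Plourin"]},
    "saints.html#even": {"noms": ["Even", "Ewen"]},
}
```

With these records, `search(records, "ewen")` finds the Even section through a name variant, and `choices(records, "noms")` produces the entries a list of saints' names would offer. The address returned in each case points to the full text of the section.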

.06. Conclusion

Thanks to the "libriciel," the text is the pivot of this hagiographic application. The historian thus keeps his usual working tool, the word processor, along with his usual drafting practices. Moreover, the text presented to the public is really that of the author. We are fairly satisfied with the result, but the project is four years old, and some points deserve reconsideration:

  • The technical choices:
    • the RTFTOHTML software is unstable;
    • WORD changes too quickly (the BASIC language of the macros);
    • perhaps it would be better to mark up the text with XML;
    • perhaps Oracle would be preferable?
  • The implementation of such a project exceeds the working capacity of three "teacher-researchers": we have to find a partner for the financing.



Secondly, if you "click" on Saint EVEN, you see this text:



* .forme latine : 


* formes autres :

Even, Ewen


Il est possible que plusieurs saints personnages portent ce nom.

Il a été confondu avec saint Yves lorsque que l'Église a voulu substituer le culte de ce dernier à celui de saints inconnus.



3) The Project

Martine Cocaud
Maître de conférences en Histoire Contemporaine
Université Rennes 2