Author: Michael Greenhalgh
Title: Learning Art History in Context: An Image Database & VRML Model of Borobudur
Publication info: Ann Arbor, MI: MPublishing, University of Michigan Library
April 2000

This work is protected by copyright and may be linked to without seeking permission. Permission must be received for subsequent distribution in print or electronically. Please contact for more information.

Source: Learning Art History in Context: An Image Database & VRML Model of Borobudur
Michael Greenhalgh

vol. 3, no. 1, April 2000
Article Type: Article

Learning Art History in Context: An Image Database & VRML Model of Borobudur

Michael Greenhalgh

The paper discusses some of the new ways of learning Art History made possible by image-capable desktop computers, the Web, and the networks. Focussing on the VRML model of Borobudur prepared by Dr Ajay Limaye and the author, it describes ways of constructing panoramic images or models of sites and museums, and the advantages and drawbacks of the available software, as well as of handling large image databases and using them semi-automatically to populate HTML text pages. The conclusion is that such "wrap-around" technologies may be effectively used in lectures when we can get away from the projected-image and computer-monitor paradigm, replacing them.

01. Introduction

Learning the history of art or architecture has generally meant sitting in a darkened room looking at slides, or looking at pictures and reading the accompanying text in a book. Sometimes video can be used to explore the site or gallery where the artworks are to be found; but in all cases the student has little control over one or more of the timing, nature, content, direction or pace of learning.

One of the tentative promises proffered by digital multimedia is a new kind of learning in which the student is in control; in its extreme form, this sees the end of lectures, and the lecturer as some kind of animated signpost, simply indicating to the student the paths to follow. (Of course, this is to misunderstand the purpose and range of lectures.)

With the development of the web, and increasing access to hardware and software for digitizing and managing image collections, digital resources are becoming widely available. One opportunity provided by a variety of software is the visualization of contexts - of collections of materials (images, text, panoramas, with hyperlinks as appropriate) - which build in the computer a three-dimensional representation of objects or locations, from individual buildings or gallery interiors, to whole cities.

Requiring considerable work to meld into a coherent presentation, such a collection of visual resources, incorporating not only an image database but also a computer model, would seem to have advantages for learning: available 24 hours a day over the web or on CDROM, capable of modification and extension as resources and technologies permit. The Borobudur Project was begun to test the feasibility of developing such "immersive" web-based technology for learning art history. This paper describes the rationale behind the project, how it was constructed, its features and drawbacks, and concludes with an assessment of the use of such image-and-VRML projects for teaching and learning.

02. Why Borobudur?

Borobudur is a large square-based stupa 40km to the north west of Yogyakarta, in a volcanic region on the Indonesian island of Java. Erected in the late 8th or early 9th century, presumably by the kings of central Java, this Buddhist monument (which was surely built to contain relics of the Buddha) was probably abandoned within not much more than a century after construction when the power-base moved to east Java. There is no foundation inscription, no way of dating beyond the palaeography of the workers' inscriptions, and no later mention of the sanctuary until 1709 AD. In Europe, no such sculptural complexes had been seen since well before the fall of the Roman Empire; and none would be seen until more than 100 years after its abandonment. In the region, it ranks with much larger complexes at Pagan (Burma) and Angkor (Cambodia).

One of the great monuments of Buddhist sculpture, the mere figures for Borobudur are stupendous, and make the effort involved in sculpture for a run-of-the-mill western cathedral look relatively puny:

  • stone embankment covering the basement: 11,600 cubic metres;
  • 1,460 narrative panels covering 1,900 square metres;
  • 1,212 decorative panels covering 600 square metres;
  • 100 monumental gargoyles to carry away the rainwater;
  • 432 Buddha images displayed from the galleries;
  • 72 Buddhas displayed in stupas on the great terrace;
  • 1,472 stupa-shaped ornaments.

Borobudur is easily visited from Yogyakarta, but these figures indicate the difficulty of comprehending all of it during a visit to what is now visible of the basements, to the four galleries, or to the stupa terraces at the summit. Hence the Borobudur Project aims to make it better known to students around the world by providing images of the sculptures and reliefs, and a VRML model, all available across the web at

As well as size, there are other reasons for choosing Borobudur as a project:

More is visible on the VRML model than can be seen on site.

If the quality and importance of the Borobudur sculpture are world-class, for the sheer abundance and beauty of its figured reliefs, decorated panels and sculptures, including the scenes from everyday life on the (largely) hidden base:


The same cannot be said for its engineering: within a few years, the whole stupa was sliding down to the ground, making what was no doubt intended as a tall and slender stupa into a squat one; drastic measures were taken, including the addition of a stone girdle around the base which, apart from a small corner, obscures a complete suite of reliefs even today.

Availability of a high-quality photographic survey.

Restorations of the monument during the 20th century also produced a full suite of high-quality, published photographs of all the reliefs, scrubbed clean for the purpose. This survey, together with the full suite of photos of the Hidden Base, not to mention the very quantity of reliefs on the monument, provides a good target for VRML and the HTML extensions it provides. For example, whilst no computer simulation can substitute for a visit to the monument itself, our VRML model provides an opportunity to examine the whole monument, or any of its details, at leisure, floating above the stupas if one so desires:


Moreover, because of the dour mid-dark-grey of its stone (originally all the reliefs might have been stuccoed and coloured), Borobudur is one of the few monuments that may reasonably be "rebuilt" using greyscale photographs. (To the visitor today, the reliefs do appear coloured - but it is the climate which ensures that mosses and lichens "colour" and also part-obscure the reliefs.)

The mission of the institution in which I work.

Part of the mission of the Faculty of Arts at the Australian National University is to develop ways of using the latest technology as an aid in teaching and learning ... range of methods for interactive learning (Strategic Plan II, 125-7). My web-based work in digital imaging began when my server - ArtServe - went live on 4 January 1994, with 2,000 images. Hence the same document notes the same requirement for the ANU as a whole (ibid., 273), and emphasizes (274) the role of electronic publishing: The ArtServe W3 server has won international acclaim for making thousands of images available across the Internet and for pioneering the delivery of visual learning material to students (ibid., 122). ArtServe now contains over 120,000 images, and receives an average approaching 500,000 "hits" per week. It forms an important platform for my teaching, all the lecturing elements of which are done using digital images over the web or from a CDROM. The Borobudur Project features in two units in the current semester.

03. Building the Virtual Visit and Multimedia Guide

The first task was to establish an image database of the sculptures, reliefs and views of Borobudur, and to present them as an indexed sequence of HTML pages, with thumbnails, record fields and hotlinks to the larger images:


We then produced a large (1.3Mb) CAD (Computer-Assisted Drawing) version of the stupa, with measurements, and used this as the basis for the construction (by Dr. Ajay Limaye, of our Supercomputer Facility Visualization Laboratory) of a VRML model.

The Virtual Reality Modelling Language is easier for the user to manipulate than a CAD drawing; it can accept "textures" which in our case are the monument's reliefs; and can also be programmed to offer "automatic" tours around one or more of the galleries. The VRML presentation may be slowed down, stopped, reversed; the user may zoom in or out, and indeed bring up both large images of the reliefs and (where appropriate) an account of the stories they relate on a separate HTML page.

From the beginning, the aim was to make the project available over the web as well as on CDROM - but it was clear from an early stage that this VRML model was much larger than the majority to be found on the web, even as a structural skeleton without the reliefs. As a unitary model, "clothed" with its texture maps (i.e. the reliefs), it would be too big for most machines to handle - although speed and throughput were bound to improve with time. So how to modulate the enormous quantities of data involved? Each relief is available on disk at 1.6 megapixels, and even a coarse-grain VRML model of the whole stupa would take half a minute to load, even over an ethernet connection:


The answer at which we arrived was to make the model and its reliefs available section by section. Tours of the whole complex are available, but detailed tours with the reliefs are sliced into galleries, and each gallery into quarters. Each of these tours (cf. for details) is then available in three speeds and with three qualities of relief image: for fast connections, the images are of 512 pixels; for medium speed, of 256 pixels; and for a slow connection, 128 pixels. 512 pixels are not very many, but yet provide a reasonable introduction to the reliefs:


When network speeds improve, it will be a small matter to increase the speed and hence the resolution of each image to one megapixel. Again, users could choose to load just one wall of one quarter-gallery, or both upper and lower walls. The alternative of using CDROMs instead of the web is also available - although the full-size images for all the reliefs would on their own fill three CDROMs. With the conformation described above, of 128-256-512 pixels, the whole project just fits on one CDROM.
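
The slicing scheme lends itself to a short sketch. The Python below is an illustration only and not the project's own code: the names `TIERS`, `tour_id` and `all_tours`, the quarter numbering and the identifier format are all hypothetical. Only the three pixel sizes, the four galleries, the division into quarters and the choice of one wall or both come from the description above.

```python
from itertools import product

# Pixel size of each relief image per connection speed (figures from the text).
TIERS = {"fast": 512, "medium": 256, "slow": 128}

GALLERIES = (1, 2, 3, 4)            # the four galleries
QUARTERS = (1, 2, 3, 4)             # each gallery cut into quarters (numbering is hypothetical)
WALLS = ("upper", "lower", "both")  # load one wall, or both


def tour_id(gallery, quarter, wall, speed):
    """Build a hypothetical identifier for one section of a detailed tour."""
    if speed not in TIERS:
        raise ValueError(f"unknown connection speed: {speed!r}")
    if gallery not in GALLERIES or quarter not in QUARTERS or wall not in WALLS:
        raise ValueError("no such section")
    return f"g{gallery}-q{quarter}-{wall}-{TIERS[speed]}px"


def all_tours():
    """List every gallery/quarter/wall/speed combination."""
    return [
        tour_id(g, q, w, s)
        for g, q, w, s in product(GALLERIES, QUARTERS, WALLS, TIERS)
    ]
```

Under these assumptions, raising the resolution later would mean changing only the values in `TIERS`; the sections themselves stay as they are.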

Navigation around such a large project could also be a problem, so we have provided a translucent plan of the site, with a red dot which moves to indicate the exact location (left-hand image) and a second overlay which allows the user to change to another part of the stupa, and to change image resolution (right-hand image):


Finally, VRML allows another memory- and network-saving tool, namely the loading of images as the user approaches, and the dropping of frames once they are "behind" the user. This can be seen here:
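
The rule itself (load what is near and ahead, drop what is behind) can be sketched in a few lines of Python. This is a simplified illustration on a flat ground plan, not the mechanism of the Borobudur model: VRML97 provides nodes such as LOD and Inline for this kind of work, and which of them the model relies on is not stated here. The function name, the panel names and the `load_radius` parameter are assumptions made for the example.

```python
import math


def visible_panels(panels, viewer, heading_deg, load_radius):
    """Return the names of panels within load_radius that are not behind the viewer.

    panels: dict mapping a panel name to its (x, y) position on the ground plan
    viewer: (x, y) position of the user
    heading_deg: direction of travel, in degrees anticlockwise from the +x axis
    """
    hx = math.cos(math.radians(heading_deg))
    hy = math.sin(math.radians(heading_deg))
    keep = []
    for name, (px, py) in panels.items():
        dx, dy = px - viewer[0], py - viewer[1]
        near = math.hypot(dx, dy) <= load_radius
        ahead = dx * hx + dy * hy >= 0  # a negative dot product means "behind"
        if near and ahead:
            keep.append(name)
    return sorted(keep)
```

Calling the function again after each move gives the set of images to hold in memory; anything that has dropped out of the returned list can be released.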


04. Integrating Text and Image Database

One area of great interest to anyone wishing to illustrate HTML text pages is just how to integrate text and images from a database without hand-building each particular example, text, image and perhaps table-frame. One answer we have pursued is to prepare a text with "catches" recognizable to a perl script which, running through the HTML textfile, translates each catch into its equivalent caption-and-image from the database. The nature of the catches is not critical, so long as they are sufficiently unique to be written to a "control" file - a kind of menu for the perl program. The experimental results for an account of dress on the Borobudur reliefs can be seen here:


The perl script fills in what it can find, wrapping whatever database fields are required in a table, and hotspotting the thumbnail so that the user may access the large JPEG behind it. Where no reference is found, this is indicated.
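
The substitution step can be sketched as follows. The project's script is written in perl and is not reproduced here; this is a minimal Python illustration of the same idea. The catch syntax `[[IMG:key]]`, the field names and the sample record are all hypothetical, and a fixed pattern stands in for the "control" file.

```python
import html
import re

# Hypothetical catch syntax: [[IMG:key]]. The text only requires catches to be unique.
CATCH = re.compile(r"\[\[IMG:([A-Za-z0-9_-]+)\]\]")

# Stand-in for the image database: key -> record fields (sample data, not real records).
RECORDS = {
    "relief-001": {
        "caption": "Example caption",
        "thumb": "thumbs/relief-001.jpg",
        "large": "large/relief-001.jpg",
    },
}


def expand_catches(text, records):
    """Replace each catch with a table holding a linked thumbnail and its caption."""

    def render(match):
        key = match.group(1)
        rec = records.get(key)
        if rec is None:
            # Where no reference is found, say so rather than fail silently.
            return f"<em>[no image found for {html.escape(key)}]</em>"
        caption = html.escape(rec["caption"])
        return (
            "<table><tr><td>"
            f'<a href="{html.escape(rec["large"])}">'
            f'<img src="{html.escape(rec["thumb"])}" alt="{caption}"></a>'
            f"</td></tr><tr><td>{caption}</td></tr></table>"
        )

    return CATCH.sub(render, text)
```

A call such as `expand_catches(page_text, RECORDS)` returns the page with every catch replaced, so the author of the text never has to hand-build a table.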

Such a procedure - like several of the others implemented for The Borobudur Project - seems of great potential in any discipline where a dry text needs to be enlivened by images - which is, after all, one of the main reasons for using HTML and the web rather than resorting to the printed page.

05. Beyond the flat computer monitor: Borobudur and "Being There"

While there is no substitute for visiting the site - any such site - the Borobudur VRML model can approach verisimilitude more closely than just the window provided by a computer monitor, which is little advance on the Renaissance paradigm of a painting as a "window" on the world. To make such a project more lifelike, we must resort to cinema-like projection facilities. Thus the Wedge at our SuperComputer Facility Visualization Lab is a room with two large screens (each about 3 metres long by 2.2 metres high) set at right-angles, onto which are back-projected split computer images which are viewed in stereo through glasses. For the control offered by the VRML "dash board" on a flat computer monitor is substituted a Head-Mounted-Display (for turning left or right, up or down) and a hand-held wand for moving forward or drawing back. The users (the space enclosed by the screens is large enough to contain ten or so) have the impression of standing within a stupa terrace or gallery at Borobudur - and they stand still (apart from head-movements) and move the environment for themselves. That the building system for Borobudur is suitably generic is shown by the incorporation of a part of the Parthenon Frieze in the same technology used for Borobudur. Note that the images seen here will appear blurred to us, because they are intended to be viewed through stereo glasses:


Presumably back-projection, with its extravagant space requirements, will be temporary: we might look forward to TFT computer screens available off the roll, and used like wallpaper, to replace projection screens in lecture theatres and perhaps TV monitors at home.

06. Building Complicated VRML projects

Although Borobudur is an intricate model teeming with images, it is a regular structure, with much the same relationship between reliefs and backing wall in every gallery. Only the stupa terrace is a little different. Consequently, the VRML model was built by hand. Another powerful reason for hand-building was that programs capable of building such a large object from photographs were not available when we started the project in 1998. Now it is indeed possible to construct VRML models from suitable series of photographs. Several programs on the market - such as Canoma, ShapeCapture and PhotoModeler - give the user the ability to use elements in photographs as VRML textures, and to build three-dimensional models. Both Canoma and PhotoModeler require different views of the same object, and the user then indicates to the program the identical points in the images, from which the program then constructs a three-dimensional model that can be exported as a VRML file. Simple cubic structures work well, but curves/domes/profiled pediments do not. Thus Le Corbusier's Villa Savoye at Poissy could be attempted, but Baroque churches - like Borobudur with its elaborate Buddha niches - seem too intricate for speedy construction. Whether ImageModeler (released 15 February 2000) offers an advance in speed and ease of use (not to mention verisimilitude) remains to be seen. For any such program, however, large quantities of images are required; a lengthy on-the-spot photographic campaign would be required, probably with alignment "targets" fixed to the architecture; some software depends on sophisticated measurement and cognizance of camera focal lengths - and I can find no large-scale VRML models produced with any of the software named above that approaches the complexity required by the real world of architecture (curves, domes, entablatures, statues).

For sculpture, the same restrictions apply: bas-relief sculpture is susceptible to three-dimensional modelling, so long as it sticks fairly closely to the back-plane. Classical sarcophagi provide a good example of how a three-dimensional object can be constructed from two-dimensional images, because they are simply a cube in lowish relief. First, photograph the front of the sarcophagus, and then photograph the left- and the right-flanking sides:


These three photographs are then "glued" by software onto a three-dimensional box, which can be rotated and examined within the program:


The resultant "three-dimensional" sarcophagus can then be exported into VRML (Virtual Reality Modelling Language) code, allowing it to be manipulated, rotated, panned and zoomed over the web within a viewing window:


Quite impressive from a distance, zooming in to such a VRML sarcophagus demonstrates that the user is manipulating four flat, albeit textured, sides.
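
What such an export amounts to can be sketched by writing the VRML by hand. The Python below is an illustration under stated assumptions, not the output of any of the programs named above: it emits one VRML97 `IndexedFaceSet` rectangle per photographed face (front, left and right), each carrying its photograph as an `ImageTexture`. The function names, the dimensions and the image file names are hypothetical.

```python
def face_shape(image_url, corners):
    """Return VRML97 text for one rectangle textured with one photograph.

    corners: four (x, y, z) points, anticlockwise as seen from outside,
    starting at the bottom-left corner.
    """
    points = ", ".join(f"{x:g} {y:g} {z:g}" for x, y, z in corners)
    return (
        "Shape {\n"
        f'  appearance Appearance {{ texture ImageTexture {{ url "{image_url}" }} }}\n'
        "  geometry IndexedFaceSet {\n"
        f"    coord Coordinate {{ point [ {points} ] }}\n"
        "    coordIndex [ 0, 1, 2, 3, -1 ]\n"
        "    texCoord TextureCoordinate { point [ 0 0, 1 0, 1 1, 0 1 ] }\n"
        "    texCoordIndex [ 0, 1, 2, 3, -1 ]\n"
        "  }\n"
        "}\n"
    )


def sarcophagus_vrml(width, height, depth, front, left, right):
    """Return a VRML97 file: a box centred on the origin with three textured faces."""
    w, h, d = width / 2, height / 2, depth / 2
    faces = [
        (front, [(-w, -h, d), (w, -h, d), (w, h, d), (-w, h, d)]),
        (right, [(w, -h, d), (w, -h, -d), (w, h, -d), (w, h, d)]),
        (left, [(-w, -h, -d), (-w, -h, d), (-w, h, d), (-w, h, -d)]),
    ]
    return "#VRML V2.0 utf8\n" + "".join(face_shape(url, pts) for url, pts in faces)
```

The sketch makes the limitation plain: each face is a single flat rectangle, so however detailed the photograph, zooming in reveals no depth in the relief.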

In those cases where the sarcophagus has a back as well as two sides (such as the huge porphyry sarcophagi in the Vatican), then this technique can be very effective. But the more elaborate the sarcophagus - say, the Ludovisi Sarcophagus in Palazzo Altemps, Rome - the less successful such "flat-face" technologies can be. For three-dimensional sculpture - for example, Bernini's Baldacchino in S. Peter's, or his Cathedra Petri - no photo-to-VRML software will work as yet.

07. Museums, Panoramas and Virtual Space

Given that museums are by definition artificial environments of works extracted (for whatever reason) from their context, it might follow that they should become enthusiastic users of visualisation techniques. Indeed, their problems are exacerbated - and the need for computer visualization enhanced - by the fact that objects from one context can be spread around the world (e.g. the Parthenon frieze, the Pergamum Altar, several of the great collections of artworks formed in the Renaissance, or any loan exhibition anywhere). If VRML can be too labour-intensive and time-consuming for many applications, then technologies such as QuickTime and related panorama- and wide-angle-producing software can help. Computer panoramas are the direct descendant of full-size panoramic viewing galleries popular all over Europe from the Enlightenment onwards, such as the Eidophusikon - but with the distinction that they can be multi-layered, with hotspots which allow movement at the behest of the user, and the display of sets of images or text-pages "behind" the hotspots.

Panoramas may be "continuous", through 360 degrees, or just some segment of wrap-around space. As with VRML, the image can be panned and zoomed, and hotspots inserted which move the user to separate HTML links or pages. Very high-quality zooming is possible, but the technology entails the intricacy of fitting panels representing high-quality images within the panorama itself, thereby effectively offering a panorama with two different resolutions - a low one for the panorama itself, and a higher one for the artworks. The disadvantage (beyond the fact that this works well only for rectangular objects) is the amount of computer memory etc required to load and manipulate the panorama, but the finished effect is a good one. The user can approach the target object, and simply keep on zooming. In the example below, from the Dobell Exhibition at the Drill Hall Gallery of the Australian National University (September 1999), the large painting to the left of the first image is approached in the central image, then zoomed into in the right-hand image. Only at a relatively high-resolution does the zoomed image begin to break up - whilst the frame of the painting (at the lower, "panorama" resolution) would be very pixellated:


Of course, making such a digital record of the Drill Hall (or any gallery or museum) suggests that the space provided is somehow ideal: we record the space, when we should perhaps be concentrating on the works displayed. Assuming no display space is ideal, and that every exhibition requires different display parameters, it may be argued that the construction of a purely digital space can provide a better record of a physical exhibition than the collection of walls, windows, glare, shadows, glass and shiny floors that bedevil accurate photography in many museum spaces.

08. The Virtual Museum as the Ideal Museum

Given the difficulties faced by many physical museums (lack of display space; 90% of items in store; distant from the majority of any persons who might want to visit; with only some of the items available that are needed to build the "context" of an art work), perhaps what we need for the Web is an Ideal Digital Gallery - a piece of purely digital real estate. This would be available 24 hours per day, with all information downloadable, with no glare, great light on all objects, catalogue immediately available, and no other visitors getting in the way of our viewing! Such a gallery could be of any scale, its walls and spaces of any colour or shape, with each curator able to call up ideal conditions suited to the different kinds of works to be displayed. "Hanging the exhibition" - i.e. positioning all the materials that make up the display - would be done at the computer screen, preferably using a Web browser, as would sizing the exhibition space, painting the walls, and priming the inevitable Virtual Coffee Machine.

Given limited resources, should we have virtual museums instead of physical museums, converting the latter to storerooms for specialists? For museums and galleries to argue against the bones of such a proposal might be seen by some as sheer luddism - that is, as a desire to retain "physical" work in the face of yet more mechanisation. The Luddites correctly perceived a threat to their livelihoods when they saw the spread of farm machinery; and museums and galleries might be advised to assess the threat or the opportunities provided by the web to avoid being ploughed under in their turn. To promote virtual museums is not, of course, an attack on visiting real sites - on cultural tourism, one of the treasured institutions of our culture, but rather the promotion of ways of reintegrating works in their context, bypassing the piecemeal approach perforce found in museum displays. Of course, seeing the actual objects cannot be beaten, but we should acknowledge the pressures militating against both visiting sites and viewing travelling exhibitions (the two sides of the same mirror, as it were). Some sites are not easy to get to; others (such as the Acropolis at Athens) are so heavily tramped that sections are roped off. Insurance and conservation problems sometimes militate against travelling exhibitions of great works today, as fewer borrowing institutions can afford to insure, and fewer lending institutions want gaping holes amongst their treasures. Only the blockbuster, with the look, feel and catchment of Disneyland, survives. The ideal digital exhibition, on the other hand, can offer all the works required to everyone, without cheapening the presentation into a fairyland.

09. Conclusion: Where do we go from here?

In a paper on the Catal Huyuk project, Dr Ian Hodder hails the project's Web Site as by far the most effective way of transmitting information about the archaeological research, suggesting also that the potential for participation is such that it does become possible to talk of the erosion of hierarchical systems of archaeological knowledge and the emergence of a different model based on networks and flows.

This paradigm transfers effortlessly to the realm of Art History, museums and art galleries, with the fleeting nature of temporary exhibitions being the equivalent of distant archaeological sites, difficult of access even if still exposed to view. Dr Hodder continues: Visual representation plays an important role in the construction of archaeological reality. It is often the primary means by which a site is presented to the world beyond the site itself. Museums, popular books and magazines, as well as specialist archaeological texts and journals all depend on visual images of sites to present them to their audiences.

In the same way, the disciplines of Art History and Museology must address the growing importance and flexibility of digital multimedia as aids in teaching, learning and presentation if they are to get away from the paradigm imposed on the one hand by the restrictions of traditional "lantern slide" technology and on the other by the cultural baggage of traditional perceptions of what museums are and should be. Hodder's different model is one which is more dynamic (because it can grow), and more congenial because it recognises the past as something that must be constructed, and that may best be constructed in the ideal - namely as virtual reality. In other words, there is little reason why we should stick with 35mm slides which, after all, have had a relatively short life (from the early 1960s). I already lecture projecting digital images from my website onto a screen in the lecture theatre; and there seems little reason why The Wedge as discussed above should not make its appearance in the lecture theatre, to add verisimilitude to learning.

In our new virtual world, what place will museums and galleries play? Are they simply storehouses for imperial or nationalist trophies? Or are they able to adapt and somehow integrate their objects with that virtual world? And if "physical tourism" continues to increase, the same question will be posed about important assets such as World Heritage sites for which UNESCO's aim is protecting natural and cultural properties of outstanding universal value against the threat of damage in a rapidly developing world. VRML and panoramas are but two of the developing technologies that can help us protect what we have, and it seems likely that others will appear, turning the web from a narrow "window on the world" into a vehicle for more immersive and user-controlled experiences, no matter what the subject-matter.

Michael Greenhalgh

The Sir William Dobell Professor of Art History
The Australian National University