BRASS: Visualizing Scores for Assisting Music Learning

Fumiko Watanabe, Rumi Hiraga†, Issei Fujishiro
Graduate School of Humanities and Sciences, Ochanomizu University
† Department of Information Sciences, Bunkyo University
E-mail: fumiko@imv.is.ocha.ac.jp, rhiraga@shonan.bunkyo.ac.jp, fuji@is.ocha.ac.jp

Abstract

We propose a system, called BRASS (BRowsing and Administration of Sound Sources), which provides an interactive digital score environment for assisting users in browsing and exploring the global structure of music in a flexible manner. When preparing a cooperative performance, it is important to learn the global structure in order to deepen understanding of the piece. The score visualization of our system can show the entire piece in a computer window, however long the piece and no matter how many parts it includes. The users can insert comments or links on this score to note down their understanding. A particular focus is placed on the conceptual design of the spatial substrate and visual properties of the environment, and on the related level-of-detail (LoD) operations and functions. A user evaluation of the prototype is also included.

1 Introduction

New technologies related to computer music make it possible not only to create new types of music by ourselves, but also to learn conventional music easily. We have been developing a system, called BRASS (BRowsing and Administration of Sound Sources), for assisting music learning. Because players and conductors usually use sound sources and scores to learn music, BRASS provides new interfaces for administration of sound sources and for browsing musical scores. The targeted users of BRASS thus include players and conductors who want to improve their understanding of a musical piece, rather than music novices. Although the system can be used by a single user such as a pianist, it will be more beneficial to users who are going to perform a musical piece together. We have focused our study on musical scores for cooperative performances, especially on score visualization for browsing (Hiraga, Watanabe, and Fujishiro 2002).

Digital scores can be edited more easily than handwritten ones, because we can handle them interactively. This advantage has encouraged the development of many music notation editors. Although such editors are now widely used, they still have some problems from the user-interface point of view. We found that users are unable to see an entire score at once with the existing software. Even with a digitized score generated by Finale, searching for specific measures is accomplished either by specifying the measure number or by scrolling through a window, whereas printed scores are searched for specific measures by trial and error. Furthermore, an orchestral full score has many parts. If all parts are displayed in a single window, the score becomes hard to read. On the other hand, if the score is magnified, it becomes impossible to grasp all parts at once. For example, consider a case in which the users want to find the portions containing a leitmotiv in the entire score of a Wagner music drama such as "Die Walküre". Since the "Walkürenritt" alone includes 22 parts and 164 measures, the entire music drama is very large. Therefore, finding such a portion is very inefficient whichever score they use. The users cannot devote their attention to music learning because the interfaces impose heavy psychological loads on them. If the users can browse a score effectively, they can easily find the portions that deserve attention and grasp many parts at once.
As we have described, score browsing is important not only for score editing but also for music learning. We currently focus on supporting score browsing for music learning in BRASS. In our system, score browsing is realized by using level-of-detail (LoD) operations. One of the features of our score browsing is that it lets more of the display resource be allotted to the portion that holds the user's attention, an approach known as focus+context (Card, Mackinlay, and Shneiderman 1999). The score visualization of our system can show the entire piece in a computer window, however long the piece and no matter how many parts it includes, as well as selected parts in detail.

In Section 2, we describe related work in the areas of music visualization and information visualization. In Section 3, we describe the conceptual design of the spatial substrate and visual properties of a score. In Section 4, we describe the LoD operations and related functions of our system. Then in Section 5, we show sample visualization results obtained with our prototype system. In Section 6, a user evaluation of the prototype is given. Finally, in Section 7, we summarize the key points and describe our future work.

2 Related Work

The interface of BRASS is related to music visualization and information visualization. With the aim of facilitating music learning, we focused on ways to visualize a score and its performance, the accessibility of information, and the usability of the system.

2.1 Music Visualization

By "music visualization" we simply mean visualizing music using computer graphics, not visualization for artistic purposes. Visualization of sound is already indispensable in research and applications such as sound analysis and synthesis. Besides showing sound as a waveform, Pickover showed an interesting and impressive symmetrical figure of sound data (Pickover 1980). Sobieczky visualized the consonance of a chord on a diagram based on a roughness curve (Sobieczky 1996). There have been some studies on music visualization, but not many. We distinguish two types of music visualization.

* Augmented score visualization
Conventional staff notation is limited because it does not necessarily represent all of a composer's expressive intentions. Oppenheim proposed a tool for representing composers' expressive intentions (Oppenheim 1992). Kunze proposed defining several figures in three dimensions for composers to use in composing their works (Kunze and Taube 1996).

* Performance visualization
Hiraga proposed using simple figures to help users analyze musical performances (Hiraga, Igarashi, and Matsuura 1996). Hiraga also proposed using Chernoff faces to visualize performance expression (Hiraga 2002). Smith proposed a mapping from MIDI parameters to 3D graphics (Smith and Williams 1997). Foote's checkerboard-type figure (Foote 1999) shows the resemblance among performed notes based on the data of a musical performance. Watanabe proposed a system with a unified 3D interactive interface for both browsing and editing sound data (Watanabe and Fujishiro 2001). Miyazaki's comp-i tool with a 3D performance visualization interface enables users to generate music using a rich set of functions (Miyazaki and Fujishiro 2002; Hiraga, Miyazaki, and Fujishiro 2002; Miyazaki, Hiraga, and Fujishiro 2003). Dixon proposed using worm-like figures to visualize the tempo and dynamics of a performance (Dixon, Goebl, and Widmer 2002).

Our system provides an interface for an augmented score (though not for composers), and is deeply related to performance visualization in terms of music learning.

2.2 Information Visualization

We benefited from research on information visualization in our efforts to visualize scores effectively. There are several ways of viewing particular data within a large quantity of data: Fisheye views (Sarkar and Brown 1992), the Perspective Wall (Mackinlay, Robertson, and Card 1991), LensBar (Masui 1998), and the Table Lens (Rao and Card 1994) generate a small display of a large structure and provide a focus+context function that displays a particular part while maintaining the relationship of the part to the whole. Our system also uses a focus+context function to enable the users to browse a score effectively.

3 Conceptual Model of Scores

In this section, we design a conceptual model of scores so as to visualize them effectively. A score is laid out in a 2D space with time and part axes, on which objects describing the music are placed. A score can thus be modeled conceptually in terms of a spatial substrate and visual properties (Card, Mackinlay, and Shneiderman 1999).

3.1 Spatial Substrate

The spatial substrate is the framework of spatial information that represents the musical structure.
When describing music, we need notes, staves, bar lines, and abbreviation marks (e.g., D.C.), which are the elements of the spatial substrate. The spatial substrate consists of the following objects:

* For specifying the pitch of a sound: staves, ledger lines, clefs.
* For specifying the rhythm: time signatures, bar lines.
* For specifying the tonality: key signatures.
* For representing the sound: notes, accidentals, rests, ornaments.
* For omission: repeats, D.C., etc.
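To make the model concrete, the following is a minimal sketch of how these substrate elements might be organized as data; the class and member names are our own illustration under simple assumptions and are not taken from the BRASS implementation.

#include <string>
#include <vector>

// Hypothetical data model for the spatial substrate (names are illustrative only).
enum class ElementKind { Clef, KeySignature, TimeSignature, Note, Rest, Accidental, Ornament, Repeat };

struct Element {
    ElementKind kind;
    int pitch = 0;        // e.g., MIDI pitch for notes; unused for other kinds
    double duration = 0;  // in beats; unused for clefs, signatures, etc.
};

struct Measure {
    std::vector<Element> elements;  // the symbols placed in this measure
};

struct Part {
    std::string instrument;         // e.g., "Clarinet in A"
    std::vector<Measure> measures;  // one entry per measure along the time axis
};

struct Score {
    std::string title, composer;
    std::vector<Part> parts;        // the part axis
};

The visual properties introduced next would then be attached to ranges of this structure (a note, a phrase, a piece) rather than stored inside it.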

3.2 Visual Properties

Although the elements of the spatial substrate specify which sounds are played and when, they do not specify how the sounds are played. Besides the elements of the spatial substrate, performance indicators are also described on a score, such as dynamics and the signs for glissando, tenuto, or fermata. We define these indicators as the visual properties. Note that the properties do not necessarily appear on the score, because they represent additional information. The properties can be classified according to the range of time and parts to which they apply, and according to whether they can be ordered, as shown in Table 1. For example, the forte sign, which indicates dynamics, is a property that applies to a phrase and a single part and can be ordered.

Table 1: Classification of the properties

(a) Classification by range of application

            | A note          | A phrase              | A piece
  A part    | Local dynamics, | Dynamics*,            | Instruments,
            | Method 1*       | Expression, Method 2  | Method 4
  All parts | Method 3        | Tempo*                | Title, Composer

  * Properties that can be ordered.

(b) Classification of the methods for playing

  Method 1: staccato, marcato, tenuto
  Method 2: glissando, arpeggio, tremolo
  Method 3: fermata
  Method 4: legato, specifications for each instrument

4 Proposed System

We visualize an entire score effectively through the LoD (level-of-detail) operations designed in accordance with the conceptual model presented in the previous section. In this section, we show the method for LoD control of a score across the time and part axes, along with some related functions.

4.1 LoD Control Across the Time-Axis

The compressed representation of the score makes it possible to show the entire score in a single window, so the users can take in the entire score at once. The substrate is compressed as follows (a code sketch of one possible mapping is given after Section 4.2):

* A staff is represented as a line with width.
* Clefs are shown at the beginning of the score.
* Symbols for rhythm and tonality are shown at the beginning and at changing points.
* Individual notes are not shown. The number of notes in a measure determines the brightness of the line. As an option, the pitch changes of the notes can be shown as a line plot.

The properties are compressed as follows:

* Dynamics are shown by the width of each line.
* Tempo is shown by the background color.
* Fermata, legato, and the other specifications are not shown in the figure. We will use particular glyphs (Keller and Keller 1992) for them. Glyphs are objects or symbols for representing data values.
* The specification of the instruments is shown at the beginning of the score.

Because the properties represent additional information, they are shown only as options in the compressed score, except for dynamics. The properties that the user needs can thus be shown selectively.

If a portion of the compressed score is selected for closer examination, the portion is magnified as an ordinary score in the same window. Since the neighbors of the selected portion carry the next most important information, they are visualized with less compression. This cornice metaphor for focus+context visualization lightens the psychological load imposed on the user by visual perception and system operation, so the users can devote their attention to the music itself.

4.2 LoD Control Across the Part-Axis

The continuity along the time-axis heightens the effect of the focus+context visualization. However, because of the discrete arrangement of the part-axis, focus+context visualization is not effective along this axis. Therefore, by default, each part is given equal space in the window. The user can then select the LoD of each part, magnifying or hiding it. This method enables the users to see only the parts they want to pay attention to.
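The following is a minimal sketch, under our own assumptions, of how the time-axis compression and focus+context magnification of Section 4.1 could be computed; the function names, the specific magnification falloff, and the 0-to-1 dynamics scale are illustrative and are not taken from the BRASS source.

#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical per-measure display attributes for the compressed score.
struct MeasureGlyph {
    double width;       // horizontal space given to the measure
    double brightness;  // 0 = light .. 1 = dark, from the number of notes
    double thickness;   // line thickness, from dynamics (pp..ff mapped to 0..1)
};

// Focus+context width: measures near the focus get more space, and the
// magnification falls off toward the context. The falloff shape is an assumption.
double measureWidth(int index, int focusIndex, double baseWidth, double focusScale) {
    double distance = std::abs(double(index - focusIndex));
    return baseWidth * (1.0 + (focusScale - 1.0) / (1.0 + distance));
}

// Build the compressed representation of one part.
std::vector<MeasureGlyph> compressPart(const std::vector<int>& notesPerMeasure,
                                       const std::vector<double>& dynamics,
                                       int focusIndex, double baseWidth,
                                       double focusScale, int maxNotesPerMeasure) {
    std::vector<MeasureGlyph> glyphs;
    for (std::size_t i = 0; i < notesPerMeasure.size(); ++i) {
        MeasureGlyph g;
        g.width = measureWidth(static_cast<int>(i), focusIndex, baseWidth, focusScale);
        g.brightness = std::min(1.0, double(notesPerMeasure[i]) / maxNotesPerMeasure);
        g.thickness = dynamics[i];  // thicker line for stronger dynamics
        glyphs.push_back(g);
    }
    return glyphs;
}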
4.3 Related Functions

We propose some related functions to further assist music learning.

Comment Insertion. The users can insert comments into the visualized score. The comments are stored in a file, so the users can exchange their comments with each other. There are two types of comments.

* Text-based comments
The user can describe local information about the piece in text. Text-based comments are shown as glyphs in the compressed score. The user can read the comments from all users on a focused portion.

* Link-based comments
The users can describe the global structure of the piece by using explicit links. The links are shown as lines in the visualized score.
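The paper states only that comments are stored in a file and exchanged among users; the record layout below is one plausible sketch of such a text format, written purely for illustration, and is not the format BRASS actually uses.

#include <ostream>
#include <string>

// Hypothetical records for the two comment types (illustrative only).
struct TextComment {
    int measure;                 // position of the comment along the time axis
    std::string part;            // e.g., "Violoncello"
    std::string author;
    std::string text;
};

struct LinkComment {
    int fromMeasure, toMeasure;  // the two portions connected by the link
    std::string author;
    std::string label;           // e.g., "first motif restated"
};

// One tab-separated line per record, so comment files from different users
// can simply be concatenated and merged.
void write(std::ostream& out, const TextComment& c) {
    out << "TEXT\t" << c.measure << '\t' << c.part << '\t'
        << c.author << '\t' << c.text << '\n';
}

void write(std::ostream& out, const LinkComment& c) {
    out << "LINK\t" << c.fromMeasure << '\t' << c.toMeasure << '\t'
        << c.author << '\t' << c.label << '\n';
}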

Playing in Synchronization with the Score. The users can play the piece from any focused portion. Playback is synchronized with the score both horizontally and vertically: while the piece is played, the focus moves along with the performance (horizontally), and the magnified parts are played at a louder volume while the hidden parts are not played (vertically).

Storing the LoD State. The system stores the LoD state, that is, the focused portion and which parts are magnified or hidden, so the user can reopen the score in its last LoD state. The users can therefore smoothly resume score reading after suspending it.

5 Example

The code is written in C++ with OpenGL. The system accepts three file formats: the ETF format of Finale for score visualization, the SMF format for playback, and its own text format for storing comments. Using a sample score and the first movement, consisting of 197 measures, of the "Clarinet Quintet in A major, K. V. 581" by W. A. Mozart as examples, we explain the functions of our system.

5.1 Visualizing a sample score

Figure 1(a) shows the original version of the sample score. Figure 1(b) shows the entire piece with compression. Since this score consists of one part, the visualization is a single line with width. The width of the line represents the dynamics: the stronger the dynamics, the thicker the line. The brightness of the line represents the number of notes in a measure: the more notes, the darker the line. Figure 1(c) shows the melodic line. Figure 1(d) shows the tempo as the background color; the background color changes from blue to red as the tempo becomes quicker. If a portion is selected for closer examination, it is magnified in the same window (Figure 1(e)). In this case, the seventh and the eighth measures are selected, and the previous and subsequent measures are magnified to a lesser degree.

[Figure 1: Example with the sample score. (a) Sample score; (b) overview; (c) overview with melodic line; (d) overview with tempo; (e) focus+context.]

5.2 Visualizing the "Clarinet Quintet" by W. A. Mozart

The user interface of BRASS provides the full score in a single window. Figure 2 shows the entire score of the movement. Some of the vertical bar lines, such as repeat signs and double bar lines, are shown in different colors in the figure since they mark the musical structure. Though it is not clear in the figure, there are repeats at the end of the exposition (the eightieth measure); they are shown in red.

The users can control the LoD either vertically or horizontally. The user can magnify or hide a part; in Figure 3(a) the clarinet is magnified. This is the vertical LoD control. If a portion is selected for closer examination, it is magnified in the same window, in which the uncompressed score for the 83rd and 84th measures appears (Figure 3(b)). The focused portion is in the development of the first movement, where the clarinetist plays the first motif from the exposition. This is the horizontal LoD control. Options such as tempo indicators (Figure 4(a)) or melodic lines (Figure 4(b)) can be shown on this score. Since no tempo change is indicated on the score, the score is painted in one color.

Figure 5(a) shows the score with inserted text comments displayed as triangles. When a position is selected, any comments at that position become readable (Figure 5(b)); here the violoncello player's comment is shown. Figure 6 shows the score with inserted links.
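As a rough illustration of the playback synchronization described above, the sketch below maps an elapsed playback time to the measure that should hold the focus, and scales each part's volume according to its LoD. The names, the constant-tempo assumption, and the specific scaling factors are ours and are not details of the BRASS implementation.

#include <algorithm>
#include <cmath>

// Hypothetical LoD states for a part (illustrative only).
enum class PartLoD { Hidden, Normal, Magnified };

// Assuming a constant tempo and a fixed number of beats per measure, find the
// measure that playback has reached, so that the focus can follow the performance.
int focusedMeasure(double elapsedSeconds, double beatsPerMinute, int beatsPerMeasure) {
    double beats = elapsedSeconds * beatsPerMinute / 60.0;
    return static_cast<int>(std::floor(beats / beatsPerMeasure));
}

// Scale a MIDI velocity (0..127) by the part's LoD: magnified parts are played
// louder, hidden parts are muted. The scaling factors are assumptions.
int scaledVelocity(int velocity, PartLoD lod) {
    switch (lod) {
        case PartLoD::Hidden:    return 0;
        case PartLoD::Magnified: return std::min(127, velocity * 3 / 2);
        default:                 return velocity;
    }
}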
6 User Evaluation

We made an initial test of our prototype system with five users, all students of Ochanomizu University. The subjects consist of two string players and three wind players, and they have been playing their instruments for four to ten years. Three of them study information sciences, so they are familiar with computers. As the example, they studied the first movement of the "Clarinet Quintet" mentioned above using BRASS. We explained to the subjects how to use the system and gave them the manual. Although the time for using the system was not restricted, the subjects actually used it for one to two hours.

[Figure 2: Example of the "Clarinet Quintet": overview.]

[Figure 3: LoD control. (a) Vertical LoD: magnification of the clarinet part; (b) horizontal LoD: focus+context (measures 83-84).]

[Figure 4: Options. (a) With tempo; (b) with melodic lines.]

[Figure 5: Text comments. (a) Overview; (b) focus+context.]

[Figure 6: Links.]

Table 2: Questions and answers (five-point scale: 5 = yes, 3 = neutral, 1 = no)

  Time-axis
    Is it easy to understand the compressed expression?                                  3.8
    Is it easier to grasp the entire music using BRASS than using usual scores?          3.8
    Is it easier to find a portion to pay attention to using BRASS than a usual score?   5.0
    Is focus+context visualization useful?                                               4.6
    Is it easy to see the dynamics displayed as a default?                               3.8
    Is it easy to imagine the tempo from the background color?                           3.6
    Is it easy to choose options?                                                        4.2
  Part-axis
    Is the LoD control across the part-axis useful?                                      4.8
    Is it easy to control the LoD with the mouse?                                        4.8
  Related functions
    Do comments and links support your music learning?                                   4.6
    Is the synchronization of playing with the score useful?                             5.0
    Is it useful to store the LoD state?                                                 5.0

Table 2 shows the questions and the average answers that the subjects gave after using BRASS. Overall, the evaluation marks were high. The results can be summarized as follows:

* "Finding a portion to pay attention to", "synchronization of playing with the score", and "storing the LoD state" were evaluated especially highly.
* The representation of tempo as the background color received a divided evaluation: some subjects found it difficult to imagine the tempo from the background color, while others easily understood tempo changes from changes in the background color.
* Choosing options is more difficult for the users who are not familiar with computers than for those who are.

We also asked the subjects which of the following the system supports most and least effectively:

1. Grasping the global structure of the piece easily.
2. Finding a portion to pay attention to easily.
3. Deepening understanding of the piece.
4. Understanding one's own or another's interpretation of the piece.

As the most effective, three subjects answered 2, one answered 1, and the other answered 3. As the least effective, one answered 3 and the rest answered 4. We consider that the effect on understanding one's own or another's interpretation was not recognized because there was no occasion to actually exchange comments over the Internet. We received the following free comments:

* A larger focus region is desirable.
* The score may become harder to read when a score with many parts is displayed.
* Graphical input of comments would be convenient.
* BRASS would also be useful for editing scores.

We will take these comments into consideration as we work to enhance the system.

7 Conclusion

We have described our prototype visualization system for learning music. The system provides the users with a new way of accessing musical information related to scores. No matter how long a musical piece is, the entire piece is shown in a window, giving the users better access to portions of interest. We plan to make the following improvements.

* It is natural to compare two or more portions, since a musical piece often involves repeated phrases. To make this possible with our system, we have to enable two or more portions to be shown at once.
* Understanding performance expression is important for learning music. We will show performance visualization on the score to help the users better understand the musical piece.
* For cooperative performers, such as the members of an orchestra, we will enable the users to exchange their opinions at any time and in any place using our system.

References

Card, S., J. Mackinlay, and B. Shneiderman (1999). Readings in Information Visualization: Using Vision to Think. Morgan Kaufmann.

Dixon, S., W. Goebl, and G. Widmer (2002, September). The Performance Worm: Real Time Visualisation of Expression Based on Langner's Tempo-Loudness Animation. In Proc. the International Computer Music Conference, pp. 361-364. International Computer Music Association.

Foote, J. (1999, October). Visualizing Music and Audio Using Self-similarity. In Proc. ACM International Conference on Multimedia (Part 1), pp. 77-80. ACM.

Hiraga, R. (2002, October-November). Case Study: A Look of Performance Expression. In Proc. IEEE Visualization 2002, pp. 501-504. IEEE.

Hiraga, R., S. Igarashi, and Y. Matsuura (1996). Visualized Music Expression in an Object-oriented Environment. In Proc. the International Computer Music Conference, pp. 483-486. International Computer Music Association.

Hiraga, R., R. Miyazaki, and I. Fujishiro (2002, December). Performance Visualization - A New Challenge to Music Through Visualization. In Proc. ACM International Conference on Multimedia, pp. 239-242. ACM.

Hiraga, R., F. Watanabe, and I. Fujishiro (2002, December). Music Learning through Visualization. In Proc. International Conference on Web Delivering of Music (WedelMusic 2002), pp. 101-108. IEEE.

Keller, P. R. and M. M. Keller (1992). Visual Cues: Practical Data Visualization. IEEE CS Press.

Kunze, T. and H. Taube (1996, August). See - A Structured Event Editor: Visualizing Compositional Data in Common Music. In Proc. the International Computer Music Conference, pp. 63-66. International Computer Music Association.

Mackinlay, J. D., G. G. Robertson, and S. K. Card (1991, April). The Perspective Wall: Detail and Context Smoothly Integrated. In Proc. ACM SIGCHI Human Factors in Computing Systems Conference, pp. 173-179. ACM.

Masui, T. (1998, October). LensBar - Visualization for Browsing and Filtering Large Lists of Data. In Proc. IEEE Symposium on Information Visualization, pp. 113-120. IEEE.

Miyazaki, R. and I. Fujishiro (2002, October-November). Interactive Poster: 3D Visualization of MIDI Dataset. In Proc. IEEE Visualization 2002 Posters Compendium, pp. 96-97. IEEE.

Miyazaki, R., R. Hiraga, and I. Fujishiro (2003, July). Exploring MIDI Dataset. To appear in ACM SIGGRAPH 2003 Conference Abstracts and Applications.

Oppenheim, D. (1992, November). Compositional Tools for Adding Expression to Music. In Proc. the International Computer Music Conference, pp. 223-226. International Computer Music Association.

Pickover, C. (1980). On the Use of Computer Generated Symmetrized Dot-patterns for the Visual Characterization of Speech Waveforms and Other Sample Data. J. Acoust. Soc. Am. 80(3), 955-960.

Rao, R. and S. K. Card (1994, April). The Table Lens: Merging Graphical and Symbolic Representations in an Interactive Focus+Context Visualization for Tabular Information. In Proc. ACM SIGCHI Human Factors in Computing Systems Conference, pp. 318-322. ACM.

Sarkar, M. and M. H. Brown (1992, April). Graphical Fisheye Views of Graphs. In Proc. ACM SIGCHI Human Factors in Computing Systems Conference, pp. 83-91. ACM.

Smith, S. M. and G. N. Williams (1997, October). A Visualization of Music. In Proc. IEEE Visualization '97, pp. 83-91. ACM.

Sobieczky, F. (1996). Visualization of Roughness in Musical Consonance. In Proc. IEEE Visualization '96. ACM.

Watanabe, A. and I. Fujishiro (2001). tutti: A 3D Interactive Interface for Browsing and Editing Sound Data, pp. 33-38. Kindai Kagaku-sha.