THE USABILITY OF MUSIC THEORY SOFTWARE: THE ANALYSIS OF TWELVE-TONE MUSIC AS A CASE STUDY

Tuukka Ilomäki
Sibelius Academy

ABSTRACT

Computer applications are an everyday tool for music analysts, composers, and music theory students. While these applications are welcome tools in classrooms and research labs, their effectiveness could be improved by focusing on their usability. The usability of a user interface can be evaluated and even measured with respect to the goals of its users. In order to demonstrate the evaluation of a user interface, I present an experiment in which the efficiency of two user interfaces is assessed in the context of three scenarios or "use cases." Based on the experiment, I discuss some basic principles of usability theory, such as affordances, minimization of navigation, error handling, immediate feedback, and data visibility. The evaluation of these principles suggests some new types of music theory applications.

1. INTRODUCTION

The popularity of personal computers has given impetus to the development of music theory software. It is now commonplace to employ computer applications for learning music theory fundamentals, generating row matrices, analyzing pitch-class sets and K-nets, and so on. While many of these applications undoubtedly serve their purpose, it is worthwhile to investigate how their usability could be improved. Instead of merely examining what these applications can do, I will here explore how they do it.

Computer applications can be evaluated from a number of perspectives. In research contexts it is usually required only that the application produce the correct result in a reasonable time. If a broader audience is targeted, however, the user's experience depends not only on whether the application returns the right result, but also on the ease with which the user can interact with it. Virtually everybody agrees on the maxim (or truism) that a user-friendly user interface makes software more usable. There is, however, significantly less agreement on what constitutes a user-friendly user interface. For example, we should not be satisfied with the popular but trivial generalization that a graphical user interface is automatically better than a command-line interface. Unfortunately, a user-friendly user interface seems more often to be an issue for the marketing department than for the research department.

In this paper I argue, based on usability theory, against the deeply rooted myth that the quality of a user interface is a matter of opinion. In order to demonstrate the evaluation of a user interface, I offer a hands-on experiment in which I present two user interfaces and measure the time it takes to finish a given task. The reading of a stopwatch is certainly not a matter of opinion. According to Nielsen [4], "Clarifying the measurable aspects of usability is much better than aiming at a warm, fuzzy feeling of user friendliness." Furthermore, the evaluation of some of the principles of a good user interface suggests some appealing directions for music theory applications.

2. USABILITY THEORY

Usability theory (or computer-human interaction) is a relatively young branch of computer science that has its roots in cognitive psychology. Even if it does not yet belong to mainstream computer science, awareness of its importance seems to be growing. While some of the principles of usability theory are gradually making their way into mainstream software applications, the underlying research is less known.
In order to discuss usability critically, we need to define what we mean by it. According to the International Organization for Standardization, usability is "the extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency and satisfaction in a specified context of use" (ISO 9241-11). The crux of this definition is that it does not define usability per se; it only defines usability with respect to the goals of the users. Hence, in order to design a good user interface and evaluate its usability, we must first discover the goals of the potential users. With respect to usability, the "features" of a computer application are insignificant. The only thing that matters is how efficiently the users can achieve their goals. (This does not mean, however, that a user interface design cannot be terrible independently of the user's goals.)

Nielsen [4] prefers the term "usability" to the vague umbrella "user friendliness." He divides usability into five categories: learnability, efficiency, memorability, error handling, and user satisfaction. The relative importance of these depends on the type of application - we can accept a steeper learning curve on software controlling a nuclear power plant than on an application that prints a row matrix. Each of these categories can be evaluated. In the following I will focus mostly on learnability, efficiency, and error handling. Learnability denotes the ease with which a first-time user can interact with the user interface, and efficiency denotes the

streamlining of the process once the user has mastered the application. Errors are, of course, one of the major factors behind poor learnability and efficiency.

Since usability is defined in terms of the users achieving their goals, the focus here is on the interaction between the user and the application. Even if the elements of a user interface can be polished to some degree, a good user interface does not remedy a badly designed process. For instance, efficiently designed buttons are cold comfort if we could manage without the buttons in the first place. Hence, Cooper [1] calls for interaction design instead of interface design. Typically, user interfaces are not designed; they simply evolve, almost as an epiphenomenon of the necessary data structures and algorithms. Giving priority to usability means that an efficient use process is designed first - without letting programming chores restrict the design. In particular, the user interface should not be modeled after the implementation in code.

We encounter poorly designed software frequently and are familiar with some bad designs (and might even venture to defend some of them since we are used to them). The "problem with save" is an exemplary case in point: the save function exists only because computers use two types of memory (volatile and non-volatile). According to Cooper [2], the need to save files "is a result of the programmer inflicting the implementation model of the disk file system on the hapless user."

3. PERSONAS, GOALS, AND TESTING

It is not possible to predict all goals of all potential users. Hence, we need to confine ourselves to examining the goals of a few exemplary (but not average!) users. Cooper [1, 2] offers as a tool a set of prototypes of real people that he calls "personas." According to him, a persona "encapsulates a distinct set of usage patterns, behavior patterns regarding the use of a particular product." The personas have specified backgrounds and goals: what they want to accomplish and why.

In order to test how the personas (representing the potential users of the software) can achieve their goals, we need to design a set of "use cases." A use case is a description of a persona with a specified background and goals using the software to perform a task: the usability of the application can then be tested by simulating the actions of that persona. In practice, a large number of use cases is not needed: the possible flaws of a user interface can be discovered with just a few. What is more important is embracing the idea that usability can be tested intersubjectively and that the issues (such as a wrong click) the users might encounter indicate problems with the application rather than with the user.

There is a distinction between how users should use the software and how they actually use it. Consequently, the developer of the software is a poor tester (since the developer knows how the user should use the software). If we want to evaluate the learnability and (to a degree) the efficiency of the software, we need to test with people who are not involved in the development.

The aim is to streamline the process of achieving typical goals. The problem with "save" is a good case in point: not saving the changes is a rare exception. Nevertheless, an extra and unnecessary step is required to accomplish the usual case.
Saving the changes should be the default action, done automatically; the exceptional case of not saving changes can require more effort.

4. A USER INTERFACE OF AN APPLICATION FOR THE ANALYSIS OF 12-TONE PIECES

Let us begin by considering the following use case. Alice, a graduate music theory student, attends a course on 20th-century music. She is writing her final paper on Arnold Schoenberg's Variations for Orchestra op. 31. She has analyzed the previous movements and is about to analyze the fifth variation. She knows the row of the piece, and she has discovered that in this piece rows may be split across several voices but within the voices the pitch classes are generally in the correct order. She begins her analysis by deciphering the row forms. She has analyzed the previous movements with pen and paper and now tries a computer application. (I borrowed the names of the three protagonists of my use cases from cryptography, in which they are used frequently.)

For the purpose of simulating usability testing, I have developed two Java applets (available online at htt:iiwwwiki.fiituuka.oiak/i[Ci 2007/), both of which are intended to help Alice achieve her goal. The user interfaces of these applets are shown in Figures 1 and 2. In the following, I will refer to them simply as the bad and the better user interface.

[Figure 1: a window with "Row" and "Segments" input fields and buttons.]
Figure 1. The bad user interface.

[Figure 2: a window with "Row" and "Segments" input fields and live "Found pcs" and "Missing pcs" displays.]
Figure 2. The better user interface.

Both applications have the same functionality: instead of jotting down the row matrix and finding the segments of the musical surface in it, Alice can enter the row of the piece and the segments, and the application tells her which rows in the pertinent row class contain those segments.
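To make this shared functionality concrete, the following sketch outlines one way such a row-form search might be implemented; it is not the code of the demo applets. It assumes that a segment matches a row form when its pitch classes occur in the row form in the same relative order (in line with Alice's observation that the order within a voice is preserved), and the row and segments in the main method are merely illustrative.

import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.Map;

/** A minimal sketch of the row-form search (not the code of the demo applets):
 *  given a twelve-tone row and some ordered segments, list the row forms of the
 *  row class in which every segment appears with its relative order preserved. */
public class RowFormFinder {

    /** Transpose a row by n semitones (mod 12). */
    static int[] transpose(int[] row, int n) {
        int[] result = new int[12];
        for (int i = 0; i < 12; i++) result[i] = (row[i] + n) % 12;
        return result;
    }

    /** Invert a row around pitch class 0. */
    static int[] invert(int[] row) {
        int[] result = new int[12];
        for (int i = 0; i < 12; i++) result[i] = (12 - row[i]) % 12;
        return result;
    }

    /** Reverse a row (retrograde). */
    static int[] retrograde(int[] row) {
        int[] result = new int[12];
        for (int i = 0; i < 12; i++) result[i] = row[11 - i];
        return result;
    }

    /** True if the segment occurs in the row with its relative order preserved. */
    static boolean containsInOrder(int[] row, int[] segment) {
        int matched = 0;
        for (int pc : row)
            if (matched < segment.length && pc == segment[matched]) matched++;
        return matched == segment.length;
    }

    /** The 48 row forms: P, I, R, and RI at every transposition level. */
    static Map<String, int[]> rowClass(int[] p0) {
        Map<String, int[]> forms = new LinkedHashMap<>();
        int[] i0 = invert(p0);
        for (int n = 0; n < 12; n++) {
            forms.put("P" + n, transpose(p0, n));
            forms.put("I" + n, transpose(i0, n));
            forms.put("R" + n, retrograde(transpose(p0, n)));
            forms.put("RI" + n, retrograde(transpose(i0, n)));
        }
        return forms;
    }

    public static void main(String[] args) {
        // The row entered in Figure 2 (06857B439A12) and two hypothetical segments.
        int[] row = {0, 6, 8, 5, 7, 11, 4, 3, 9, 10, 1, 2};
        int[][] segments = {{9, 8}, {10, 11}};

        for (Map.Entry<String, int[]> form : rowClass(row).entrySet()) {
            boolean matchesAll = true;
            for (int[] segment : segments)
                if (!containsInOrder(form.getValue(), segment)) { matchesAll = false; break; }
            if (matchesAll)
                System.out.println(form.getKey() + ": " + Arrays.toString(form.getValue()));
        }
    }
}

Other matching semantics - adjacency constraints, unordered segments, or tolerance for misprinted notes - would only require replacing the matching predicate, which is one reason to keep it separate from the user interface code.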

We can consider the following points in our evaluation: What is the number of steps Alice needs to take? How does the program aid deciphering row forms in the "normal" cases? How does the program cope with incorrect input? How easy is it to spot correct but erroneous input? We should notice that one of the applications does a considerably better job.

Designing good use cases is far from trivial. The above use case was tailored to provide a starting point for comparing the two user interfaces and was therefore simplified. It is a poorly designed use case, however. Is Alice's goal really to decipher the row forms? Certainly not. The mere deciphering of the row forms is seldom (if ever) the goal. Could finishing the paper be Alice's goal? In that case, an optimal user interface would be one in which pressing a big red button prints out a finished top-grade final paper. (Or why should she even be bothered to press the button?) Rather, she wants to learn twelve-tone analysis and this piece well enough to be able to finish her paper. The application, however, only helps her in deciphering the row forms.

The following use case depicts another real-life situation. It allows us to concretize some further usability problems in the two applications. Bob, an established scholar of twelve-tone music, has noticed that two published analyses of Arnold Schoenberg's Variations for Orchestra op. 31 give different row forms for the fifth variation. He has decided to examine the grounds for such a phenomenon and to find arguments for favoring one or the other interpretation.

The two demo applications support Bob's goals poorly: they might speed up finding the potential row candidates in the non-controversial measures (such as the beginning of the variation), but not in the controversial ones (such as the second beat of measure 200). A test run with Bob would give rise to the following questions: How does the application support deciphering row forms in measures where the pitch classes are not in the correct order? Does the application support the comparison of multiple interpretations of rows (like the opening row forms of the fifth variation) in contradictory places (like many row forms in the fifth variation)? Does the application support making analytical observations, such as comparing various passages in the piece and finding recurring patterns or transformational relations?

The two demo applications might initially have seemed decent little utilities. The moral of the experiment is that well-designed use cases are required in order to assess the usability of a computer application critically. Alice and Bob have different needs (pedagogical versus research-oriented), but it would certainly be possible to design a user interface that satisfies both. In order to address the goals of a still wider audience, let us add one more use case. Carol, a graduate music theory student, is taking her first course on 20th-century music. Her homework is to write an analysis of a brief twelve-tone composition by an undisclosed composer. This use case adds a new functional need: in order to proceed with the analysis, Carol needs help in discovering the row of the piece, which can sometimes be challenging (and, since the composer is undisclosed, she cannot do a literature search). Upon discovering the row, she has the same needs as Alice.
[Figure 3: a tabbed interface with tabs "Find the row of the piece", "Decipher the row forms", and "Analyze the rows"; the second tab shows the row, an editable list of segment sets, and the corresponding row candidates.]
Figure 3. An improved user interface.

The three use cases provide a basis for developing a user interface of an application for the analysis of twelve-tone music. Figure 3 sketches a portion of a user interface for an application that is designed to support the goals of Alice, Bob, and Carol. The three phases that often occur in the process of analyzing a twelve-tone composition - finding the row of the piece, deciphering the row forms, and analyzing the rows - are placed in separate tabs. The first tab is designed to aid Carol in discovering the row of a piece; Alice and Bob will start from the second tab. The data entered in one phase is naturally transferred to the next phase.

From the usability perspective, the new interface has some enhancements compared to those in Figures 1 and 2. The segment sets are displayed in list form and can be edited in place. This enables the analyst to keep track of the progress and to compare passages. An optimal implementation of this list would allow the user to copy, move, insert, and delete segment sets directly - no buttons are needed. Naturally, the list of row candidates is updated automatically, as in Figure 2.

We must draw a line on how far to go in supporting the particularities of the use cases. In particular, with respect to the case study in this paper - the analysis of twelve-tone compositions - each composition presents its unique set of analytical challenges. Hence, the target is to develop an application that supports the most frequently occurring issues. For instance, the pitch-class organization of the fifth variation is peculiar enough that, instead of writing an application to decipher it, one might as well write an article that explains it (see for example [3]). The target is to support the typical goals, not all possible goals. It is difficult to design a user interface that supports even one selected goal and scores high in all five of Nielsen's categories; designing a user interface that supports a large number of different goals is an unlikely achievement. Hence, it would be advisable to implement only a few prominent analytical methodologies in the analysis tab, such as transformational relations, hexachord contents, invariant subsegments and subsets, and so on.
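As an illustration of how one of these methodologies could be implemented, the following sketch (my own illustration, not part of the demo applets) finds the invariant subsegments of two row forms, that is, the contiguous segments of two or more pitch classes that occur in the same order in both rows.

import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** A minimal sketch (not part of the demo applets): list the contiguous
 *  subsegments of length two or more that two row forms share in the same
 *  order - the "invariant subsegments" mentioned above. */
public class InvariantSubsegments {

    /** True if the candidate occurs as a contiguous run somewhere in the row. */
    static boolean occursContiguously(int[] row, int[] candidate) {
        for (int start = 0; start + candidate.length <= row.length; start++) {
            boolean match = true;
            for (int i = 0; i < candidate.length; i++)
                if (row[start + i] != candidate[i]) { match = false; break; }
            if (match) return true;
        }
        return false;
    }

    /** Every contiguous subsegment of a (length >= 2) that also occurs contiguously in b. */
    static List<int[]> sharedSegments(int[] a, int[] b) {
        List<int[]> shared = new ArrayList<>();
        for (int start = 0; start < a.length; start++)
            for (int length = 2; start + length <= a.length; length++) {
                int[] candidate = Arrays.copyOfRange(a, start, start + length);
                if (occursContiguously(b, candidate)) shared.add(candidate);
            }
        return shared;
    }

    public static void main(String[] args) {
        // Two hypothetical row forms to compare; the second is the retrograde
        // inversion of the first (inversion around pitch class 0).
        int[] first  = {0, 6, 8, 5, 7, 11, 4, 3, 9, 10, 1, 2};
        int[] second = {10, 11, 2, 3, 9, 8, 1, 5, 7, 4, 6, 0};
        for (int[] segment : sharedSegments(first, second))
            System.out.println(Arrays.toString(segment));
    }
}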

5. ELEMENTS OF A GOOD USER INTERFACE

A user creates a mental model of an application based on the affordances it offers (see [5]). The more intuitive the necessary tools are and the more readily they are available, the more easily the user is able to use them. (Affordances are not limited to computer applications. For instance, a door can be pushed or pulled on the left or the right side: if the affordances are hidden, there is only a 25% chance that it can be opened on the first try.)

Even if the applications in Figures 1 and 2 do not properly support Alice's and Bob's goals, they serve to demonstrate some principles of good user interfaces. First, all the affordances are visible in both applications - I could have made the bad interface considerably worse by hiding the affordances in menus instead of using buttons.

Secondly, the minimization of navigation is one of the key principles of usability. A navigational element, such as a dialog box, always demands extraneous mental effort, as the user needs to adapt to a new environment. One of the major reasons why the better of the two user interfaces in the experiment performs better is that all the data can be edited directly and no navigation is needed. As shown in Figure 2, this application contains no buttons, dialog boxes, or alerts. From the usability perspective, the buttons in the bad user interface are unnecessary (like most buttons labeled "Go" or "Search"), as they force the user to navigate between different windows.

Figure 4. Error handling in the better user interface.

Thirdly, errors (Nielsen's fourth usability category) are a major hindrance to achieving specified goals with effectiveness. An error-handling strategy divides into prevention and recovery: the former means that the user interface should guide the user to avoid errors; the latter means that catastrophic errors are not possible and recovery from minor errors is straightforward. The error-handling strategy in the better user interface is that, in the case of malformed input, the user is simply shown the result of such input. Humans are used to coping with incomplete and incorrect data. Hence, rather than breaking the flow of actions, the user is notified of the malformed input - the application does not come to a halt and require that the user correct the error immediately. Figure 4 shows how cyclic segments are handled in the better user interface: no row contains cyclic segments, and hence the result is empty and an error message is shown in context. The flow of actions is not broken, however. The application also provides a simple way to undo: editing the cyclic segment (which most likely is a typo) in the input field.

Fourthly, one of the key elements of a good application is that all data is visible all the time and the user can interact with it in a straightforward manner. As the user enters the input, immediate feedback in the form of a live update of the data enables a far better understanding of the connection between the input and its effect on the data. This application supports in-place editing of input, not entering data piecemeal via a succession of dialog boxes. A sketch of such live updating follows at the end of this section.

Finally, I wish to emphasize that the above considerations do not override the normal accessibility principles, such as providing necessary guidance (even if the need for guidance might imply a problem in the user interface) or taking color vision deficiency into account.
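The following sketch suggests how the non-blocking error handling and immediate feedback described above might look in Java Swing code; the class, field, and message names are my own, and the demo applets may well be structured differently. The point is that the input is re-parsed on every keystroke, the result is updated live, and malformed input produces an in-context message rather than a modal dialog.

import javax.swing.*;
import javax.swing.event.DocumentEvent;
import javax.swing.event.DocumentListener;
import java.awt.*;

/** A minimal sketch (not the code of the demo applets) of non-blocking error
 *  handling: the segment field is re-parsed on every keystroke, the result is
 *  updated live, and malformed input yields an in-context message, not a dialog. */
public class LiveSegmentField {

    public static void main(String[] args) {
        SwingUtilities.invokeLater(() -> {
            JTextField segmentField = new JTextField("98 AB 01 23", 20);
            JLabel resultLabel = new JLabel(" ");   // live view of the data
            JLabel errorLabel = new JLabel(" ");    // in-context error message
            errorLabel.setForeground(Color.RED);

            Runnable update = () -> {
                String text = segmentField.getText();
                if (text.matches("[0-9ABab ]*")) {
                    errorLabel.setText(" ");
                    // A real application would recompute and display the row
                    // candidates here; this sketch just echoes the input.
                    resultLabel.setText("Segments: " + text);
                } else {
                    // Do not halt or pop up a dialog: show the problem in place
                    // and let the user keep editing.
                    errorLabel.setText("Unrecognized characters in segment input");
                }
            };

            segmentField.getDocument().addDocumentListener(new DocumentListener() {
                public void insertUpdate(DocumentEvent e) { update.run(); }
                public void removeUpdate(DocumentEvent e) { update.run(); }
                public void changedUpdate(DocumentEvent e) { update.run(); }
            });
            update.run();

            JFrame frame = new JFrame("Live update sketch");
            frame.setLayout(new GridLayout(3, 1));
            frame.add(segmentField);
            frame.add(resultLabel);
            frame.add(errorLabel);
            frame.pack();
            frame.setDefaultCloseOperation(JFrame.EXIT_ON_CLOSE);
            frame.setVisible(true);
        });
    }
}

The essential design choice is that validation never interrupts the flow of actions: the data is recomputed on every edit, and the error message lives next to the input it concerns.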
6. CONCLUSIONS

The above discussion suggests at least two types of music theory computer applications that could merit a place in the computer music community. The first is an application that truly supports the music analyst's goals. Such an application would support making analytical observations from multiple perspectives on both local and global levels. Twelve-tone rows were used as the example above, but other compositional practices could have been used as well.

The second can be conceived of as data mining: the user examines the data - pitch-class sets, twelve-tone rows, and K-nets, for instance - from different perspectives. The data could be either an entire universe of such entities or a subset derived from an analytical or a compositional context. In the above row analysis applications, for example, the data is the rows in a row class, and the user examines the rows with respect to their subsegment content. The usability point here is that the actions of the user result in an immediate response, making exploration of the universe easy. Applying the idea of data mining, I would like to explore various universes of musical entities in real time, charting relations between the entities, experimenting with diverse constraints, and investigating which entities satisfy my criteria, thus revealing new, interesting, and perhaps surprising facets of that universe. My hope is that this paper will give some reader the impetus to create such applications.

7. REFERENCES

[1] Cooper, A. The Inmates Are Running the Asylum. Sams, Indianapolis, IN, 1999.

[2] Cooper, A. About Face 2.0: The Essentials of User Interface Design. Wiley, Indianapolis, IN, 2003.

[3] Ilomäki, T. "Aspects of Pitch Organization in Schoenberg's Variations for Orchestra op. 31." Lithuanian Musicology. Forthcoming.

[4] Nielsen, J. Usability Engineering. AP Professional, Boston, MA, 1993.

[5] Norman, D. The Psychology of Everyday Things. Basic Books, New York, NY, 1988.