Extraction of Musical Performance Rules Using a Modified Algorithm of Multiple Regression Analysis

Osamu ISHIKAWA†, Yushi AONO†, Haruhiro KATAYOSE††, Seiji INOKUCHI†

† Faculty of Engineering Science, Osaka University, 1-3 Machikaneyama-cho, Toyonaka City, Osaka 560-8531, Japan
†† NTT Cyber Space Laboratories, 1-1 Hikarinooka, Yokosuka City, Kanagawa 239-0847, Japan
††† Faculty of Systems Engineering, Wakayama University, 930 Sakaedani, Wakayama 640, Japan
E-mail: ishikawa@inolab.sys.es.osaka-u.ac.jp

ABSTRACT

This paper describes a music interpretation system for traditional tonal music. The system extracts musical performance rules from human performances using score information, and generates new expressive performances by applying the extracted rules to unseen scores. To link scores with actual human performances, we propose an original algorithm based on multiple regression analysis. This paper describes the model of music interpretation and its implementation. Compared with simple regression analysis, the algorithm extracts musical performance rules more accurately.

1. Introduction

We are often fascinated by the artistic nuance of expressive performances given by virtuosi. Computers are good at faithfully replaying given performance data, but it is not easy for them to design expressive performances. What kind of mechanism can produce an "expressive" performance? This question is interesting not only from the standpoint of music systems but also from that of human science.

Among studies of music interpretation, machine learning of performance rules is an active topic. Research in this area can be roughly classified into rule-based, neural network, and case-based approaches. As a rule-based example, Widmer developed a music interpretation system based on a qualitative model [Widmer, 1996].
The system extracts expression rules (tempo rubato, dynamics, and so on) using Explanation-Based Learning based on the theories of Lerdahl & Jackendoff and of Narmour. First, positive or negative rules are stored according to whether the tempo (dynamics) is longer (louder) than the average tempo (loudness). Finally, numerical values for the positive or negative rules are applied to new melodies. Hoshishiba et al. developed a system that generates a normative performance [Hoshishiba et al., 1996]. To extract normative performance rules, the system uses averages of velocity, tempo and pedal from multiple piano performances.

As a neural network example, Bresin et al. developed a system combining rules and a neural network [Bresin et al., 1992]. The inputs to the neural network are the difference in musical interval and the importance of the note in the melody.

As a case-based example, Suzuki et al. developed a performance generation system [Suzuki et al., 1999]. Cases of various performances are stored in the system's database, and the system generates a synthesized performance from cases retrieved according to the performance situation. Hiraga et al. developed a music interpretation system with general rules for phrase, motif and sentence [Hiraga et al., 1997]. The system generates performances by defining these rules as curve functions.

2. Music Interpretation Model

Various AI techniques have been applied to extract musical performance rules. Some performances generated in early research attracted attention because they showed the potential of AI in artistic activities. However, their unnaturalness has been pointed out: they deviate from the performances of human players in a way that differs from the deviation of human beginners. The underlying problems can be summarized as follows: it is difficult to identify the conditional clauses for extracting rules, and the target that the system should control is numerical and non-linear.
To solve these problems, we focus on formulating the music interpretation model itself, and construct the model as subsystems within the scope of form [Uwabu et al., 1997]. Fig. 1 shows the model. It consists of five independent modules: score image recognition, structural rule extraction, musical structure analysis, expression parameter extraction, and performance generation. We have been implementing the computational procedures which

reflect the above model. This paper focuses in particular on the machine learning mechanism suited to musical expression, and describes a system that extracts musical performance rules using the original algorithm based on multiple regression analysis.

Fig. 1 Music Interpretation Model (modules: score image recognition, structural rule extraction, music structure analysis, expression parameter extraction, performance generation; inputs: score, human performances; outputs: tempo, dynamics, articulation)

3. Extracting Musical Performance Rules

We implemented the expression parameter extraction module using a modified algorithm based on multiple regression analysis. Widmer's system extracts positive or negative rules for the conditional clauses and then fits numerical values to melodies. Multiple regression analysis, by contrast, can extract the values of expression rules directly by a simple matrix calculation. To exploit this merit, the problems of identifying the conditional clauses and of non-linearity have to be solved.

In this section, we first explain the effectiveness of applying multiple regression analysis, and then the devices used to solve these problems.

3.1 Effectiveness of Applying Multiple Regression Analysis

To extract the performance rules, we use two types of input data: score information as explanatory variables and human performance data as criterion variables. The expressive performance rules are extracted as the relation linking the two. To generate performances, we apply the performance rules to the notes of unknown scores together with their score information. At present, velocity (MIDI format data), tempo, duration and pronunciation position can be used as performance values.
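The setup described here can be illustrated with a minimal sketch. It assumes binary (0/1) score features and an ordinary least-squares fit with NumPy; the feature names, the toy velocities and the helper names `fit_rules` and `apply_rules` are hypothetical and are not taken from the system described in this paper.

```python
import numpy as np

def fit_rules(X, y):
    """Least-squares fit of y ~ b0 + X @ b; returns (intercept, weights)."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    A = np.column_stack([np.ones(len(X)), X])   # prepend an intercept column
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return coef[0], coef[1:]

def apply_rules(intercept, weights, X):
    """Predict performance values for notes described by score features X."""
    return intercept + np.asarray(X, dtype=float) @ weights

# Hypothetical teacher data: one row per note, columns = [accent, slur, forte].
X_teacher = np.array([
    [0, 0, 0],
    [1, 0, 0],
    [0, 1, 0],
    [0, 0, 1],
    [1, 0, 1],
    [0, 1, 1],
])
# Toy MIDI velocities, constructed as 60 + 15*accent - 5*slur + 20*forte.
y_teacher = np.array([60, 75, 55, 80, 95, 75])

b0, w = fit_rules(X_teacher, y_teacher)

# Apply the extracted weights to a note of an "unknown" score (accent + slur).
X_new = np.array([[1, 1, 0]])
pred = apply_rules(b0, w, X_new)
print(b0, w, pred)
```

Each fitted weight plays the role of one numerical performance rule: it states how much a score symbol shifts the velocity. Because the model is purely additive, the prediction for a note carrying both an accent and a slur is simply the sum of the two individual effects.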
As score information, the system uses both concrete symbols written in the score and implied meanings such as phrases or implication-realization structures. The implied meanings are analyzed using Lerdahl & Jackendoff's theories and Meyer's theories. Multiple regression analysis yields the incidence numbers (the regression coefficients). These are the expressive performance rules. Using these rules, unknown values can be calculated directly from score information.

The merit of multiple regression analysis is its simplicity and its direct numerical analysis. On the other hand, it cannot deal with AND-connected rules, and the explanatory variables should be as independent as possible to obtain good results. To address these problems, we adopted the following procedures.

3.2 Handling of Local Score Information

Some score information is local: it has no width in the score, or is attached to a single note. Examples are accent symbols, forte, and so on. Such information can be performed in three styles. The first is a change limited to one part (playing only the target note loudly or not). The second is a cue whose change extends onward from the point where it appears. The third is intermediate between the two. If we use one row for each item of score information, only step changes can be reproduced. We therefore use two rows, 'X-on' and 'X-off', for each item of score information 'X'. In this way, all three styles can be reproduced (see Fig. 2).

Fig. 2 Handling of Local Score Information (labels: accent, accentON, accentOFF; ON, OFF)

3.3 Logical AND Process

In musical pieces, several items of score information often exist at the same position. Differences in the co-occurring information cause the same score information to be performed differently. For example, an accent that stands alone is probably performed differently from one that occurs with a slur. This phenomenon can be reasonably understood

by music syntax. Multiple regression analysis is a linear fit between observed quantitative data and explanatory variables. It is intuitive in that each explanatory variable is mapped to a concrete control value. However, it is not suitable for analyzing non-linear systems involving logical AND. One technique for handling logical ANDs is to run multiple regression analysis on exhaustively pre-generated conjunctions of explanatory variables. This technique has two problems: the amount of calculation, and how many meaningful rules it can extract. To avoid computational explosion, we took a hill-climbing approach that adds logical ANDs step by step. This approach has the additional merit that plainer rules are extracted first, which is also good for musical or instructional use. A system using this technique can behave in two ways depending on the number of iterations: fewer iterations yield more intuitive rules, while more iterations raise the precision of the fit.

Fig. 3 Logical AND Process (labels: explanatory variables; accent, slur; Logical AND; accent, slur, accent & slur)

As mentioned above, we propose the logical AND process. It adds new explanatory variables formed by taking the logical AND of variables that occur in parallel (see Fig. 3). In this way, non-linear interaction between items of score information can be handled by multiple regression analysis. As this process is repeated, the precision of the fit rises with the number of additional explanatory variables (see Fig. 4).

3.4 Score Information Choice by Backward Elimination

The more explanatory variables we use, the higher the precision of the fit; this is a property of multiple regression analysis. We can therefore raise the precision of the fit by adding new variables to the score information.
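The effect of adding conjunction variables on the fit to the teacher data can be illustrated with a small sketch. It assumes 0/1 explanatory variables, performs a single pass that adds pairwise ANDs of co-occurring variables, and measures the fit by the correlation coefficient between observed and fitted values. The data and the helper names `fit_correlation` and `add_logical_ands` are hypothetical, and the step-by-step hill-climbing selection described above is not reproduced.

```python
import numpy as np
from itertools import combinations

def fit_correlation(X, y):
    """Fit y ~ b0 + X @ b by least squares; return corr(observed, fitted)."""
    A = np.column_stack([np.ones(len(X)), X])
    coef, *_ = np.linalg.lstsq(A, y, rcond=None)
    return float(np.corrcoef(A @ coef, y)[0, 1])

def add_logical_ands(X, names):
    """Append an AND column for every pair of variables that co-occur on a note."""
    cols, new_names = [X], list(names)
    for i, j in combinations(range(X.shape[1]), 2):
        both = X[:, i] * X[:, j]          # logical AND of two 0/1 columns
        if both.any():                    # skip pairs that never occur together
            cols.append(both[:, None])
            new_names.append(f"{names[i]}&{names[j]}")
    return np.hstack(cols), new_names

# Hypothetical notes: columns = [accent, slur].
names = ["accent", "slur"]
X = np.array([[0, 0], [1, 0], [0, 1], [1, 1]] * 2)
# Toy velocities: an accent under a slur is softer than the additive model
# predicts (60 + 20 - 5 would give 75, but 65 is observed).
y = np.array([60, 80, 55, 65] * 2, dtype=float)

r_plain = fit_correlation(X, y)                 # additive model only
X_and, names_and = add_logical_ands(X, names)   # adds "accent&slur"
r_and = fit_correlation(X_and, y)
print(names_and, round(r_plain, 3), round(r_and, 3))
```

In this toy case the additive model cannot represent the interaction, so the fit on the teacher data improves once the `accent&slur` column is available. Every added column makes such an improvement possible on the teacher data, whether or not it generalizes to other scores.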
Rules extracted with many added variables, however, do not perform well on unknown scores. The reason is that the explanatory variables are mutually dependent; because of this dependence, reliable rules cannot be extracted. As mentioned above, each iteration of the logical AND process adds new explanatory variables, so this issue is of great importance. We therefore choose effective score information by backward elimination, a procedure that eliminates the most dependent explanatory variable from the score information. The method is implemented as an iterative process in the same way as the logical AND process.

The score information choice by backward elimination and the logical AND process are carried out alternately. First, the logical AND process runs until the fit reaches its maximum. Next, the process switches to backward elimination, which again runs until the fit reaches its maximum. When neither method increases the fit any further, the iteration finishes. Finally, the system can extract reliable performance rules. We propose this combined method as a modified algorithm based on multiple regression analysis (see Fig. 4).

Fig. 4 The Original Algorithm Based on Multiple Regression Analysis (flowchart labels: Allocation of Explanatory and Criterion Variables; Logical AND; Backward Elimination; Multiple Regression Analysis; Re-composition from Incidence Numbers; Maximum Fitting; END)

4. Experimental Result

Using the algorithm described above, musical performance rules were extracted for Chopin's "Walzer Op. 64-2". The first half of the exposition of "Walzer Op. 64-2" serves as the teacher's score, and the second half as the unknown score. The fit between the human performance and the re-synthesized performance is measured by the correlation coefficient.

In the first experiment, the performance rules extracted from the teacher's score are applied to the teacher's score itself, in order to show how well it is reproduced. Fig. 5 shows the result. "Player" is the velocity of the human performance, and "System" is that of the re-synthesized performance. Next, the rules are applied to the unknown score; Fig. 6 shows the result.

Fig. 5 Reproduction of first half of "Walzer" exposition (velocity) (axes: note number, velocity; curves: Player, System)

Fig. 6 Reproduction of second half of "Walzer" exposition (velocity) (axes: note number, velocity; curves: Player, System)

Table 1 Reproduction of second half of "Walzer" exposition (Correlation Coefficient)

                      Velocity   Tempo   Gatetime   Pronunciation position
Simple Analysis       0.483      0.325   0.540      0.312
Original Algorithm    0.854      0.826   0.637      0.955

Table 1 shows the correlation coefficients for the re-synthesized performance of the unknown score. The correlation coefficient for velocity is 0.854 with the original algorithm, against 0.483 with simple regression analysis. The system's performance follows the human performance closely, with almost no deviation in its transitions. The iteration finished with a maximum fit of 0.962, after 2 iterations of the logical AND process and 2 of backward elimination. The total number of explanatory variables is therefore unchanged, but conjunctions of variables have been added to the score information and dependent variables have been eliminated from it. This result shows that our algorithm can extract more effective rules than simple regression analysis.

5. Conclusion

In this paper, we have described the music interpretation model for traditional tonal music and the details of the music interpretation system using a modified algorithm based on multiple regression analysis. We addressed the problems of music interpretation systems and implemented the model with various techniques. Compared with simple regression analysis, the proposed algorithm extracts musical performance rules more accurately.
In future work, we would like to consider the effect of the local maximum problem, the essential problem of multiple regression analysis.

References

[Bresin et al., 1992] R. Bresin, G. De Poli and A. Vidolin. Symbolic and sub-symbolic rules system for real time score performance. Proc. of ICMC 1992, pp. 211-218, 1992.
[Hiraga et al., 1997] R. Hiraga and S. Igarashi. Psyche: University of Tsukuba, Computer Music Project. Proc. of ICMC 1997, pp. 297-300, 1997.
[Hoshishiba et al., 1996] T. Hoshishiba, S. Horiguchi and I. Fujinaga. Study of Expression and Individuality in Music Performance Using Normative Data Derived from MIDI Recordings of Piano Music. Proc. of ICMPC 1996, pp. 465-470, 1996.
[Suzuki et al., 1999] T. Suzuki, T. Tokunaga and H. Tanaka. A case based approach to the generation of musical expression. Proc. of IJCAI 1999, pp. 642-648, 1999.
[Uwabu et al., 1997] Y. Uwabu, H. Katayose and S. Inokuchi. A Structural Analysis Tool for Expressive Performance. Proc. of ICMC 1997, pp. 121-124, 1997.
[Widmer, 1996] G. Widmer. Learning Expressive Performance: The Structure-Level Approach. Journal of New Music Research, Vol. 25, pp. 179-205, 1996.