NEURAL NETWORKS VS. RULES SYSTEM: EVALUATION TEST OF AUTOMATIC PERFORMANCE OF MUSICAL SCORES

G. U. Battel*, R. Bresin+, G. De Poli+, A. Vidolin*
*Conservatorio di Venezia "Benedetto Marcello"
+D.E.I. - C.S.C., Università degli Studi di Padova, via Gradenigo 6/a - 35129 Padova - Italy
phone: +39 49 8287631 - fax: +39 49 8287699
E-mail: rb@csc.unipd.it, depoli@dei.unipd.it

Abstract

Musicians, according to the instrument they play, introduce loudness, duration and timbre deviations on the notes of the score they are performing, since traditional musical notation does not fully convey the composer's real intentions and leaves some degrees of freedom to the player. These deviations distinguish the performance of one pianist from that of another. Furthermore, a literal performance of a musical score would sound extremely mechanical and unnatural to the listener. The present work starts from the research of Sundberg and co-workers on the automatic performance of scores [Friberg, 1991] [Sundberg et al., 1991] and continues the research on real-time performance of piano scores by means of artificial neural networks. In our previous works [Bresin et al., 1991] [Bresin et al., 1992] we showed that it is possible to build neural networks that can learn some performance rules. These nets show good generalisation properties and, after a training phase, are able to give real-time performances of any score, introducing appropriate deviations. In the present research we propose a comparison test between various performances in order to evaluate, by means of listening tests, the use of trained neural nets in automatic performance.
1 Introduction

In the history of performance, the most striking and unequivocal limit has been the possibility of recording the execution: without a means of preserving the performing action, one can only debate the reports given by contemporaries, and not the sound document, which is the true object of the study of performance. The first examples were recordings on wax cylinders and on player piano rolls. In the early 20th century, with disc records and the electric recording system, a good level was reached in the recording of loudness as well. A second limit to the study of performance characteristics was the possibility of measuring the recorded data quantitatively: with such measurements the analysis does not stop at qualitative judgement, but can explore the weight, the quantity, and the ratios of the expressive variables, which a computer can measure with great precision. New technologies are fundamental for contemporary musical analysis.

The analysis of performance can be seen as a particular field of the more general discipline of musical analysis. Knowledge of the parameters on which the performer acts, and of the grounds of her/his behaviour, is necessary to study the sound result of each performance. While it is easy to understand which parameters a performer modifies (e.g. a violinist can modify loudness and intonation, and can introduce vibrato), it is a more complex task to explain the performing choices, which are not entirely subjective: musicians and audiences hold similar opinions. If we isolate the analysable parameters of a musical work (e.g. variations of duration), a scientific explanation of the many nuances of the performing action becomes possible.

ICMC Proceedings 1994, Expression Analysis

2 Research on Automatic Performance

Since 1983, at the Department of Speech Communication and Music Acoustics of the K.T.H. (Royal Institute of Technology) in Stockholm, Askenfelt, Friberg, Frydén, Sundberg and others have developed a system of performance rules to study musical performance. The main task is to find out how the performer translates the signs of the score, and how the listener perceives this communication. The starting point is therefore the written score. In the first paper on these subjects [Sundberg et al., 1983], the authors define the rules applied by the performer as pronunciation rules. The principal aim of the K.T.H. researchers is to describe the principles that lead the performer to introduce numerous expressive variations by manipulating, according to the instrument, duration, loudness, intonation, and vibrato. The musician of the group, Frydén, listened to the computer-generated performances and suggested how to improve them. These suggestions were translated into a set of performance rules. In this way the musical competence of Lars Frydén was taken as a well-documented fact and as a first object of the research. The correct formulation of each rule is the result of careful listening to musical examples made for that rule. This method is called analysis by synthesis. The system includes about twenty rules.

3 Automatic Performance by Means of Neural Networks

The present work starts from the research of Sundberg and co-workers on the automatic performance of scores [Friberg, 1991] [Sundberg et al., 1991] and continues the research on real-time performance of piano scores by means of artificial neural networks. In our previous works [Bresin et al., 1991] [Bresin et al., 1992] we showed that it is possible to build neural networks that can learn some performance rules. These nets show good generalisation properties and, after a training phase, are able to give real-time performances of any score, introducing appropriate deviations. In the present research we propose a comparison test between various performances in order to evaluate, by means of listening tests, the use of trained neural nets in automatic performance.

4 Description of the Experiment

4.1 Methods

Materials

Two tonal melodies were chosen for this study. The first is the theme of the third movement of Mozart's piano sonata K. 284; the second is the theme of the first movement of Mozart's piano sonata K. 331. A subset of Sundberg's rules was used for the experiment: the selected rules mainly affect the relations between neighbouring notes and do not involve larger segments. The selected rules are therefore best suited to a structurally simple musical example and to a strictly classical repertoire. This formal requirement, together with the fact that the sonata K. 331 had already been used in other works [Friberg, 1991], led us to choose the two Mozart themes. There are two starting hypotheses:
- it is possible to obtain an acceptable performance with small performing deviations and without involving the large-scale form;
- owing to the intrinsic characteristics of Mozart's music, the melodies can be performed in a meaningful way on their own, and the deviations sound pleasant and understandable to the listener.

We performed each of the two melodies in three ways:
- deadpan (with no expression);
- with expression given by a subset of Sundberg's performance rules (in the following, these melodies are called rules-melodies);
- with expression given by two neural networks trained with the same subset of Sundberg's rules (in the following, these melodies are called nn-melodies).

The rules that we applied are the following [Friberg, 1991]:
- durational contrast
- melodic charge
- articulation of repetition
- leap tone duration
- leap articulation
- high loud
- phrase.
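As a rough illustration of how a rule system of this kind can turn a deadpan melody into an expressive one, the sketch below sums the time deviations contributed by several rules for each note, in line with the additive action that the paper attributes to the rule system. The two rules, their thresholds, and their millisecond values are hypothetical placeholders and are not the formulas of [Friberg, 1991]; the `Note` type and all function names are likewise assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Callable, List, Sequence


@dataclass
class Note:
    pitch: int         # MIDI note number
    nominal_ms: float  # nominal (deadpan) duration in milliseconds


# A rule maps (melody, note index) to a time deviation in milliseconds.
Rule = Callable[[Sequence[Note], int], float]


def toy_short_note_rule(notes: Sequence[Note], i: int) -> float:
    """Hypothetical contrast-like rule: shorten notes below 250 ms by 10 ms."""
    return -10.0 if notes[i].nominal_ms < 250.0 else 0.0


def toy_leap_rule(notes: Sequence[Note], i: int) -> float:
    """Hypothetical leap rule: lengthen a note followed by a leap wider than 4 semitones."""
    if i + 1 >= len(notes):
        return 0.0
    leap = abs(notes[i + 1].pitch - notes[i].pitch)
    return 15.0 if leap > 4 else 0.0


def time_deviations(notes: Sequence[Note], rules: Sequence[Rule]) -> List[float]:
    """Add up the contributions of all rules for every note."""
    return [sum((rule(notes, i) for rule in rules), 0.0) for i in range(len(notes))]


def performed_durations(notes: Sequence[Note], rules: Sequence[Rule]) -> List[float]:
    """Nominal duration plus the summed deviation, note by note."""
    return [n.nominal_ms + d for n, d in zip(notes, time_deviations(notes, rules))]


melody = [Note(60, 200.0), Note(72, 400.0), Note(74, 400.0)]
deadpan = time_deviations(melody, [])  # no rules: every deviation is zero
expressive = time_deviations(melody, [toy_short_note_rule, toy_leap_rule])
```

With an empty rule list every deviation is zero, which corresponds to the deadpan version; adding rules only changes the notes whose context triggers them.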
As an example of comparison, Figure 1 shows the time deviations (in milliseconds) in the two non-deadpan performances of the theme of the K. 284 sonata. The value 0 is the nominal value and corresponds to the deadpan version (i.e. no time deviations).
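To make the idea of a network that outputs a per-note time deviation more concrete, here is a minimal forward-pass sketch. The seven input names are the labels given with Figure 2; everything else (one hidden layer of four tanh units, random untrained weights, a single output scaled to at most 30 ms, and the numeric encoding of the inputs) is an assumption for illustration and does not describe the architecture or training actually used by the authors.

```python
import math
import random
from typing import Dict, List

# Input labels as listed with Figure 2 of the paper.
INPUT_NAMES = ["ND", "MC", "LP", "S", "P", "AR", "LA"]


class TinyMLP:
    """One hidden tanh layer and one tanh output; weights are random, not trained."""

    def __init__(self, n_in: int, n_hidden: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)]
        self.b1 = [0.0] * n_hidden
        self.w2 = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
        self.b2 = 0.0

    def forward(self, x: List[float]) -> float:
        if len(x) != len(self.w1[0]):
            raise ValueError("wrong number of inputs")
        hidden = [
            math.tanh(sum(w * v for w, v in zip(row, x)) + b)
            for row, b in zip(self.w1, self.b1)
        ]
        return math.tanh(sum(w * h for w, h in zip(self.w2, hidden)) + self.b2)


def time_deviation_ms(net: TinyMLP, features: Dict[str, float], max_dev_ms: float = 30.0) -> float:
    """Map named note features to a deviation in [-max_dev_ms, +max_dev_ms] (assumed scaling)."""
    x = [features[name] for name in INPUT_NAMES]
    return max_dev_ms * net.forward(x)


net = TinyMLP(n_in=len(INPUT_NAMES), n_hidden=4, seed=0)
# Hypothetical, already normalised feature values for one note.
example = {"ND": 0.5, "MC": 0.2, "LP": 1.0, "S": 0.4, "P": 0.0, "AR": 0.0, "LA": 1.0}
deviation = time_deviation_ms(net, example)
```

In a real system the weights would come from training on note contexts paired with target deviations; the sketch only shows how a feature vector is turned into a bounded deviation.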

Figure 1: Time deviations in the K. 284 sonata (detail).

Equipment

The melodies were performed via MIDI by a Yamaha Disklavier Grand Piano connected to an 80386/40 MHz PC compatible. The Yamaha Disklavier Grand Piano, similarly to the piano with sensors devised by Shaffer [Shaffer, 1981], detects the position and the movement of the hammers; in addition, solenoids allow the movement of the keys. Using the Disklavier provides ideal conditions both for the listener, who hears the examples performed on a real piano, and for the study of the performer, since all the deviations made by the pianist can be detected with the greatest precision and reliability.

To obtain the deadpan melodies and the rules-melodies we used a program called MELODIA, developed at the C.S.C. [Bresin, 1993]. The program can perform via MIDI any score previously written in a simple language, applying some symbolic rules. Many of these rules were chosen from those proposed by Sundberg and co-workers, others are modified versions of Sundberg's rules, and some are completely new. Compared with Sundberg's rules system, MELODIA also considers rests, grace notes, staccato, and legato.

To obtain the nn-melodies we trained two neural networks: one for the loudness deviations, and another for the time deviations (see Figure 2).

Figure 2: Neural network for time deviations (ND = Nominal Duration; MC = Melodic Charge; LP = Leap Presence; S = Semitones in a leap; P = Phrase; AR = Articulation of Repetition; LA = Leap Articulation).

Subjects

The subjects were 20 professional musicians and final-year students of the music conservatory of Venice, who volunteered for the experiment. They were 12 men and 8 women. The youngest was 15 years old and the oldest 32. Seventeen of them were pianists, undergraduate and postgraduate.

Procedure

Before the experiment began, the subjects were asked to read a sheet containing the instructions for the experiment and the cells in which to write their judgements. Each subject was given a copy of the sheet. It explained that the aim of the experiment was to compare three different piano performances built with the help of a computer, and it listed the titles and the composers of the melodies. The text of the sheet was the following:

"You will listen to three different performances of each melody, and you will have to evaluate the musical quality of each performance as if a student were playing. You must express your evaluation with a mark from 1 to 10, using the whole scale if possible. 1 is the worst mark, 10 is the best one: avoid giving too many marks in the intermediate range (between 5 and 6), and try to use the extreme values (1 and 10). The judgement does not have to be too critical in an absolute sense, but has to show the qualitative differences that you find among the three performances of the same melody. There are no right or wrong answers: the aim of this test is to find the performance you think is the best. Between two melodies you have 30 seconds to judge the three performances of the melody just heard, and to write your marks in the appropriate cells."

The three performances of each melody were played in random order. The total duration of the test was 4'50".

4.2 Results

The results were analysed using an ANOVA with repeated measures on each of two factors (version, melody). The analysis shows a significant effect for version [F(2,38)=10.36; p<.001]. The nn-melody is the most preferred version (preference rating 6.73), followed by the rules-melody (6.3) and the deadpan (4.28). The most important result is the preference given by the subjects to the performed melodies (rules-melodies and nn-melodies), which obtained a mean score about 2 points higher than the deadpan version. Furthermore, in terms of the Italian school grading tradition, the subjects gave a more than fair rating to the performed versions and an unsatisfactory rating to the deadpan version, even though these values do not reach the extremes of the scale (from 1 to 10) (see Figure 3).

4.3 Discussion

The present research has a substantial novelty: the sound of the artificial performances is produced by a real piano, and not by a synthesiser.
This idea does not derive from a desire for sound fidelity, but from the possibility of greater completeness and reliability in judging the hypotheses on performance; moreover, the close similarity between real and artificial performance becomes a stimulus of primary importance for formulating new hypotheses to test in future research. The analysis of performing rules starts from the deadpan version, but a translation of a series of notes into exact mathematical ratios of durations is correct only from a theoretical point of view, and does not match what we hear in a real performance. When a real piano is played, the lack of instrumental sensitivity risks compromising the experiment. In this case, the shortcomings of the instrumental rendering prevail over the lack of expressive deviations: the listener is inclined to reject both the deadpan and the performed versions as too far from reality, instead of critically evaluating the various degrees of sound emphasis due to harmonic context, phrase boundaries, etc. The subject who has to judge the performances in a listening test must imagine listening to piano students; otherwise her/his judgement will be altered and not very reliable. It is therefore necessary to find a way to refine the notation in use, so that the computer can communicate to the instrument a more complete translation of the score, closer to the real performance.

The subjects found that the greatest difficulty was the small difference between the three performances of the same melody; they often said: "The differences are very thin". This fact validates the initial hypothesis, and stresses the need to continue our research by considering a larger number of performing rules and a larger repertoire. The principal outcomes of this experiment are the equivalence of the nn-melodies and the rules-melodies, and the preference given to them over the deadpan melodies.
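The degrees of freedom reported in the Results, F(2,38), follow from 3 versions and 20 subjects: (3 - 1) = 2 for the effect and (3 - 1)(20 - 1) = 38 for the error term. The sketch below computes a one-way repeated-measures F statistic for a single within-subject factor. It is a simplification of the two-factor (version, melody) analysis described in the paper, and the ratings it uses are synthetic random numbers, not the experimental data; the function name is an assumption for this example.

```python
import random
from typing import List, Tuple


def rm_anova_oneway(ratings: List[List[float]]) -> Tuple[float, int, int]:
    """One-way repeated-measures ANOVA.

    ratings[s][c] is the rating given by subject s to condition c.
    Returns (F, df_effect, df_error).
    """
    n_subj = len(ratings)
    n_cond = len(ratings[0])
    if any(len(row) != n_cond for row in ratings):
        raise ValueError("every subject must rate every condition")

    grand = sum(sum(row) for row in ratings) / (n_subj * n_cond)
    cond_means = [sum(row[c] for row in ratings) / n_subj for c in range(n_cond)]
    subj_means = [sum(row) / n_cond for row in ratings]

    # Partition the total variability into condition, subject, and residual parts.
    ss_cond = n_subj * sum((m - grand) ** 2 for m in cond_means)
    ss_subj = n_cond * sum((m - grand) ** 2 for m in subj_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_cond - ss_subj

    df_effect = n_cond - 1
    df_error = (n_cond - 1) * (n_subj - 1)
    f_value = (ss_cond / df_effect) / (ss_error / df_error)
    return f_value, df_effect, df_error


# Synthetic ratings on a 1-10 scale: 20 subjects x 3 versions (not the paper's data).
rng = random.Random(0)
synthetic = [[float(rng.randint(1, 10)) for _ in range(3)] for _ in range(20)]
f_value, df_effect, df_error = rm_anova_oneway(synthetic)
print(f"F({df_effect},{df_error}) = {f_value:.2f}")
```

Removing the between-subject variability from the error term is what distinguishes the repeated-measures design from an ordinary one-way ANOVA, and it is why the error degrees of freedom are 38 rather than 57.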
5 Conclusions

In our opinion, the main reason for the preference for the nn-melodies over the rules-melodies is that the deviations depend on the contribution of several rules. When only one of these contributions is responsible for the deviations, neural networks and Sundberg's rules give the same results. When several factors act together, the additive action of Sundberg's rules system and the interpolation properties of the neural networks give different results. It is known that NNs can approximate a very wide class of non-linear functions; in particular, they work as good non-linear interpolators if the architecture and the training patterns are suitably chosen.

Figure 3: Means for the interaction between the effects of version and melody (D = deadpan; RM = rules-melody; NNM = nn-melody).

In the performance of musical scores there are many factors which influence the performance, and they interact in a complex way. It is therefore important to identify a net topology with good generalisation properties, in order to obtain meaningful results with only a limited number of significant training patterns. If we compare our NNs with a physiological system, we could say that the input neurones correspond to the optic nerves of the pianist and the output neurones correspond to the pianist's hands [Bresin et al., 1994]. From this observation and from the results of the test, it emerges that neural nets follow strategies which are closer to the performing action of a human performer, and so they can better simulate the process of performance (see Figure 1).

References

[Bresin et al., 1991] R. Bresin, G. De Poli, A. Vidolin. Un approccio connessionistico per il controllo dei parametri nell'esecuzione musicale, Proceedings of the IX Colloquio di Informatica Musicale, Genova, pp. 88-102, 1991
[Bresin et al., 1992] R. Bresin, G. De Poli, A. Vidolin. Symbolic and sub-symbolic rules system for real time score performance, Proceedings of the 1992 ICMC, International Computer Music Association, San Francisco, pp. 211-214, 1992
[Bresin, 1993] R. Bresin. MELODIA: a program for performance rules testing, teaching, and piano scores performing, Proceedings of the X Colloquio di Informatica Musicale, Milano, pp. 325-327, 1993
[Bresin et al., 1994] R. Bresin, C. Vecchio. Analysis and synthesis of the performing action of a real pianist by means of artificial neural networks, Proceedings of the 3rd ICMPC (International Conference for Music Perception and Cognition), University of Liège, 1994
[Friberg, 1991] A. Friberg. Generative Rules for Music Performance: A Formal Description of a Rule System, Computer Music Journal, vol. 15, No. 2, pp. 56-71, 1991
[Shaffer, 1981] L. H. Shaffer. Performances of Chopin, Bach and Bartok: studies in motor programming, Cognitive Psychology, 13, pp. 326-376, 1981
[Sundberg et al., 1983] J. Sundberg, L. Frydén, A. Askenfelt. What tells you the player is musical? An analysis-by-synthesis study of musical performance, in (J. Sundberg, ed.) Studies of Music Performance, Royal Swedish Academy of Music, Stockholm, 1983
[Sundberg et al., 1991] J. Sundberg et al. Performance Rules for Computer-Controlled Contemporary Keyboard Music, Computer Music Journal, MIT Press, vol. 15, No. 2, pp. 49-55, 1991