Harmonizing in Real-Time with Neural Networks
Karin Höthker (firstname.lastname@example.org)
Dominik Hörnel (email@example.com)
Institut für Logik, Komplexität und Deduktionssysteme
Universität Karlsruhe (TH)
Am Fasanengarten 5, D-76128 Karlsruhe, Germany

ABSTRACT

The task of real-time harmonization is characterized by the fact that only incomplete knowledge about the input melody is available. In particular, the global structure (phrase segmentation, overall harmonic relationships) and the longer-term melodic and tonal development cannot be taken into account. In this paper we introduce the method of dynamic tonal center forecasting to improve real-time performance. The main idea of our approach is to predict changes of the underlying key, i.e. tonal centers, and to learn harmonic progressions relative to these tonal centers. In this way, both the short-term harmonic prediction performance and the longer-term tonal development can be improved. Our demonstration system HARMOTHEATER offers the possibility to interactively evaluate the proposed real-time harmonization system on arbitrary melodies entered on a MIDI keyboard. The results can be compared to a posteriori harmonizations of the same melodies, based on a fundamental key, carried out by the neural network-based system HARMONET. Neural networks for both systems have been trained with four-part chorales of J. S. Bach and Max Reger.

1. INTRODUCTION

Many different computational paradigms have been applied to the task of inventing four-part harmonizations of a melody. For example, an elaborated rule-based system has been presented to describe the Bach chorale style; another approach is based on the Lerdahl-Jackendoff theory; a third harmonizes melodies by completing them with a Boltzmann machine. The system HARMONET uses feed-forward neural networks to model harmonic knowledge. Four-part chorales have also been evolved by a genetic algorithm.
The systems mentioned above harmonize melodies a posteriori, so that they can take advantage of the overall melodic structure. Real-time accompaniment systems, in contrast, must predict the harmonic development of a song while playing unknown melodies. Several systems have been developed for learning Jazz-style chord sequences. One learning algorithm performs real-time chord prediction in Jazz song contexts; its basic idea is to learn a probability estimate of a chord occurrence given its predecessors. Another predictor combines prior knowledge (acquired by means of offline learning) with on-the-fly knowledge (acquired by means of online learning): this hybrid system is composed of a neural network which encompasses prior knowledge about typical chord sequences, and a rule-based sequence tracker which looks for recurrent chord sequences in real time. Both systems take advantage of the fact that Jazz songs are organized in terms of well-defined chord blocks, called sections, which usually last 4 or 8 measures. These sections are repeated along the song, sometimes with slight differences. The situation is more complicated if the melody cannot be easily structured, e.g. for chorale melodies. For this reason, only a few researchers have dealt with real-time harmonization in classical harmonization styles. One approach addresses the task of inventing Bach-like chord progressions in real time, learned by rule-based neural networks. However, all melodies are assumed to be in a fixed C major key.

2. LEARNING HARMONIC PROGRESSIONS

The harmonization task consists in finding a suitable sequence H(0), H(1), ... of harmonies or chords accompanying a given melody. Each harmony H(t) is usually determined based on a local window which describes its harmonic and melodic context. In HARMONET, neural networks are trained to predict the harmony at time t, given information about previous harmonies and the melody from time t - 1 to time t + 1.
The training patterns for the harmonic prediction (HP) network are obtained by extracting information from a local context at any time t as follows (see Figure 1): The target to be learned is the harmony H(t) which accompanies the melody quarter s(t). The input vector consists of the left harmonic context (H(t - 3), H(t - 2), H(t - 1)) and the surrounding melodic context (quarters s(t - 1), s(t), s(t + 1)). As further input features, we use the position of t relative to the beginning or the end of the musical phrase (phr(t)), and we distinguish between stressed and unstressed positions of t (str(t)).

Figure 1: Input features and target value at quarterbeat t. The question mark indicates the harmony to be learned.

The vector H(t) describes which harmonic function relates the key of the harmony to the fundamental key of the piece. From the set of harmonic functions relevant in our context, the most important are the tonic T, the dominant D, the subdominant S and the tonic parallel Tp (I, V, IV and VI, respectively, in Roman numeral notation). For real-time harmonization, phrasing and stressing information (phr(t) and str(t)) as well as the further melodic development s(t + 1) are not available in general. When this information is omitted, however, the performance of the corresponding real-time harmonic prediction (RTHP) network drops. Furthermore, melodies played in real time are not necessarily based on a fixed fundamental key. Therefore the harmonization should be able to dynamically adapt to modulating melodies. In Bach chorales, we can find many local key changes which allow us to represent a harmony sequence relative to a higher-scale sequence of dynamically evolving tonal centers TC(0), TC(1), .... In the following, we present a two-scale model for recognizing, learning and producing modulating harmonizations, and show that the classification performance for harmonic progressions can be significantly improved if the harmonies within a chorale are analyzed relative to local tonal centers.

3. TONAL CENTER ANALYSIS

The analysis and recognition of tonal centers within a music piece is a challenging task which usually does not have an unambiguous solution. Rule-based approaches to the automatic computation of tonal centers have been proposed. Their basic idea is to search for harmonic transitions that confirm a tonal center.
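The window extraction described above can be sketched in a few lines. This is a minimal sketch, assuming the chorale data is available as simple per-quarterbeat lists; the function names and data layout are illustrative and not HARMONET's actual implementation:

```python
def hp_pattern(melody, harmony, t, phr, stress):
    """Input features and target for the HP network at quarterbeat t.

    melody, harmony: lists indexed by quarterbeat; phr, stress:
    per-beat phrase-position and stress features (names illustrative).
    """
    x = [harmony[t - 3], harmony[t - 2], harmony[t - 1],   # left harmonic context
         melody[t - 1], melody[t], melody[t + 1],          # surrounding melody
         phr[t], stress[t]]                                # phrasing / metric info
    y = harmony[t]                                         # target H(t)
    return x, y

def rthp_pattern(melody, harmony, t):
    """Real-time variant: no phrasing, stress, or future melody note."""
    x = [harmony[t - 3], harmony[t - 2], harmony[t - 1],
         melody[t - 1], melody[t]]
    return x, harmony[t]
```

The only difference between the two pattern types is the pruned input vector of the real-time variant, which is what causes the performance drop discussed above.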
Thereafter, potential tonal centers and corresponding harmonies are determined. Using a tonal center model, it is tested for each chord whether it can be represented relative to the tonal center. Unexplained sections of the harmony sequence are treated with a specific modulation model. Since this algorithm is not very flexible and is difficult to implement, we have chosen a statistical approach to tonal center recognition, based on a training set, which gives similar results in most cases. It uses a kind of bootstrapping strategy: First, a harmonic analysis of the training examples is performed. From this analysis, the frequencies of occurrence (prior probabilities P) of the analyzed harmonic functions are computed. These values are used to reanalyze a music example relative to all 24 major and minor keys. At time t, a key is chosen as tonal center TC(t) if it maximizes the prior probabilities summed over a fixed context of k harmonies on each side:

    TC(t) = argmax_key  Σ_{i=-k}^{k}  P([H(t + i)]_key)        (1)

where [H]_key represents harmony H as a harmonic function relative to key. If one or more context harmonies cannot be represented relative to a tonal center, then this tonal center is not considered. We got the best results for k = 1 in our experiments. Larger context sizes lead to longer tonal center sections, but local changes of tonal center are recognized less reliably. In a second step, isolated tonal centers (TC(t) ≠ TC(t - 1) and TC(t) ≠ TC(t + 1)) are eliminated and the extensions of the tonal center sections are determined. Figure 2 shows two analyses of the first phrase of a Bach chorale, one relative to the fundamental key and the other considering tonal centers. Notice that the tonal center analysis leads to a simplified, musically coherent harmonic description, because the second half of the phrase (S-SS-T-S or IV-bVII-I-IV in A major) is identified as a simple cadence (T-S-D-T or I-IV-V-I) in the new tonal center D major.
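Equation (1), together with the representability check, can be sketched as follows. Here `prior` and `as_function` are hypothetical interfaces standing in for the corpus statistics and the harmonic analysis; they are not part of the original system:

```python
ALL_KEYS = [root + mode for root in "C Db D Eb E F Gb G Ab A Bb B".split()
            for mode in ("maj", "min")]  # the 24 candidate tonal centers

def tonal_center(harmonies, t, prior, as_function, k=1):
    """Pick TC(t) maximizing the summed prior probabilities of the
    harmonic functions [H(t+i)]_key over the context i = -k .. k.

    prior: maps harmonic functions (T, D, S, ...) to corpus frequencies.
    as_function(h, key): returns the function of chord h relative to key,
    or None if h cannot be represented in key (hypothetical interfaces).
    """
    best_key, best_score = None, float("-inf")
    for key in ALL_KEYS:
        funcs = [as_function(harmonies[t + i], key) for i in range(-k, k + 1)]
        if None in funcs:   # a context chord is not representable: skip this key
            continue
        score = sum(prior[f] for f in funcs)
        if score > best_score:
            best_key, best_score = key, score
    return best_key
```

The post-processing step (eliminating isolated tonal centers and extending the sections) would then run over the resulting TC(0), TC(1), ... sequence.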
Figure 2: Harmonic analysis of the beginning phrase of the Bach chorale "Ach bleib bei uns, Herr Jesu Christ", (a) relative to the fundamental key, (b) considering tonal centers.

4. TONAL CENTER LEARNING

Our real-time harmonization system uses a two-scale learning model. At the higher level, a sequence of tonal centers is determined. At the lower level, harmonic progressions relative to these tonal centers are computed. Once the tonal centers have been determined, the harmonic progressions are learned by the HP or RTHP network. Since the representation of harmonies relative to their tonal centers yields a simplified harmonic description, the classification performance of the networks can be significantly improved. To show this effect, we have chosen two sets of Bach chorales and a set of Reger chorales. For every set, we trained HP and RTHP feed-forward neural networks with and without tonal center analysis. The networks were tested on independent test sets of the same style.

Table 1: Percentage of correctly classified harmonies for (RT)HP networks with and without tonal centers, (a) trained on 100 Bach chorales, tested on 43; (b) trained on 20 Bach chorales, tested on 15; (c) trained on 17 Reger chorales, tested on 6.

                   (a) HP   (a) RTHP   (b) HP   (b) RTHP   (c) HP   (c) RTHP
    fixed key       69.1      67.7      70.0      67.3      49.7      46.7
    tonal centers   83.1      80.9      79.1      78.9      69.3      64.2

Table 1 shows that the percentage of correctly classified harmonies increases by 9.1 to 19.6 percentage points if tonal centers are considered. Note that the input features of a network are identical whether the data is analyzed with respect to tonal centers or relative to a fixed fundamental key. This implies that the increase of the classification rate from the first to the second row of Table 1 is due entirely to the tonal center analysis, and not to the selected input features. When comparing the HP and RTHP networks for each setting, the classification performance of the HP network is superior to that of the corresponding RTHP network, due to the additional features, such as the further melodic development, in the HP input patterns. The classification rate for the Reger style is generally lower, because the networks have to learn harmonizations of higher complexity from fewer training examples than for the Bach style. An essential component of the system is the computation of the tonal centers.
We use a tonal center change (TCC) network, which decides whether a tonal center change takes place or not, and a tonal center prediction (TCP) network, which computes a new tonal center if the TCC network has decided to change the tonal center. The TCC network takes its decisions according to the melody notes s(t), s(t - 1) and s(t - 2), the previous harmony H(t - 1) and the previous tonal center TC(t - 1). The TCP network selects a new tonal center TC(t) according to the current melody note s(t), the previous tonal center TC(t - 1) and the previous harmony H(t - 1). Figure 3 shows a melody harmonized with the proposed approach.

5. THE HARMOTHEATER SYSTEM

Figure 4 gives an overview of the HARMOTHEATER system. The Real-Time Module offers the possibility to interactively evaluate the proposed real-time harmonization model on arbitrary melodies entered on a MIDI keyboard. The results can be compared to a posteriori harmonizations of the same melodies without tonal centers, carried out by the neural network-based system HARMONET (Offline Module). Neural networks for both modules have been trained with four-part chorales of J. S. Bach and Max Reger. Additionally, a "random style" with untrained neural networks and original examples give an idea of the space of potential harmonizations. The user interface is kept very simple so that it can be used by people with little musical background knowledge. It is built like a stage - hence the name HARMOTHEATER - on which the "virtual composers" Reger, Bach and Random are acting (see Figure 5). The Real-Time Module performs the harmonization in the following way: When a note is played by the user, the TCC network decides whether a tonal center change takes place or not. If necessary, the TCP network predicts the next tonal center (Tonal Center Prediction). The RTHP network computes the harmony relative to the current tonal center.
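One step of this real-time loop (TCC decision, optional TCP prediction, RTHP harmony) might be organized as in the following sketch. The three networks are represented by arbitrary callables, and all names and data structures are illustrative, not the system's actual code:

```python
def harmonize_note(s, state, tcc, tcp, rthp):
    """Process one new melody note s(t) in the real-time loop.

    tcc, tcp, rthp stand in for the three trained networks; state holds
    the running melody, harmony and tonal center history.
    """
    notes, harmonies, tc = state["notes"], state["harmonies"], state["tc"]
    notes.append(s)
    # 1. Does the tonal center change? Inputs: s(t), s(t-1), s(t-2),
    #    previous harmony H(t-1), previous tonal center TC(t-1).
    if tcc(notes[-3:], harmonies[-1], tc):
        # 2. If so, predict the new tonal center from s(t), TC(t-1), H(t-1).
        tc = tcp(s, tc, harmonies[-1])
        state["tc"] = tc
    # 3. Predict the harmony relative to the current tonal center.
    h = rthp(harmonies[-3:], notes[-2:], tc)
    harmonies.append(h)
    return h, tc
```

Keeping the tonal center in the running state is what lets the lower-level RTHP network work with the simplified, center-relative harmonic representation described in Section 4.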
Then, the pitches for the alto, tenor and bass voices are chosen according to the harmonic function and additional rules from musicological textbooks (Real-Time Harmonization). The resulting harmony is immediately sent to the output port.

Figure 4: Overview of the HARMOTHEATER system.

Figure 3: Harmonization by HARMOTHEATER modulating from C major through e minor to G major (Bach style).
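The voice-assignment step can be imagined roughly as follows. This sketch uses a toy chord-tone table and MIDI voice ranges invented purely for illustration; the actual textbook rules (voice leading, doubling, spacing) are far richer, and none of these names or values come from the system itself:

```python
# Toy chord-tone table (pitch classes in C major) and MIDI voice ranges.
CHORD_TONES = {"T": [0, 4, 7], "S": [5, 9, 0], "D": [7, 11, 2]}
RANGES = {"alto": (55, 74), "tenor": (48, 67), "bass": (40, 60)}

def assign_voices(function, soprano):
    """Choose alto, tenor and bass pitches below the soprano, one chord
    tone each, respecting the simple ranges above (no fallback handling)."""
    root, third, fifth = CHORD_TONES[function]
    voices, ceiling = {}, soprano
    for voice, tone in (("alto", third), ("tenor", fifth), ("bass", root)):
        lo, hi = RANGES[voice]
        # highest instance of the chord tone within range and below the
        # voice above it
        pitch = max(p for p in range(lo, min(hi, ceiling)) if p % 12 == tone)
        voices[voice] = pitch
        ceiling = pitch
    return voices
```

For a tonic harmony under a soprano C5 (MIDI 72), this yields a root-position C major chord spread over the three lower voices.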
Figure 5: Screenshot of the HARMOTHEATER user interface.

By selecting a composer on the stage and clicking the Compose button, the last melody played in real time is retrieved from memory and harmonized a posteriori in the chosen style: First, metrical information is inferred from the note durations via quantization, resulting in a score of the melody (Quantization). Secondly, the melody is harmonized with neural networks which take advantage of additional features such as the next melody note (Offline Harmonization). Finally, the harmonized melody is played.

6. CONCLUSION

We have presented a system that harmonizes arbitrary melodies in real time. The system allows the comparison of the results to a posteriori harmonizations of the same melodies carried out by a neural network-based system. A significant increase in performance could be achieved by dynamically forecasting tonal centers based on the development of the melody. The proposed method is suitable for melodies improvised on a keyboard in real time, but can also be used to harmonize melodies a posteriori. For non-improvised melodies, however, the results are sometimes poorer with tonal center prediction than without, if the current tonal center is not correctly predicted, thus interrupting the flow of the overall harmonic development. One could therefore try to concurrently evolve harmonies and tonal centers within a genetic framework assessed by the above-mentioned networks, as has been proposed for the completion of melodies. Initial experiments suggest that harmonizations are created which exhibit a more elaborate tonal structure while maintaining the quality of local harmonic predictions.

References

M. I. Bellgard and C. P. Tsang, 1996. "On the Use of an Effective Boltzmann Machine for Musical Style Recognition and Harmonisation." Proceedings of the 1996 International Computer Music Conference. Hong Kong: International Computer Music Association, pp. 461-464.

U. S. Cunha and G. Ramalho, 2000. "Combining Off-line and On-line Learning for Predicting Chords in Tonal Music." To appear in Proceedings of the 2000 AAAI Conference, AI & Music workshop. Austin, Texas.

K. Ebcioglu, 1988. "An Expert System for Harmonizing Four-part Chorales." Computer Music Journal 12(3), pp. 43-51.

H. Hild, J. Feulner and W. Menzel, 1991. "HARMONET: A Neural Net for Harmonizing Chorales in the Style of J. S. Bach." Advances in Neural Information Processing 4 (NIPS 4). J. E. Moody, S. J. Hanson, R. P. Lippmann (eds.), Morgan Kaufmann, pp. 267-274.

D. Hörnel and K. Höthker, 1998. "A Learn-Based Environment for Melody Completion." Proceedings of the XII Colloquium on Musical Informatics, Gorizia, pp. 121-124.

D. Hörnel and W. Menzel, 1998. "Learning Musical Structure and Style with Neural Networks." Computer Music Journal 22(4), pp. 44-62.

J. Langnickel and J. Feulner, 1998. "Informationsstrukturen in der Musik: Analyse und Modellierung mit Methoden der Informatik an der Universität Karlsruhe." Musik und Neue Technologie Bd. I - KlangArt-Kongress 1995. Universitätsverlag Rasch, Osnabrück, pp. 295-313.

H. J. Maxwell, 1992. "An Expert System for Harmonic Analysis of Tonal Music." Understanding Music with AI, M. Balaban, K. Ebcioglu, O. Laske (eds.), AAAI Press, pp. 335-353.

R. A. McIntyre, 1994. "Bach in a Box: The Evolution of Four Part Baroque Harmony Using the Genetic Algorithm." Proceedings of the 1994 IEEE Conference on Evolutionary Computation. Orlando, Florida, pp. 852-857.

R. R. Spangler, R. M. Goodman and J. Hawkins, 1998. "Bach in a Box - Real-Time Harmony." Advances in Neural Information Processing 10 (NIPS 10). M. I. Jordan, M. J. Kearns, S. A. Solla (eds.), MIT Press, pp. 957-963.

H. Taube, 1999. "Automatic Tonal Analysis: Toward the Implementation of a Music Theory Workbench." Computer Music Journal 23(4).

B. Thom and R. Dannenberg, 1995. "Predicting Chordal Transitions in Jazz: the Good, the Bad, and the Ugly." Proceedings of the VI Workshop on Music and AI, International Joint Conference on Artificial Intelligence IJCAI-95. Montreal.