Page  00000001 Integrating Bayesian Networks with Logic Programs for Music R. Morales Laboratorio de Informatica Musical Universidad de Guanajuato Guanajuato, Mexico roberto@quij ote.ugto.mx Abstract An adequate representation formalism for music must be able to express relations between musical objects, such as intervals between notes or counterpoint rules between voices, as well as preferences between musical objects. Logic provides an adequate formalism to express relations while Bayesian networks can be used to capture preferences as probability distributions. Neither representation is adequate for both aspects, so a reasonable approach is to combine both techniques. This paper describes how to augment Bayesian networks with logic nodes, that can have arbitrary logic programs in the form of Horn clauses. In particular, each logic node represents a binary stochastic variable that express whether or not a particular relation holds between other variables. This allows to express relational concepts and include background knowledge in a Bayesian network. The approach is illustrated with the representation of counterpoint rules. 1 INTRODUCTION Musical composition generates symbolic representations (i.e., musical scores) of musical ideas. Such ideas are based on subjective temporal interpretations of auditive events. The events are characterized by their frequency, amplitude and its envelope (which determines the quality of tone or pitch). Such elements, which define the musical characteristics of the musical instruments, are part of the material which a composer uses to propose its aesthetic solutions. During this process, a composer can follow a set of implicit or explicit rules to guide his/her preferences and express his/her ideas. Our goal is to represent musical criteria rules that can be used for musical analysis and generation. Such representation must allow us to express relations in music (such as intervals or progression in motion of the notes) and preferences between rules. Logic provides the right formalism to express relations between musical objects, while preferences can be captured with probability distributions. Bayesian networks [6] have been established as a common and powerful tool to treat domains with un E. Morales, L.E. Sucar Computer Science Department ITESM - Campus Morelos Temixco, Morelos, 62059, Mexico {emorales,esucar}@campus.mor.itesm.mx certainty. However, due to its propositional framework, it is difficult to express relations between variables or to incorporate additional background knowledge. More recently, there has been some work in combining probabilistic graphical models, mainly Bayesian networks, and first-order logic. These developments can be classified into two main approaches: attaching probabilities to logical sentences and including logical expressions within a Bayesian network. An example of the first approach is the first-order probabilistic logic introduced by Halpern [2] and extended by Koller [3]. The second approach is taken in the probabilistic network of predicates presented by Lin [4]. In the first-order probabilistic logic of Halpern [2], probabilities are expressed using a modal operator, Pr. A formula is associated with a degree of belief, which denotes the set of worlds where the formula holds. This logic allows the interleaving of first-order quantifiers and modal operators, and it can represent conditional probability expressions. Koller and Halpern [3] use Markov Networks [6] to represent independencies in this logic, which are used to handle irrelevance and substitution. These ideas are applied to statistical knowledge bases. Lin [4] introduces a probabilistic network where nodes are interpreted as unary predicates. This model can represent recursive relations, imprecise observations and more general explanations. The nodes in this network represent event types (unary predicates) and the links represent "isa" and "feature" relations. Given a set of observations, it uses an abductive reasoner to get the best explanation. This network of events is restricted to unary predicates so it can not represent relations that require higher-order predicates. In this paper, we followed the second approach, that is, including logical expressions within a Bayesian network. In particular, Bayesian networks are augmented with logic nodes that represent binary variables expressing arbitrary relational concepts. Instead of combining first-order logic and probability in a "single" representation, we propose a hybrid representation with logical and probabilistic parts ("nodes") that interact through a hybrid reasoning mechanism.

Page  00000002 The paper also show how a logic node can be converted into a stochastic variable, with a deterministic conditional probability matrix, and used within a Bayesian network. In particular, we illustrate the approach with the representation of counterpoint rules. Section 2, discusses the proposed representation and its inference mechanism. Section 3 explains how the proposed approach can be applied to music. Finally, section 4 gives conclusions and future research directions for this work. 2 REPRESENTATION AND REASONING The proposed representation is a combination of logic and probabilistic nodes. Logic nodes are arbitrary logic programs in the form of Horn clauses and probabilistic nodes are multi connected Bayesian networks. The hybrid network is, at this stage, singly connected (poly-tree). Logic and probabilistic nodes have common variables; that is, some random variables could be inputs to a logic node. The logic node, which express a relation between its input variables, represents by itself a stochastic binary variable. This node evaluates to true when the relation that is expressed in it holds for certain values of its input nodes. Since the input nodes are stochastic, the relation might hold for some values with certain probabilities and not for other with another probability values. A logic node can be connected to other random variables of a Bayesian network. The probabilistic nodes "see" a logic node as a binary variable with associated probabilities. We assume that the input variables to a logic node are marginally independent. Reasoning within a probabilistic node is performed using traditional probabilistic propagation techniques; while in the logic nodes reasoning is based on Prologstyle resolution combined with stochastic simulation. The simplicity of the reasoning mechanisms employed on Bayesian networks and logic programs is kept, while gaining expressive power by combining both types of representations. Within our formalism, wherever there are only stochastic variables, the probabilistic reasoning mechanism used in Bayesian networks is used [6, 5]. Reasoning within logic nodes uses a Prolog resolutionbased reasoning mechanism. When we have links between probabilistic and logic nodes we have to combine the results from the above reasoning strategies. Consider two stochastic variables x and y which have a relation R(x, y) between them and from which we would like to have another stochastic binary variable z (representing the probability values of the relation). The true/false probabilistic values of z depend on the probabilistic values of x and y and whether or not the relation R is satisfied or not for those values. We can express this, considering that x and y are independent, as follows: P(z) j j R(x, y)P(x)dxP(y)dy where P(i) is the probability distribution of variable i. In other words, we are estimating the probability values of a new variable z from a relation R between variables and their probability values. Considering discrete variables (as they are usually used in Bayesian networks), we have: P(z)= R(x, y)P(x)P(y) Y X Following this approach, the stochastic representation of logic nodes are all the possible combinations of values of the stochastic variables linked to the logic node. These values are associated with a probability that depends on the probability distribution of its input. To estimate the probability of the values, we assume that the input variables are independent (as it is the case in a poly-tree), so we multiply them. 3 REPRESENTING MUSIC The proposed approach is illustrated with the representation of counterpoint rules and notes with different a priori probability distributions. Different relations in harmony can be easily represented in logic. These include the classification of intervals (distances in height between two notes) into: consonance and dissonances. Unison, fifth and octave are perfect consonance while third (mayor and minor) and sixth (mayor and minor) are imperfect consonance. Second (mayor and minor), fourth, augmented fourth, diminished fifth and seventh (mayor and minor) are dissonances. These are the elements which account for all harmony in music. The purpose of harmony is to give pleasure by variety of sounds through progressions from one interval to another. Progression is achieved by motion, denoting the distance covered in passing from one interval to another in either direction, up or down. This can occur in three ways: direct, contrary and oblique: * direct motion: results when two or more parts ascend or descend in the same direction * contrary motion: results when one part ascends and the other descends, or vice versa. * oblique motion: results when one part moves while the other remains stationary These concepts can be used to define counterpoint rules [1]. In particular the counterpoint rules of 1st species are defined as follows:

Page  00000003 First rule: from one perfect consonance to another perfect consonance one must proceed in contrary or oblique motion. Second rule: from a perfect consonance to an imperfect consonance one may proceed in any of the three motions. Third rule: from an imperfect consonance to a perfect consonance one must proceed in contrary or oblique motion. Fourth rule: from one imperfect consonance to another imperfect consonance one may proceed in any of the three motions. The first rule can be expressed in logic (using Prolog notation) as follows: counterrulel([Notel,Note2], [Note3,Note4]): - interval(Notel,Note3,Interl), interval(Note2,Note4,Inter2), detinter(Notel,Note2,Inter3), detinter(Note3,Note4,Inter4), inter_classl (Notel,Note2,valid,Inter3), inter_classl(Note3,Note4,valid,Inter4), inter_class2 (Notel,Note3,Interl,perf_cons), inter_class2 (Note2,Note4,Inter2,perf_cons). where: * interval(Notel, Note2, Interval): describes the interval between two notes from different voices * det_inter(Notel, Note2, Interval): describes the interval between two notes of the same voice * interclassl (Notel,Note2, Valid,Interval): describes if two notes from the the same voice have a valid/invalid interval. Where valid intervals can be consonance or dissonances which follow the same tonality (i.e., a 7th. or augmented 4th. would be invalid) * interclass2 (Notel,Note2, Inter, Conso): describes if two notes of different voices form a perfect or imperfect consonance or a dissonance The particular notes used for a composition satisfying counterpoint rules depends on the preferences of the composer. The counterpoint rule (logic node) defined above, can be linked to two stochastic variables (musical voices of two notes each) expressing the preferences of a composer. Circles are used to denote stochastic variables and squares to denote logic nodes (see figure 1). Let us suppose that the composer considers only three possible values (pair of notes) for each voice with different probability distributions reflecting his preferences, given as follows: Figure 1: An example combining Bayesian networks with logic. P(voicel = [d4,f4]) = 0.4, P(voice2 = [a4,a4]) = 0.5 P(voicel = [d4,a4]) = 0.3,P(voice2 = [d5,c5]) = 0.3 P(voicel = [d4,e4]) = 0.3,P(voice2 = [d5,g4]) = 0.2 To obtain the probability distribution of counterrulel (binary stochastic variable) given the relation described above in the logic program, we need to evaluate the predicate counterpoint for all the possible combinations of values of voicel and voice2. This is done with stochastic simulation procedure, in which the logic node evaluation is weighted by the input variables probabilities. For this example, there are 5 combinations that satisfy the predicate: Pr(voicel=[d4,f4])= 0.4, Pr(voice2=[a4,a4]) = 0.5, Pr(CRl=true) in this case is 0.2 Pr(voicel=[d4,a4]) = 0.3, Pr(voice2=[a4,a4])= 0.5, Pr(CRl=true) in this case is 0.15 Pr(voicel=[d4,a4]) = 0.3, Pr(voice2=[d5,c5]) = 0.3, Pr(CRl=true) in this case is 0.09 Pr(voicel=[d4,e4]) = 0.3, Pr(voice2=[d5,c5]) = 0.3, Pr(CRl=true) in this case is 0.09 Pr(voicel=[d4,e4]) = 0.3, Pr(voice2=[d5,g4]) = 0.2, Pr(CRl=true) in this case is 0.06 For the rest of the possible values (4) counterrulel (CR1) is false. The probability distribution of counterrulel in this example is: P(counterrulel=true) = 0.59 P (counterrulel=false)= 0.41 Bayesian network can reason from any set of known variables and evaluate the probability distributions of the rest of the variables. Logic programs (pure Prolog) can sometimes work in this way. For instance, with the above definition of counter_rulel, we could estimate the most probable values of the notes that fulfill the counterpoint rule. Knowing, for instance that there is a piece which satisfies counterpoint rules and the notes of one voice we could estimate the possible values of the other voice. Assuming a finite set of possible values for the voices, we can also estimate all the possible values of voices that satisfied the predicate counterrulel.

Page  00000004 Figure 2: An example combining Bayesian networks with logic. The above approach has several benefits: (i) the inference mechanism employed for the stochastic variables (nodes) and for the logic nodes are kept independent of each other, allowing efficient reasoning mechanisms, (ii) Bayesian networks can include with this approach background knowledge and relational concepts, which can facilitate temporal reasoning with uncertainty in music. Alternatively, a logic node can be converted into a stochastic variable within a Bayesian network following a simple procedure. The predicate of the logic node is evaluated for all the possible values of its arguments and a conditional probability matrix is constructed with I's in the combination of values where the predicate evaluate to true and O's otherwise. In the above example and considering all possible values of the voices, our matrix will have all the possible combinations of voices that satisfy the counterpoint rule predicate with I's and the rest of the combinations with O's. The above approach can be extended to include, for instance, rhythm preferences by the user. Different voices can be related with different rhythms, which can be represented in our approach with two additional nodes: one stochastic node which captures rhythm preferences and logic node that relates rhythms with voices (see figure 2). An interesting feature of this representation, is that each network can be grouped into a single "macro"node and we can combine different "macro"-nodes with higher level logic relations. 4 CONCLUSIONS AND FUTURE WORK Many applications require the ability to reason with uncertainty. Bayesian networks has been established as a common and powerful tool to treat domains with uncertainty. However, due to its propositional framework, it is difficult to express relations between variables. In this paper we described an approach where Bayesian networks are augmented with logic nodes. These nodes can have arbitrary logic programs in the form of Horn clauses expressing relations between variables. Bayesian nodes "see" logic nodes as binary variable with associated probabilities that express whether the relation in the logic node is true or false (depending on its input values). The inference mechanism employed for the stochastic variables (nodes) and for the logic nodes are kept independent of each other, allowing efficient reasoning mechanisms. With the proposed approach, Bayesian networks can include background knowledge and relational concepts, which can facilitate temporal reasoning with uncertainty in music. The proposed representation was illustrated with examples of counterpoint rules but can be easily extended to other musical knowledge. As part of our future work, we plan to construct a large network with musical knowledge that could assist a composer in his/her work. We also planning to extend the approach to multi-connected Bayesian networks and to explore algorithms to learn this type of networks from data and background knowledge. References [1] Mann, A. Study of Counterpoint from Johann Joseph FUX's Gradus ad parnassum. W.W.Norton & Company, 1971. [2] J. Y. Halpern, Analysis of First-Order Logics of Probability, Artificial Intelligence 46: 311-350, 1990. [3] D. Koller, J. Y. Halpern, Irrelevance and Conditioning in First-Order Probabilistic Logic, in Proceedings of the Thirteenth National Conference on Artificial Intelligence (AAAI-96), pp. 569-576, 1996. [4] D. Lin, A Probabilistic Network of Predicates,. In Proceedings of the eight conference: Uncertainty in Artificial Intelligence (UAI-92), pp. 174-181, 1992. [5] R. Neapolitan, Probabilistic Reasoning in Expert Systems, Addison-Wesley, 1990. [6] J. Pearl, Probabilistic reasoning in intelligent systems, Morgan, 1988.