Proceedings of the International Computer Music Conference (ICMC 2009), Montreal, Canada
August 16-21, 2009
MIDIVIS: VISUALIZING MUSIC STRUCTURE
VIA SIMILARITY MATRICES
Jacek Wolkowicz, Stephen Brooks, Vlado Keselj
Dalhousie University
Faculty of Computer Science
ABSTRACT
This paper presents a technique for visualizing
symbolically encoded music stored in MIDI files. The
method is automatic and enables visualizing an entire opus
in a single image. The resulting images reveal the overall structure
of a piece as well as the detailed leading of its themes.
The technique proposed in the paper is suitable for
many types of music (both classical and popular), although the
quality of the visualization depends heavily on the quality
of the input MIDI file. A program that creates visualizations
with this technique and previews them with audio
playback is made available for use within the community.
1. INTRODUCTION
Music visualization systems work with two types of data:
raw recordings and various forms of sheet music. There are
also two different target groups for such visualizations:
untutored audiences and musical experts. The former's
needs are quite simple: a display that
follows the music in some way. Such visualization systems
are present in multimedia players. Professional
visualizations, on the other hand, are designed to convey
specialized information to users proficient in the music
domain. These include various ways of presenting
waveforms for sound engineers. Other approaches
use symbolic music representations, such as sheet
music, to help performers better understand a given
opus. The solution presented in this paper is designed both
for music performers, to help them understand the
structure of a piece, and for laymen, to follow the flow of the music.
2. PREVIOUS WORK
This work builds upon the central concept presented in
the paper by Foote [2]. In this approach a raw music recording
is taken as input data. The visualization is organized as a
rectangular image in which each pixel (at position i, j)
expresses the audio similarity that results from cepstral
analysis of the two corresponding excerpts (frames) of the
piece at times i and j (Figure 1a). Cepstral analysis has been
shown to approximate human perception of audio signals; i.e.
fragments that sound similar to humans tend to have
similar cepstral coefficients. Arranging the similarities in a
rectangular image makes it possible to track dependencies
within a music piece.
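The pixel layout described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the implementation from [2]: the per-frame feature vectors are made-up stand-ins for cepstral coefficients, cosine similarity is an assumed choice of measure, and the names `cosine_similarity`, `similarity_matrix` and `frames` are hypothetical.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        # Assumption: an all-zero (silent) frame is treated as dissimilar.
        return 0.0
    return dot / (norm_u * norm_v)


def similarity_matrix(frames):
    """Return S, where S[i][j] is the similarity of frame i and frame j."""
    n = len(frames)
    return [
        [cosine_similarity(frames[i], frames[j]) for j in range(n)]
        for i in range(n)
    ]


# Hypothetical per-frame feature vectors (stand-ins for cepstral coefficients).
frames = [
    [1.0, 0.2, 0.0],
    [0.0, 1.0, 0.5],
    [1.0, 0.2, 0.0],  # repeats frame 0
]
S = similarity_matrix(frames)
```

Because frame 2 repeats frame 0, `S[0][2]` is close to 1; when the matrix is rendered as an image, such repeats appear as bright pixels away from the main diagonal.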
Figure 1. Bach, Prelude in C major: an excerpt visualized
using three methods.
Using a single-channel recording as input data simplifies
the problem, since there is just one object to
compare at each time frame. An actual composition, however,
usually consists of several separate
channels of musical information. With recordings, one
cannot separate those logical channels, and important
details may remain hidden. Foote presented a sample
similarity matrix that results from analyzing a MIDI file [2],
but his simplifications in this area remain significant: one
channel with one note compared at a time. Symbolic
representations hide all performance-dependent
features but carry the entire structural information;
incorporating this information is addressed in the
presented solution. An output of the proposed
method is presented in Figure 1b and is explained in
later sections.
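The one-channel, one-note-at-a-time simplification mentioned above can be illustrated as follows. This is a minimal sketch under stated assumptions, not the procedure from [2]: similarity is taken to be plain pitch equality, the melody is a made-up monophonic line, and `note_similarity_matrix` is a hypothetical name.

```python
def note_similarity_matrix(pitches):
    """S[i][j] is 1.0 when notes i and j share a MIDI pitch, else 0.0.

    Assumption: pitch equality stands in for whatever measure a real
    system would use; only one note per time step is considered.
    """
    return [[1.0 if p == q else 0.0 for q in pitches] for p in pitches]


# Hypothetical monophonic line as MIDI note numbers: C4 E4 G4, played twice.
melody = [60, 64, 67, 60, 64, 67]
M = note_similarity_matrix(melody)
```

The repeated three-note figure shows up as a stripe of ones parallel to the main diagonal, offset by three notes. Chords and parallel voices cannot be expressed in this form at all, which is the limitation the text points out.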
Symbolic representation is usually a better form for
analysis, even though using it raises the problem
of converting a recording to sheet music. This was
shown in the paper describing the ImproVis system [4],
where the manual transcription of recorded performances
practically prohibited wide application of the method.
The most practical approach is therefore the
visualization of existing, symbolically coded music, such
as MIDI files. However, most of the existing MIDI
visualization systems either simplify the visualization
problem by just modifying Western music notation in order
to add some other visual features (colour, line thickness)