present a brief introduction to the modal TFD as it forms
the basis for our new frequency-dependent computation.
Cohen's class of bilinear TFD's can be written as² [2]:
$$C(t,\omega;\phi) = \frac{1}{4\pi^2} \iiint s^*(u-\tau/2)\, s(u+\tau/2)\, \phi(\theta,\tau)\, e^{-j\theta t - j\tau\omega + j\theta u}\, du\, d\tau\, d\theta \tag{1}$$
The input signal is $s(t)$ and the function $\phi(\theta, \tau)$ is
called the kernel, described here in the $(\theta, \tau)$ ambiguity
domain. (The $(\theta, \tau)$ ambiguity domain is related to
the $(t, \omega)$ time-frequency domain by a two-dimensional
Fourier transform (see Figure 1).) As developed by
Cohen, the kernel completely determines the properties
of the time-frequency representation.
[Figure 1 diagram: the temporal correlation domain $(t, \tau)$, the time-frequency domain $(t, \omega)$, the spectral correlation domain $(\theta, \omega)$, and the ambiguity domain $(\theta, \tau)$.]
Here $R(t, \tau; h_T(t))$ in (4) is the time-smoothed local autocorrelation function. The
notation $\mathcal{F}_{\tau\to\omega}$ indicates the Fourier transform from the
$\tau$-dimension to the $\omega$-dimension. We have dropped the
scaling by $4\pi^2$ and, for brevity, will similarly drop all
scaling factors in the sequel.
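The computation in (3) and (4) can be illustrated in discrete time. The following is only a minimal sketch, not the authors' implementation: the function name `modal_tfd`, the zero padding at the signal edges, the integer lag index `k` (so that the lag is $\tau = 2k$ samples), and the use of an FFT over the lag axis are all assumptions made for illustration. It also assumes that `h_T` is real and no longer than the signal, and that `h_F` is real, symmetric, and of odd length.

```python
import numpy as np

def modal_tfd(s, h_T, h_F):
    """Discrete sketch of (3)-(4): smooth the local autocorrelation
    along time, window it along lag, then Fourier transform over lag."""
    s = np.asarray(s, dtype=complex)
    n = len(s)
    m = (len(h_F) - 1) // 2  # maximum lag index; h_F has 2m+1 taps (assumed odd)
    # Zero-pad so that s(u + k) and s(u - k) are defined near the edges.
    padded = np.concatenate([np.zeros(m, complex), s, np.zeros(m, complex)])
    R = np.zeros((2 * m + 1, n), dtype=complex)
    for i, k in enumerate(range(-m, m + 1)):
        # Local autocorrelation s(u + k) s*(u - k), i.e. lag tau = 2k samples.
        prod = padded[m + k : m + k + n] * np.conj(padded[m - k : m - k + n])
        # Temporal smoothing by h_T, as in (4).
        R[i] = np.convolve(prod, h_T, mode="same")
    # Lag window h_F, then Fourier transform from lag to frequency, as in (3).
    R *= np.asarray(h_F)[:, None]
    C = np.fft.fft(np.fft.ifftshift(R, axes=0), axis=0)
    # R is conjugate-symmetric in lag here, so the imaginary part is round-off.
    return C.real
```

Because the lag step is two samples, bin $b$ of the output corresponds to $b \cdot f_s / (2(2m+1))$ Hz. For example, a complex exponential at 100 Hz sampled at 1000 Hz, analyzed with a 65-point lag window, peaks in bin 13.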
For suppression of cross-terms, the low-pass cutoff
frequency of $H_T(\theta)$ (or its Fourier dual, $h_T(t)$, as in (4))
must be chosen to be smaller than the smallest
frequency separation of components in the sum-of-sinusoids signal. The cutoff frequency should, however,
be as large as possible in order to preserve temporal
detail. For a single musical note of known pitch, the
cutoff frequency is usually chosen to be slightly less than
the fundamental frequency. The modal kernel has been
used in this fashion for high-resolution analysis of piano
notes [3].
When the input signal is polyphonic, however, the
cutoff frequency must be chosen arbitrarily, because the
minimum frequency separation of partials is
unknown. Even if we had knowledge regarding
the tuning used in the music, we could only infer a
minimum partial frequency spacing as a function of
frequency. For example, if we knew that the music was
based on the Western 12-tone scale, then in the
frequency region of 115 Hz we could expect partials at
110 Hz (A2) and 116.5 Hz (A#2). The cutoff frequency
in this region could be chosen to be 6 Hz. In the
neighborhood of 900 Hz we could expect partials at 880
Hz (A5) and 932.3 Hz (A#5). The cutoff frequency in
this region would be 52 Hz and we would preserve more
temporal detail due to a smaller degree of smoothing.
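The two spacings quoted above follow from the semitone ratio. As a small sketch, assuming 12-tone equal temperament (which the quoted pitches are consistent with, although the text names only the Western 12-tone scale), the spacing between a pitch and the next semitone up grows in proportion to frequency. The helper name `semitone_spacing_hz` is hypothetical.

```python
# Frequency ratio between adjacent semitones in 12-tone equal temperament (assumed).
SEMITONE_RATIO = 2.0 ** (1.0 / 12.0)

def semitone_spacing_hz(f_hz):
    """Spacing in Hz between a pitch at f_hz and the pitch one semitone above it."""
    return f_hz * (SEMITONE_RATIO - 1.0)

for name, f in [("A2", 110.0), ("A5", 880.0)]:
    spacing = semitone_spacing_hz(f)
    print(f"{name}: {f:.1f} Hz, spacing to next semitone {spacing:.1f} Hz")
# A2: 110.0 Hz, spacing to next semitone 6.5 Hz
# A5: 880.0 Hz, spacing to next semitone 52.3 Hz
```

The spacing is about 5.9% of the frequency, so a cutoff that tracks it would allow roughly eight times less smoothing at 880 Hz than at 110 Hz.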
Unfortunately, we cannot effect a varying degree of
smoothing since the cutoff frequency enters as a
parameter in the computation of (4) which occurs prior to
the Fourier transform in (3). We could compute several
TFD's, each with a different cutoff frequency, and then
combine the results (as proposed in [4] for the constant-Q modal TFD), but this approach is computationally
demanding and unnecessary, as we shall see.
3 Derivation
The modal TFD in (3) is computed by forming the
smoothed autocorrelation in the temporal correlation
domain and then taking the Fourier transform to enter the
time-frequency domain (see Figure 1). An alternative
approach is to form the smoothed autocorrelation in the
spectral correlation domain and then compute a Fourier
transform to once again end up in the time-frequency
domain. We show that this approach gives us the desired
frequency-dependent computation for the modal TFD.
We can rewrite equation (1) in an equivalent form
that introduces computation in the spectral correlation
domain [1]:
Figure 1: TFD's have equivalent representations in
four domains. The domains are related by Fourier
transforms (single arrows) or double Fourier
transforms (double arrows).
The modal TFD is characterized by the modal kernel
[4], which is given by:
$$\phi_{MK}(\theta,\tau) = h_F(\tau)\,H_T(\theta) \tag{2}$$
where $H_T(\theta)$ is a low-pass filter in the Fourier domain
(with corresponding time-domain impulse response
$h_T(t)$). The function $h_F(\tau)$ is a time-domain window
function that truncates the infinite integration over $\tau$ in (1) to
allow for realizable implementations. It is the low-pass
filter $H_T(\theta)$ that is of interest, however, as it effects the
temporal smoothing necessary to suppress cross-terms.
Substituting the modal kernel into (1) we have:
$$C_{MK}(t,\omega) = \mathcal{F}_{\tau\to\omega}\{h_F(\tau)\,R(t,\tau;h_T(t))\} \tag{3}$$
where
$$R(t,\tau;h_T(t)) = \int h_T(t-u)\,s(u+\tau/2)\,s^*(u-\tau/2)\,du \tag{4}$$
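The step from (1) to (3) can be written out explicitly. The following is a sketch under two assumptions that the text does not state: that $h_T$ is real and even, and that it is related to $H_T$ by $h_T(t) = \int H_T(\theta)\,e^{j\theta t}\,d\theta$ with no $1/2\pi$ factor (a convention chosen here so that the leading constant stays at $1/4\pi^2$). The transform $\mathcal{F}_{\tau\to\omega}$ is taken to be $\int (\cdot)\,e^{-j\tau\omega}\,d\tau$.

```latex
\begin{align*}
C_{MK}(t,\omega)
 &= \frac{1}{4\pi^2}\iiint s^*(u-\tau/2)\,s(u+\tau/2)\,h_F(\tau)\,H_T(\theta)\,
    e^{-j\theta t - j\tau\omega + j\theta u}\,du\,d\tau\,d\theta \\
 &= \frac{1}{4\pi^2}\int h_F(\tau)\,e^{-j\tau\omega}
    \int s^*(u-\tau/2)\,s(u+\tau/2)
    \left[\int H_T(\theta)\,e^{-j\theta(t-u)}\,d\theta\right] du\,d\tau \\
 &= \frac{1}{4\pi^2}\int h_F(\tau)\,e^{-j\tau\omega}
    \int h_T(t-u)\,s(u+\tau/2)\,s^*(u-\tau/2)\,du\,d\tau \\
 &= \frac{1}{4\pi^2}\,\mathcal{F}_{\tau\to\omega}\{h_F(\tau)\,R(t,\tau;h_T(t))\}
\end{align*}
```

The bracketed integral equals $h_T(u-t)$ under the assumed convention, which is $h_T(t-u)$ because $h_T$ is even. Dropping the leading constant gives (3).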
2 Unless otherwise noted, all integrals are definite
integrals over the entire real line.