An alternative way to prevent this processing distortion in spectral subtraction is to apply over-subtraction [Berouti,83] to the degraded signal spectrum. Depending on the estimated SNR, more than the average noise spectrum is subtracted, which strongly attenuates small signal components. If the over-subtraction factor is high enough, the musical noise is completely eliminated, but audible distortions of the audio signal can result. The noise suppression rule proposed by Ephraim and Malah ([Ephraim,84],[Ephraim,85]) allows significant noise reduction without causing musical noise. This is mainly due to the use of a non-linear, time-averaged SNR [Cappe,94], which exhibits a lower variance than an SNR estimated without averaging. Furthermore, the masking properties of the human auditory system can be used to reduce processing distortions (e.g. [Tsoukalas,93]).

The proposed method combines spectral over-subtraction with several smoothing strategies in both the time and the frequency domain to reduce the SNR variance. This leads to few audible processing distortions. Section 2 discusses the methods used within the denoising scheme. In section 3, the new noise reduction filter is described.

2 Review of Denoising Filter Methods

2.1 Spectral Subtraction

Berouti et al. [Berouti,83] present a variant of the spectral subtraction method for speech enhancement. An overestimate of the noise magnitude or power spectrum is subtracted from the degraded signal spectrum. In each short-time frame $m$, the amount of over-subtraction depends on the SNR of the incoming degraded signal. A subtraction factor $\alpha(m) > 1$ is chosen to achieve good noise reduction with few processing distortions. If the resulting denoised signal spectrum $|S(m,k)|$ falls below a minimum level, it is replaced by a scaled version of the input spectrum, $\beta\,|X(m,k)|$. The scaling parameter $\beta$ determines the residual noise floor after re-synthesis.
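As a rough illustration of the over-subtraction and flooring steps described above, the following Python sketch applies them to a matrix of short-time magnitude spectra. It is not taken from [Berouti,83]. The function name `over_subtract`, the use of NumPy, and the array layout (frames by bins) are assumptions made for this example. The subtraction factor is passed in by the caller, since its mapping from the SNR is not specified here. Clipping negative differences to zero before taking the root is also an assumption, made so that the root is always defined.

```python
import numpy as np


def over_subtract(X_mag, D_mag, alpha, beta=0.01, b=2.0):
    """Spectral over-subtraction with a spectral floor (illustrative sketch).

    X_mag : array, shape (frames, bins), magnitudes |X(m,k)| of the degraded signal
    D_mag : array, shape (bins,), estimated noise magnitudes |D(k)|
    alpha : scalar or array of shape (frames,), subtraction factor per frame
    beta  : spectral floor parameter, 0 < beta << 1
    b     : exponent, 1 for magnitude subtraction, 2 for power subtraction
    """
    X_mag = np.asarray(X_mag, dtype=float)
    D_mag = np.asarray(D_mag, dtype=float)
    alpha = np.asarray(alpha, dtype=float)
    if alpha.ndim == 1:
        # One factor per frame: add a bin axis so it broadcasts over all bins.
        alpha = alpha[:, None]

    # Subtract the over-estimated noise spectrum in the b-th power domain.
    diff = X_mag ** b - alpha * D_mag ** b

    # Assumption: negative differences are clipped to zero before the root.
    S_mag = np.maximum(diff, 0.0) ** (1.0 / b)

    # Replace values at or below the floor by the scaled input spectrum.
    floor = beta * X_mag
    return np.where(S_mag > floor, S_mag, floor)


# Small demonstration with made-up numbers (one frame, three bins).
X_demo = np.array([[1.0, 2.0, 0.5]])
D_demo = np.array([0.5, 0.5, 0.5])
print(over_subtract(X_demo, D_demo, alpha=2.0, beta=0.1))
```

In the demonstration, the third bin lies below the over-estimated noise level, so its output is the floor value `beta * X_mag` rather than zero. This is the residual noise floor referred to in the text.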
The denoising algorithm is expressed through the following relationships:

$$|S(m,k)| = \left( |X(m,k)|^{b} - \alpha(m)\,|D(k)|^{b} \right)^{1/b} \qquad (1)$$

and

$$|\hat{S}(m,k)| = \begin{cases} |S(m,k)|, & \text{for } |S(m,k)| > \beta\,|X(m,k)|, \\ \beta\,|X(m,k)|, & \text{else,} \end{cases} \qquad (2)$$

where $|\hat{S}(m,k)|$ denotes the output spectrum after flooring. $\beta$ is set to $0 < \beta \ll 1$. The remaining noise floor can be used to mask the generated musical noise. Thus the setting of the two parameters $\alpha$ and $\beta$ determines the tradeoff between the amount of residual broadband noise and the level of perceived musical noise. For a fixed value of $\beta$, increasing the value of $\alpha$ reduces both the broadband noise and the musical noise. Increasing $\alpha$ above a certain limit leads to audible distortions because of the strong attenuation of small signal components. We adopt this denoising algorithm but use a subtraction factor C(m,k), which is calculated for each frequency bin