Speech Enhancement Tutorial - Noise Estimation

4.1 Noise Estimation

In many speech enhancement algorithms, the first step is to estimate the power spectrum of the noise. To do this, it is necessary to make use of prior knowledge about differences between the characteristics of noise and speech. Common assumptions are:

  1. The short-time power spectrum of noise is more stationary than that of speech.
  2. Within a narrow frequency band, the speech energy frequently falls to a low value.
  3. The frequency of periodic noise sources changes very slowly with time; this is in contrast to voiced speech, whose period changes more rapidly.
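
As a rough illustration of how assumptions 1 and 2 can be exploited, the sketch below smooths the short-time power in each frequency bin and then takes the minimum of the smoothed value over a sliding window of recent frames. If the speech energy in a bin regularly falls to a low value while the noise stays roughly constant, that minimum follows the noise floor. This is a deliberately simplified minimum-tracking scheme and not a full published algorithm: it has no bias compensation, and the function name `track_noise_floor`, the smoothing constant `alpha` and the window length `window` are illustrative assumptions rather than values taken from this tutorial.

```python
from collections import deque


def track_noise_floor(power_frames, alpha=0.85, window=50):
    """Simplified minimum-tracking noise estimate (illustrative sketch).

    power_frames: iterable of frames, each a sequence of short-time power
                  values, one per frequency bin.
    alpha:        recursive smoothing constant, 0 <= alpha < 1 (assumed value).
    window:       number of recent frames searched for the minimum (assumed).
    Returns one list of per-bin noise power estimates for every input frame.
    """
    smoothed = None                   # recursively smoothed power per bin
    history = deque(maxlen=window)    # the last `window` smoothed frames
    estimates = []
    for frame in power_frames:
        if smoothed is None:
            # Initialise the smoother with the first frame.
            smoothed = list(frame)
        else:
            # First-order recursive smoothing, applied to each bin separately.
            smoothed = [alpha * s + (1.0 - alpha) * p
                        for s, p in zip(smoothed, frame)]
        history.append(smoothed)
        # Assumption 2: within the window, each bin is at some point
        # dominated by noise, so the per-bin minimum approximates the floor.
        estimates.append([min(bin_values) for bin_values in zip(*history)])
    return estimates


# Toy input: stationary noise of power 1.0 in two bins, plus a burst of
# "speech" energy in bin 0 during frames 10 to 19.
frames = [[1.0 + (100.0 if 10 <= t < 20 else 0.0), 1.0] for t in range(40)]
noise = track_noise_floor(frames, alpha=0.5, window=30)
```

In this toy input the estimate for bin 0 stays at the noise power during the burst, because the window still contains frames from before the burst began. The window therefore has to be longer than the longest stretch of continuous speech energy in a bin, but a longer window also makes the estimate slower to follow a noise level that rises over time.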

The estimation of the noise is almost always performed in a spectral or related domain for several reasons: speech and noise are partially separated in the spectral domain; spectral components of both speech and noise are somewhat de-correlated; and psycho-acoustic models are conveniently applied in this domain. Thus the following domains, all possessing different advantages, are used:

In each of these domains, the coefficients are most frequently taken to be Gaussian and uncorrelated, although these assumptions are rarely well substantiated. A well-presented evaluation of several noise spectrum estimation techniques is given in Ris and Dupont [2001], who found that, of the algorithms tested, the best performance was given by a combination of minimum statistics and harmonic filtering.
