Spectral Leakage in the DFT

One of the things mentioned when we went through the DFT and explained what it was is that the sampled input signal becomes periodic. So what does this mean?

First, it means you have to be careful how you interpret the results. Just as importantly, it can distort the frequency response. This happens because the DFT assumes that the input signal is periodic, with the input sequence as one period, so if the input sequence contains a whole number of periods of the signal everything is fine. But if your input is a sine wave oscillating 8000 times a second, then unless you calculate the exact number of samples needed to capture a whole number of periods, the odds of getting it right are very slim. What happens then is that a discontinuity occurs where one copy of the sequence ends and the next begins: the signal value suddenly jumps. The input sequence that the DFT "sees" for a sequence consisting of one and a half periods of a sine wave is shown in the image below.

This is just for a regular sine wave. Imagine a signal that is not predictable: there is no way that you will ever get an exact match in your input sequence.
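As a rough illustration of the join described above, the sketch below compares where a sine wave would be if it simply carried on past the end of the sequence with where the repeated copy actually restarts. The function name `wraparound_mismatch` and the 32-sample length are assumptions made for this example only.

```python
import math


def wraparound_mismatch(periods, n_samples):
    """Gap between the continuing sine and the restarted periodic copy.

    The sequence is x[n] = sin(2*pi*periods*n/n_samples) for
    n = 0 .. n_samples-1. The DFT treats sample n_samples as x[0] again.
    """
    def sine(n):
        return math.sin(2 * math.pi * periods * n / n_samples)

    continued = sine(n_samples)  # where the real sine wave would be next
    repeated = sine(0)           # where the periodic copy restarts
    return abs(continued - repeated)


# Whole number of periods: the copies join up (gap is rounding error only).
print(wraparound_mismatch(4.0, 32))
# 4.4 periods: the value jumps at the join.
print(wraparound_mismatch(4.4, 32))
```

A gap of zero does not by itself guarantee a smooth join. For one and a half periods of a sine that starts at zero, the values meet at the join but the slope abruptly changes sign, and that kink also causes leakage.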

So what happens in the frequency domain when you get these discontinuities? In the case shown above, with a single sine wave, you would normally expect to find a single non-zero sample, of value 1, at the frequency of the sine wave, but only if that frequency lies exactly on a frequency bin.

Frequency Bins

An N point sequence will produce an N point frequency response, and each of these points is called a frequency bin. Exactly where these frequency bins lie in terms of actual values (e.g. 4356 Hz) depends on the number of points you have in your sequence, as well as the sampling frequency. The more points, the closer together the bins lie; the higher the sampling frequency, the further apart they are.
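The relationship can be sketched in a few lines: bin k sits at k times the sampling frequency divided by N. The 8000 Hz and 16000 Hz sampling rates and the function names below are assumptions picked for the example, not values from the text.

```python
def bin_spacing(sample_rate_hz, n_points):
    """Distance in Hz between neighbouring frequency bins."""
    return sample_rate_hz / n_points


def bin_frequencies(sample_rate_hz, n_points):
    """Frequency in Hz of every bin k = 0 .. n_points-1."""
    return [k * sample_rate_hz / n_points for k in range(n_points)]


# More points at the same sampling rate: bins move closer together.
print(bin_spacing(8000, 32))   # 250.0
print(bin_spacing(8000, 64))   # 125.0
# Higher sampling rate with the same number of points: bins move apart.
print(bin_spacing(16000, 32))  # 500.0
# Bin 4 of a 32 point DFT at 8000 Hz sampling.
print(bin_frequencies(8000, 32)[4])  # 1000.0
```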

In order to sample signals with very high frequencies you need a very high sampling rate. But to get useful resolution at that rate, you need to increase the number of samples. This is a classic engineering compromise: do you want good resolution, at the cost of masses and masses of data that take up your disk space, or less data and coarser resolution? It's up to you and what you want it for.

Leakage between Frequency Bins

If we take the sine wave example, you would expect to see just one sample at the frequency of the signal. However, if the input sequence does not hold a whole number of periods (so we get this discontinuity), then the frequency of the signal won't lie on any bin; it falls between the two bins closest to it. Since the response is discrete we can't have a frequency of "150Hz and a bit". What happens is that the energy from that sample leaks out to the surrounding frequency bins.
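To see whether a given frequency lands on a bin or between two, you can work out its position in units of bins. This is only a sketch: the 1600 Hz sampling rate, the 32 point length (giving bins 50 Hz apart) and the function names are assumptions chosen so that 150 Hz falls exactly on a bin.

```python
import math


def bin_position(freq_hz, sample_rate_hz, n_points):
    """Position of a frequency measured in bins (may be fractional)."""
    return freq_hz * n_points / sample_rate_hz


def nearest_bins(freq_hz, sample_rate_hz, n_points):
    """The bins just below and just above the frequency.

    Both values are the same when the frequency sits exactly on a bin.
    """
    position = bin_position(freq_hz, sample_rate_hz, n_points)
    return math.floor(position), math.ceil(position)


# 150 Hz sits exactly on bin 3, so no leakage is expected.
print(nearest_bins(150, 1600, 32))  # (3, 3)
# "150 Hz and a bit" (here 160 Hz) falls between bins 3 and 4.
print(nearest_bins(160, 1600, 32))  # (3, 4)
```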

An ideal frequency response.

A frequency response that has suffered leakage

Note that the magnitude of each of the samples is less than the magnitude of the ideal sample.

Since the energy has leaked out of the peak into the other bins, the amplitude of the peak will be less than the amplitude of the original frequency sample (i.e. < 1). But if you add up the squares of all the amplitudes (to find how much energy they have), the total should equal the original energy (i.e. = 1).
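This can be checked numerically. The sketch below makes two assumptions so that the ideal peak comes out as exactly 1: the input is a complex tone rather than a real sine (a real sine would split into two peaks of one half each), and the DFT is divided by N. The function names and the 32 point length are also just choices for the example.

```python
import cmath
import math


def scaled_dft(x):
    """Direct DFT divided by N, so a tone on a bin gives a peak of 1."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points)) / n_points
        for k in range(n_points)
    ]


def complex_tone(periods, n_points):
    """Unit-amplitude complex tone with the given number of periods."""
    return [cmath.exp(2j * math.pi * periods * n / n_points)
            for n in range(n_points)]


def peak_and_energy(periods, n_points=32):
    """Largest bin magnitude and the sum of all squared magnitudes."""
    magnitudes = [abs(v) for v in scaled_dft(complex_tone(periods, n_points))]
    return max(magnitudes), sum(m * m for m in magnitudes)


# Four whole periods: one peak of 1, total energy 1.
print(peak_and_energy(4.0))
# 4.4 periods: the peak drops below 1, but the total energy is still 1.
print(peak_and_energy(4.4))
```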

To complicate matters further, if you increase the number of samples to give you greater resolution, the largest peak will get larger, indicating that it is nearer the true frequency, but the side samples that the energy has leaked out to will get larger as well, making the response less accurate than it should be.

These images show the frequency response of a sine wave with frequency pi/4 (radians per sample), both real and imaginary parts (remember that the DFT response is generally complex).

The first one is of a sine wave where the input sequence consists of FOUR complete periods.
The second example is a sine wave where the input sequence consists of 4.4 periods (not a whole number of periods).

The First Sine Wave

Real Freq. Response

Imaginary Freq. Response

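In the first example the frequency response is purely real and has two peaks, one at bin 4 (pi/4) and one at bin 28 (the DFT of a real signal gives symmetric results). Strictly speaking this is what a cosine-phase wave produces; a sine starting at zero would put the same two peaks in the imaginary part instead.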

The Second Sine Wave

In the second example we can see how the energy has leaked into the nearest bins in the real frequency response. It also shows us that the samples don't all have to be positive; negative values are just as valid.

Real Freq. Response

Imaginary Freq. Response

The next point is that the Fourier transform has developed a non-zero imaginary frequency response. The best way to describe it is that in the first example the imaginary frequency response was there all along; it's just that at the points where the samples occur, it always has the value ZERO. But in the second example the samples no longer line up with those zeros (remember the bit about frequency resolution?), so the imaginary part isn't ZERO there and we now see the imaginary response in the discrete world.
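The two cases can be reproduced with a direct DFT. This is a minimal sketch under stated assumptions: 32 points, and a cosine-phase wave, chosen so that the whole-period case comes out purely real as in the first example (a sine starting at zero would show the same effect with the real and imaginary roles swapped). The function names are made up for the example.

```python
import cmath
import math


def dft(x):
    """Direct (unscaled) DFT of a sequence."""
    n_points = len(x)
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_points)
            for n in range(n_points))
        for k in range(n_points)
    ]


def cosine_wave(periods, n_points):
    """Cosine with the given number of periods across the sequence."""
    return [math.cos(2 * math.pi * periods * n / n_points)
            for n in range(n_points)]


def largest_parts(periods, n_points=32):
    """Largest absolute real part and largest absolute imaginary part."""
    spectrum = dft(cosine_wave(periods, n_points))
    return (max(abs(v.real) for v in spectrum),
            max(abs(v.imag) for v in spectrum))


# Four whole periods: real peaks of N/2 = 16, imaginary part is rounding error.
print(largest_parts(4.0))
# 4.4 periods: the imaginary part is now clearly non-zero.
print(largest_parts(4.4))
# Some real parts are negative in the leaky case.
print(min(v.real for v in dft(cosine_wave(4.4, 32))) < 0)
```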

If that last bit has left you a little bit confused then don't worry about it, just as long as you grasp the idea of the energy leaking between bins.
