Jennifer Freeman
This program implements the Cooley-Tukey radix-2 Fast Fourier Transform (FFT) on a WAV audio file. The user can apply modifying and mixing audio processing effects to output a modified WAV audio file. The frequency-power relationship of the input signal can also be determined when modifying effects are applied.
An FFT is an algorithm for efficiently computing the Discrete Fourier Transform (DFT) on a data set. For a vector of data
$$X[k]=\underbrace{\sum_{m=0}^{N/2-1} x[2m]e^{\frac{-2\pi i mk}{N/2}}}_{x_{\text{even}}} + \underbrace{e^{\frac{-2\pi i k}{N}}}_{W_N^k}\underbrace{\sum_{m=0}^{N/2-1} x[2m+1]e^{\frac{-2\pi i mk}{N/2}}}_{x_{\text{odd}}}$$
The first summation
This saves additional computations since each
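The even/odd split in the equation above can be sketched as a short recursive function. This is a minimal illustration in Python using only the standard library, not the program's own implementation; the name `fft_radix2` is hypothetical. It relies on the standard identity that the upper half of the spectrum reuses the same two half-size transforms with the sign of the twiddle factor $W_N^k$ flipped.

```python
import cmath


def fft_radix2(x):
    """Recursive Cooley-Tukey radix-2 FFT (illustrative sketch).

    The input length must be a power of two.
    """
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("input length must be a power of two")
    if n == 1:
        return [complex(x[0])]

    # Half-size DFTs of the even-indexed and odd-indexed samples.
    even = fft_radix2(x[0::2])
    odd = fft_radix2(x[1::2])

    out = [0j] * n
    for k in range(n // 2):
        # Twiddle factor W_N^k applied to the odd half.
        t = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + t
        # Bin k + N/2 reuses the same two values with the sign flipped.
        out[k + n // 2] = even[k] - t
    return out
```

Each level of recursion halves the problem size, which is where the reduction in work relative to a direct DFT comes from.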
Although the Cooley-Tukey radix-2 FFT algorithm is considerably faster than computing the DFT directly, its running time still grows with signal length, so large signals take noticeably longer to process. Large signals also occupy more memory.
To account for this the program implements a windowing method for signals that contain more than
The windowing method goes as follows:
- The signal is split into $2^{18}$-sized chunks. Note the signal size is always a power of two, so an integer number of chunks will be obtained.
- For each pair of consecutive chunks:
  - Multiply the first chunk by a linear ramp. The linear ramp is a $2^{18}$-sized vector with the first element 0 and the last element 1, and the remaining elements are equally spaced and increasing.
  - Multiply the second chunk by the reversed linear ramp (the ramp starts at 1 and ends at 0).
  - Concatenate the first chunk and second chunk into one. On this larger chunk perform the FFT, any processing effects and the inverse FFT.
  - Move to the next pair of chunks (i.e. chunk two and chunk three).
- Rebuild the signal by summing the inverse FFT output from each chunk. All but the first and last chunks will have had both the increasing and the decreasing linear ramp applied, so the summation will rebuild the signal at the appropriate amplitude. This is often called the overlap-add method.
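The steps above can be sketched in a few lines of Python. This is an illustrative sketch, not the program's code: the function names are hypothetical, the chunk size is a parameter so that small inputs can be tried, and `process` is a placeholder standing in for the FFT, the processing effect and the inverse FFT.

```python
def linear_ramp(n):
    """Return n equally spaced values rising from 0 to 1."""
    return [i / (n - 1) for i in range(n)]


def overlap_add(signal, chunk_size, process=lambda block: block):
    """Window consecutive chunk pairs, process them, and sum the results.

    `process` is a hypothetical stand-in for FFT -> effect -> inverse FFT.
    """
    if len(signal) % chunk_size != 0:
        raise ValueError("signal length must be a multiple of chunk_size")

    up = linear_ramp(chunk_size)   # 0 -> 1, applied to the first chunk
    down = up[::-1]                # 1 -> 0, applied to the second chunk
    window = up + down

    out = [0.0] * len(signal)
    n_chunks = len(signal) // chunk_size
    for k in range(n_chunks - 1):  # pairs (0,1), (1,2), ...
        start = k * chunk_size
        pair = signal[start:start + 2 * chunk_size]
        block = [s * w for s, w in zip(pair, window)]
        # Sum the processed pair back into place (the "add" in overlap-add).
        for i, value in enumerate(process(block)):
            out[start + i] += value
    return out
```

With the identity `process`, every interior chunk receives the decreasing ramp from one pair and the increasing ramp from the next, and the two weights sum to one, so those samples are reconstructed at their original amplitude.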
The frequency-power relationship of an input signal is computed after the FFT is performed. The periodogram describes the relationship between signal frequency components and power, where power describes how strongly each frequency contributes to the signal. Using Welch's method we can obtain the frequency-power relationship for the entire signal even when the windowing and overlap-add method is used [^5]. For large signals processed in chunks, the periodogram is computed by averaging the modulus of the FFT values across all chunks.
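The chunk averaging can be sketched as follows. This is a minimal Python illustration under stated assumptions: the names are hypothetical, a naive DFT is used so the block is self-contained, and bin $k$ of an $N$-point transform is mapped to the frequency $k \cdot f_s / N$. It averages the modulus, as described above; Welch's original formulation averages the squared modulus instead.

```python
import cmath


def dft(x):
    """Naive DFT, used here only to keep the sketch self-contained."""
    n = len(x)
    return [
        sum(x[m] * cmath.exp(-2j * cmath.pi * m * k / n) for m in range(n))
        for k in range(n)
    ]


def average_periodogram(chunks, sample_rate):
    """Average the FFT modulus over equally sized chunks.

    Returns (frequency_hz, averaged_modulus) pairs for bins 0 .. N/2.
    """
    n = len(chunks[0])
    sums = [0.0] * (n // 2 + 1)
    for chunk in chunks:
        spectrum = dft(chunk)
        for k in range(n // 2 + 1):
            sums[k] += abs(spectrum[k])
    return [(k * sample_rate / n, s / len(chunks)) for k, s in enumerate(sums)]
```

Only bins up to $N/2$ are reported because, for a real-valued signal, the remaining bins mirror them.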
A digital filter is an operation in signal processing that alters specific frequency components within a signal [^2]. In an audio signal, we can apply digital filters to reduce or enhance specific frequencies.
A low-pass digital filter attenuates frequencies above some threshold frequency and allows all lower frequencies to "pass" through the filter. The simplest low-pass filter is called a sharp cut-off filter, where all frequencies above the threshold are removed and all other frequencies remain unchanged [^2].
To apply the low-pass sharp cut-off filter to an audio file, an FFT is performed to produce the frequency spectrum
A high-pass digital filter is the opposite of a low-pass filter. Frequencies below a specified frequency are attenuated and all frequencies above are unchanged. For a given frequency, the index that corresponds to this frequency bin from the FFT computation can be computed as described above and all lower frequencies are zeroed out.
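A sharp cut-off filter amounts to zeroing bins in the frequency spectrum. The sketch below is illustrative Python with hypothetical function names; it assumes the usual mapping of bin $k$ in an $N$-point FFT to $k \cdot f_s / N$ Hz, and it treats bin $N-k$ as the mirror of bin $k$ so that a real-valued signal stays real after the inverse FFT.

```python
def bin_frequency(k, n, sample_rate):
    """Frequency in Hz of bin k, folding the mirrored upper half."""
    return min(k, n - k) * sample_rate / n


def low_pass(spectrum, sample_rate, cutoff_hz):
    """Zero every bin whose frequency lies above cutoff_hz."""
    n = len(spectrum)
    return [
        v if bin_frequency(k, n, sample_rate) <= cutoff_hz else 0j
        for k, v in enumerate(spectrum)
    ]


def high_pass(spectrum, sample_rate, cutoff_hz):
    """Zero every bin whose frequency lies below cutoff_hz."""
    n = len(spectrum)
    return [
        v if bin_frequency(k, n, sample_rate) >= cutoff_hz else 0j
        for k, v in enumerate(spectrum)
    ]
```

Both functions return a new spectrum and leave the input untouched, so the filtered result can be passed straight to an inverse FFT.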
A band-pass filter attenuates all frequencies outside of some frequency range
Equalizing an audio signal allows for attenuation and boosting within frequency bands. Frequency bands are preset ranges within the audible spectrum whose frequencies are often modified together. For example, frequencies between 60 Hz and 250 Hz form a bass band, and equalization effects can be applied to the entire band [^3].
This program uses 10 preset frequency bands with band centres located at: 32 Hz, 63 Hz, 125 Hz, 250 Hz, 500 Hz, 1 kHz, 2 kHz, 4 kHz, 8 kHz, 16 kHz.
The width of each band can be computed using
To increase or decrease a frequency band by a decibel value, we can compute the relative gain which equals
After computing the FFT of the signal, the frequency-domain values in each band are multiplied by the corresponding gain, and the modified signal is obtained by performing the inverse FFT.
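As a rough sketch of this step in Python, the block below applies per-band gains to a spectrum. Two conventions here are assumptions rather than statements about the program: band edges are taken as the standard octave-band limits (centre divided and multiplied by $\sqrt{2}$), and decibels are converted with the amplitude convention $10^{\text{dB}/20}$. All function names are hypothetical.

```python
import math

BAND_CENTRES_HZ = [32, 63, 125, 250, 500, 1000, 2000, 4000, 8000, 16000]


def band_edges(centre_hz):
    """Assumed octave-band limits: centre / sqrt(2) to centre * sqrt(2)."""
    return centre_hz / math.sqrt(2), centre_hz * math.sqrt(2)


def db_to_gain(db):
    """Assumed amplitude convention: gain = 10 ** (dB / 20)."""
    return 10 ** (db / 20)


def equalize(spectrum, sample_rate, gains_db):
    """Scale each bin by the gain of the first band that contains it."""
    n = len(spectrum)
    out = list(spectrum)
    for k in range(n):
        # Fold the mirrored upper half so conjugate bins get the same gain.
        freq = min(k, n - k) * sample_rate / n
        for centre, db in zip(BAND_CENTRES_HZ, gains_db):
            low, high = band_edges(centre)
            if low <= freq < high:
                out[k] *= db_to_gain(db)
                break  # nominal centres overlap slightly; apply one gain only
    return out
```

A gain of 0 dB maps to a factor of 1, so bands left at 0 pass through unchanged.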
Mixing effects involve incorporating multiple signals into one. This program implements two simple mixing techniques.
Two audio signals can be concatenated into one by appending the second signal to the end of the first.
Given two vectors containing the amplitudes of two audio signals, the signals can be overlaid by summing the amplitudes element-wise. If the summed amplitudes overflow the sample range, the signal might need to be scaled.
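Both mixing techniques can be sketched in Python as below. The function names are hypothetical, and two details are assumptions rather than descriptions of the program: the shorter signal is padded with silence before summing, and the overflow limit defaults to the 16-bit sample maximum of 32767.

```python
def concatenate(first, second):
    """Append the second signal after the first."""
    return list(first) + list(second)


def overlay(first, second, max_amplitude=32767):
    """Sum two signals element-wise, scaling down if the peak overflows."""
    length = max(len(first), len(second))
    # Assumption: pad the shorter signal with zeros (silence).
    a = list(first) + [0] * (length - len(first))
    b = list(second) + [0] * (length - len(second))
    mixed = [x + y for x, y in zip(a, b)]

    peak = max((abs(v) for v in mixed), default=0)
    if peak > max_amplitude:
        # Scale the whole signal so the loudest sample just fits.
        scale = max_amplitude / peak
        mixed = [round(v * scale) for v in mixed]
    return mixed
```

Scaling the whole signal by one factor preserves the relative loudness of the samples, whereas clipping only the overflowing samples would distort the waveform.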
The program takes a variable number of arguments depending on the desired audio processing to perform.
The first two input arguments are:
- A WAV audio file name with a maximum file size of 12 MB. Example: "in_file.wav"
- The name of an output WAV audio file. Example: "out_file.wav"
The remaining arguments describe the type of audio processing to perform.
Low-pass filter
- Low-pass filter identifier `low`.
- Frequency value between (20-20000) Hz to apply the low-pass filter.
Example command line input
FFT_audio alphabet.wav output.wav low 2000
High-pass filter
- High-pass filter identifier `high`.
- Frequency value between (20-20000) Hz to apply the high-pass filter.
Example command line input
FFT_audio alphabet.wav output.wav high 1000
Band-pass filter
- Band-pass filter identifier `band`.
- First frequency value between (20-20000) Hz for the lower threshold on the band-pass filter.
- Second frequency value between (20-20000) Hz for the upper threshold on the band-pass filter. The second frequency must be larger than the first frequency.
Example command line input
FFT_audio alphabet.wav output.wav band 600 1000
Equalizing
- Equalizing identifier `equalize`.
- Ten integer decibel values between -24 dB and 24 dB to apply to the ten preset frequency bands. A decibel value of 0 will apply no change to the frequency band.
Example command line input
FFT_audio alphabet.wav output.wav equalize -12 0 1 8 -3 22 0 0 3 14
To mix signals, the two input WAV files must contain the same number of channels. If the sampling rate differs between the two signals, the larger sampling rate will be used in the output file.
Concatenating Signals
- Concatenating identifier `add`.
- A WAV audio file name with a maximum file size of 12 MB to be concatenated after the first WAV file.
Example command line input
FFT_audio alphabet.wav output.wav add piano.wav
Overlapping Signals
- Overlapping identifier `overlap`.
- A WAV audio file name with a maximum file size of 12 MB to be overlapped with the first WAV file.
Example command line input
FFT_audio alphabet.wav output.wav overlap piano.wav
The program outputs a WAV audio file containing the processed audio signal, named by the second input argument.
If modifying effects are performed, the periodogram is calculated and an output CSV file is created with the array of frequency bin versus power measurements. For a one-channel input audio file, the file is named periodogram.csv. For a two-channel input audio file, the periodogram is computed for each channel, and the files are named periodogram_leftchannel.csv and periodogram_rightchannel.csv for the left and right channels respectively.
The following is the periodogram of the "female_sing.wav" file. By viewing the signal in the frequency domain we can determine what processing effects may achieve the desired output signal.
[^1] Bekele, A. J. (2016). Cooley-Tukey FFT algorithms. Advanced Algorithms.
[^2] O'Haver, T. (2019, December). Fourier filter. Retrieved December 12, 2021, from https://terpconnect.umd.edu/~toh/spectrum/FourierFilter.html
[^3] What is Graphic EQ? Retrieved December 19, 2021, from https://www.presonus.com/learn/technical-articles/What-Is-a-Graphic-Eq
[^4] Octave band. Retrieved December 17, 2021, from https://en.wikipedia.org/wiki/Octave_band
[^5] Welch, P. (1967). The use of fast Fourier transform for the estimation of power spectra: a method based on time averaging over short, modified periodograms. IEEE Transactions on Audio and Electroacoustics, 15(2), 70-73.