(feature request) translate effect input "bins" to corresponding frequencies #190

Open
MaxMSchneider opened this issue Nov 20, 2024 · 4 comments
Labels
backburner low priority compared to other issues enhancement New feature or request

Comments

@MaxMSchneider

MaxMSchneider commented Nov 20, 2024

Is your feature request related to a problem? Please describe.
I have been working with the atuline WLED SR https://github.com/atuline/WLED (archived) so far, putting up 150ft of strips at different music events. When setting up certain sound reactive effects precisely, such as Freqwave and Freqmatrix, it turned out to be super finicky to get the high and low bin right in order to dial in the effect as desired (depending on the music that is being played).

Describe the solution you'd like
Instead of just having numbers for the high and low bin, it would be nice to translate them to the actual frequencies (Hz) they refer to, at least in the UI. This way the setup of the effects would be much more straightforward. This would also be handy for effects that have a bin select, such as Puddlepeak.

I am aware that this calculation would require the sample rate of the input signal. Maybe there's a way to determine the rate from the input (in my case an I2C ADC) automatically? Even an option to manually enter the sample rate for the calculation in the Usermod Settings would suffice.

An alternative would be to have some sort of a cheat sheet in the SR wiki. That would be a start as well.

Thank you guys!

@MaxMSchneider MaxMSchneider added the enhancement New feature or request label Nov 20, 2024
@ewowi

ewowi commented Nov 20, 2024

Small correction, it was https://github.com/atuline/WLED

Not aircoookie 🙂

@MaxMSchneider
Author

@ewowi You're right, my bad!

@softhack007 softhack007 added the backburner low priority compared to other issues label Nov 21, 2024
@troyhacks
Collaborator

troyhacks commented Dec 8, 2024

@MaxMSchneider it's not quite so simple to give the exact frequencies, because there's a lot of manual "fudging" going on depending on how you select your options in AudioReactive.

TL;DR: The specific AudioReactive code you're running is your only cheat sheet as we are not working "in frequencies".

If that's enough explanation, you can stop here and go look at the code in audio_reactive.h

I'll try to explain the general idea which might help form a deeper "cheat sheet"...

We are reading a 22050 Hz sampled signal, which is then fed into an FFT and binned into 512 frequency steps, which would be:

step_num * samplerate / total_steps = fft_bin_frequency

If you calculate that out, you get 512 bins in 43.06640625 Hz steps, covering the 0..22050 Hz range. (You can also rearrange that formula to give you the bin number for a particular input frequency, which you may find useful at some point.)

Simplified, 22050 Hz / 512 steps = ~ 43 Hz
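
As a rough sketch of that formula and its inverse, assuming the 22050 Hz sample rate and 512-step FFT mentioned above. `binToHz` and `hzToBin` are made-up helper names for illustration, not functions from audio_reactive.h:

```cpp
#include <cassert>
#include <cmath>

// Assumed values, taken from the explanation above; a given build may differ.
constexpr float kSampleRateHz = 22050.0f;
constexpr int   kFftSize      = 512;

// step_num * samplerate / total_steps = fft_bin_frequency
float binToHz(int bin, float sampleRateHz = kSampleRateHz, int fftSize = kFftSize) {
    assert(bin >= 0 && fftSize > 0);
    return static_cast<float>(bin) * sampleRateHz / static_cast<float>(fftSize);
}

// The same formula rearranged: nearest FFT bin for a frequency in Hz.
int hzToBin(float hz, float sampleRateHz = kSampleRateHz, int fftSize = kFftSize) {
    assert(hz >= 0.0f && sampleRateHz > 0.0f);
    return static_cast<int>(std::lround(hz * fftSize / sampleRateHz));
}
```

With these defaults, `binToHz(1)` is about 43.07 Hz and `hzToBin(1000.0f)` lands on bin 23.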

Nyquist says your sample rate has to be 2x the highest frequency you can usefully capture, so we can disregard the upper half of the FFT bins - which leaves us with 256 bins representing between 0 Hz and ~11000 Hz.

This also means our lowest useful bin is around 43 Hz, as our first FFT bin covers everything below that (including any DC offset) and isn't likely to be "good" audio - so our FFT is useful for frequencies around 43 Hz to 11000 Hz. This means we're now using 255 bins, as we drop the lowest one and everything above 11025 Hz, which is half our sample rate (the Nyquist frequency).

Now... there's a "proper" way to combine and average these 255 bins down to the 16 bands we present to WLED... which is very standardized for audio gear. We don't do that. 🤣

If we considered the standardized calculation for 16 bins around 20...11000Hz our 16 ideal bins would be approximately:

30, 44, 65, 97, 144, 213, 316, 469, 696, 1032, 1531, 2271, 3370, 4999, 7415, 11000 Hz

Now you might see a problem here: our FFT works in 43 Hz steps, but the idealized bands at the low end are much narrower than that (30 Hz to 44 Hz is only 14 Hz wide), so several of them would fall inside the same FFT bin.

So let's try 45...11000Hz:

57, 81, 115, 163, 231, 329, 467, 663, 942, 1339, 1902, 2701, 3837, 5451, 7743, 11000 Hz

Ok, that's a little better, but we're still below our 43Hz step at the low ranges.

Now let's try 95...10000 Hz, raising the low end and lowering the high end, because we're centering anyway:

127, 170, 227, 304, 407, 545, 729, 975, 1304, 1744, 2334, 3122, 4177, 5587, 7475, 10000

Ok, this is good - the difference between 127 Hz and 170 Hz is 43 Hz, which is our FFT step - but if we use that as a centering calculation, either we're combining some low-end bins of 43 Hz and 86 Hz centers (not ideal) or dropping some (also not ideal as there's real data there).
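
The band lists above look like plain logarithmic (constant-ratio) spacing between the low and high frequency. A minimal sketch of that calculation, assuming that is how the numbers were generated. `logBandEdges` is a hypothetical helper, not something from the WLED code:

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Upper edge (Hz) of each of `bands` logarithmically spaced bands
// between lowHz and highHz. Each edge is the previous one times a fixed ratio.
std::vector<float> logBandEdges(float lowHz, float highHz, int bands) {
    assert(lowHz > 0.0f && highHz > lowHz && bands > 0);
    const double ratio = std::pow(static_cast<double>(highHz) / lowHz, 1.0 / bands);
    std::vector<float> edges;
    edges.reserve(bands);
    double edge = lowHz;
    for (int i = 0; i < bands; ++i) {
        edge *= ratio;                       // next band's upper edge
        edges.push_back(static_cast<float>(edge));
    }
    return edges;
}
```

Calling `logBandEdges(95.0f, 10000.0f, 16)` gives edges starting at roughly 127, 170, 227 Hz and ending at 10000 Hz, which lines up with the third list. The gap between neighbouring edges grows with frequency, which is why only the low end runs into the 43 Hz FFT step.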

So... what does WLED use? Well... approximations and a bunch of fudging.

In a situation where we're not attempting to filter for poor quality mics and whatnot, the basic calculation is:

fftCalc[ 0] = wc * fftAddAvg(1,1);               // 1  bin  :   43 Hz - 86 Hz   : sub-bass
fftCalc[ 1] = wc * fftAddAvg(2,2);               // 1  bin  :   86 Hz - 129 Hz  : bass
fftCalc[ 2] = wc * fftAddAvg(3,4);               // 2  bins :  129 Hz - 216 Hz  : bass
fftCalc[ 3] = wc * fftAddAvg(5,6);               // 2  bins :  216 Hz - 301 Hz  : bass + midrange
fftCalc[ 4] = wc * fftAddAvg(7,9);               // 3  bins :  301 Hz - 430 Hz  : midrange
fftCalc[ 5] = wc * fftAddAvg(10,12);             // 3  bins :  430 Hz - 560 Hz  : midrange
fftCalc[ 6] = wc * fftAddAvg(13,18);             // 6  bins :  560 Hz - 818 Hz  : midrange
fftCalc[ 7] = wc * fftAddAvg(19,25);             // 7  bins :  818 Hz - 1120 Hz : midrange -- 1Khz should always be the center !
fftCalc[ 8] = wc * fftAddAvg(26,32);             // 7  bins : 1120 Hz - 1421 Hz : midrange
fftCalc[ 9] = wc * fftAddAvg(33,43);             // 11 bins : 1421 Hz - 1895 Hz : midrange
fftCalc[10] = wc * fftAddAvg(44,55);             // 12 bins : 1895 Hz - 2412 Hz : midrange + high mid
fftCalc[11] = wc * fftAddAvg(56,69);             // 14 bins : 2412 Hz - 3015 Hz : high mid
fftCalc[12] = wc * fftAddAvg(70,85);             // 16 bins : 3015 Hz - 3704 Hz : high mid
fftCalc[13] = wc * fftAddAvg(86,103);            // 18 bins : 3704 Hz - 4479 Hz : high mid
fftCalc[14] = wc * fftAddAvg(104,164) * 0.88f;   // 61 bins : 4479 Hz - 7106 Hz : high mid + high with slight damping
fftCalc[15] = wc * fftAddAvg(165,215) * 0.70f;   // 51 bins : 7106 Hz - 9302 Hz : high with some damping

So there's your "cheat sheet" for that setup - but with options where we're using mic filtering, it's different. If you enable "right shift" then it's different again. You can see this in the code in audio_reactive.h with comments. We're also using damping in places and not going all the way up to our maximum useful frequency of 11025 Hz. FFT windowing will also factor into how the FFT is calculated, but you're on your own there to find out how.

(I'm also thinking that this might not be exactly the frequency ranges but rather the center of the bin, which makes sense to me but I'm also not an FFT expert - so "43 Hz" could be 21.5 Hz to 64.5 Hz but likely is more sensitive towards 43 Hz. Maybe. I dunno.)
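
For anyone who wants to regenerate that cheat sheet, here is a small sketch that turns the bin ranges from the listing above into Hz ranges. It assumes 22050 Hz / 512 steps and follows the convention of the code comments (low edge = first bin x step, high edge = (last bin + 1) x step). The struct and function names are invented for illustration:

```cpp
#include <cassert>
#include <cstdio>

struct BandRange { int firstBin; int lastBin; };  // inclusive, as passed to fftAddAvg

// Bin ranges copied from the fftCalc[] listing above (no mic filtering).
constexpr BandRange kBands[16] = {
    {1, 1},   {2, 2},   {3, 4},   {5, 6},   {7, 9},    {10, 12},   {13, 18},   {19, 25},
    {26, 32}, {33, 43}, {44, 55}, {56, 69}, {70, 85},  {86, 103},  {104, 164}, {165, 215}
};

constexpr float kBinWidthHz = 22050.0f / 512.0f;  // ~43.07 Hz, assumed

int bandBinCount(int band) {
    assert(band >= 0 && band < 16);
    return kBands[band].lastBin - kBands[band].firstBin + 1;
}

float bandLowHz(int band) {
    assert(band >= 0 && band < 16);
    return kBands[band].firstBin * kBinWidthHz;
}

float bandHighHz(int band) {
    assert(band >= 0 && band < 16);
    return (kBands[band].lastBin + 1) * kBinWidthHz;
}

// Prints one line per band: index, bin count, Hz range.
void printCheatSheet() {
    for (int band = 0; band < 16; ++band) {
        std::printf("band %2d: %2d bins, %7.0f Hz - %7.0f Hz\n",
                    band, bandBinCount(band), bandLowHz(band), bandHighHz(band));
    }
}
```

If the bins are really centered on their frequency, as speculated above, both edges of every band would shift down by half a step (about 21.5 Hz).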

@troyhacks
Collaborator

troyhacks commented Dec 8, 2024

In the future, AudioReactive will likely migrate to the ESP-DSP code, as that's optimized for the fastest FFT calculations on the ESP32 family. The ESP32-S3 and ESP32-P4 both have FFT "on the chip", so they are extremely fast and require very little CPU.

ESP-DSP also gives us easy access to filters, allowing us to add High-Pass (which filters out low end frequencies) and Low-Pass filters (which filter out high end frequencies) to the code that are very low on CPU use. Ideally a clean FFT should have these filters to cut frequencies that it can't understand.
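
To illustrate what a high-pass filter does to the samples before the FFT, here is a minimal first-order filter in plain C++. This is not the ESP-DSP API and not what AudioReactive currently does. The function name and the 40 Hz cutoff in the example are assumptions chosen for illustration:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// First-order (one-pole) high-pass filter:
//   y[n] = alpha * (y[n-1] + x[n] - x[n-1]),  alpha = RC / (RC + dt)
// Content well below cutoffHz (including DC offset) is attenuated.
std::vector<float> highPass(const std::vector<float>& in, float cutoffHz, float sampleRateHz) {
    assert(cutoffHz > 0.0f && sampleRateHz > 0.0f);
    const float pi    = 3.14159265358979f;
    const float rc    = 1.0f / (2.0f * pi * cutoffHz);
    const float dt    = 1.0f / sampleRateHz;
    const float alpha = rc / (rc + dt);

    std::vector<float> out(in.size());
    float prevIn  = 0.0f;
    float prevOut = 0.0f;
    for (std::size_t i = 0; i < in.size(); ++i) {
        const float y = alpha * (prevOut + in[i] - prevIn);
        out[i]  = y;
        prevIn  = in[i];
        prevOut = y;
    }
    return out;
}
```

Feeding a constant (DC) signal through `highPass(samples, 40.0f, 22050.0f)` gives an output that decays towards zero, which is the kind of low-end cleanup described above. A low-pass filter is the mirror image and would keep the slow-moving part instead.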

Faster FFTs would also allow us to run calculations with more steps, potentially providing better resolution at the lower end.
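
To put rough numbers on that, here is a small sketch of how the FFT size changes the step width at a fixed sample rate. The helper names and the larger FFT sizes are purely illustrative, not planned settings:

```cpp
#include <cassert>

// Width of one FFT step in Hz.
float binWidthHz(float sampleRateHz, int fftSize) {
    assert(sampleRateHz > 0.0f && fftSize > 0);
    return sampleRateHz / static_cast<float>(fftSize);
}

// Number of bins at or below `hz`, not counting bin 0.
int binsUpTo(float hz, float sampleRateHz, int fftSize) {
    assert(hz >= 0.0f);
    return static_cast<int>(hz / binWidthHz(sampleRateHz, fftSize));
}
```

At 22050 Hz, a 512-step FFT has about 43.07 Hz per step and only 4 bins up to 200 Hz. A 2048-step FFT would have about 10.77 Hz per step and 18 bins in the same range, at the cost of needing four times as many samples per calculation.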
