AudioSync Realtime Smoothing? #214

jdtsmith · 2025-01-12T15:21:44Z

From this discussion it was noted that there are a few problems with network AudioSync generation and transmission:

latency - our audio processing code needs a continuous data stream from the audio source. Any gap, network hickup, or out-of-order packet will be visible as random noise in led effects
network latency - we found that the wifi protocol sometimes delivers a bunch of UDP packets in bursts, with 100ms (up to 250ms) between bursts. This would be a major problem when directly transmitting audio samples.

For network (non-mic) sources of AudioSync packets (e.g. from PC or other packet generator), we had discussed on Discord some years back adapting my old RealTime smoothing algorithm to treat AudioSync packets instead of the realtime pixel stream. See the notes for all the details.

The approach would be:

Configure your audio playback to be "behind" the stream by a tunable amount, based on average network latency (say 50-200ms).
Update the audio reactive code to use a small ring buffer to "play back" those 44 byte packets at regular cadence (cost: being slightly delayed half the average ring-buffer size). 5 or 6 frame depth is usually fine.

The latter is already quite well tuned and was working very well for the realtime protocol. Audio sync packet sizes are even smaller, so the ring buffer should be pretty trivial.

How complicated do you think it would be to adapt this into audio_reactive.h.

The text was updated successfully, but these errors were encountered:

troyhacks · 2025-01-30T00:03:18Z

I'm going with "no" on this one - not that it isn't smart (it is), but because of inherent delays and the fact that most people can't arbitrarily delay their audio playback.

(Also in my uses of WLED, the audio can't be delayed as it's usually a DJ playing live - and even small delays really mess with your brain when you're mixing. Or we need to delay just the house audio going into WLED which is also a hassle.)

In my opinion, the latest packet to arrive is always the freshest packet assuming they arrived in order. If they aren't in order, we have the "check sequence" feature to take care of those.

The "use the latest packet" works really well in my setups, as I found my Art-Net code was sometimes causing a buffer effect with the Art-Net framerate was lower than the audio framerate of ~85 FPS. Given that we're always transmitting audio packets at ~85 FPS, many times larger displays will be using less than 85 packets per second when rendering anyway, so we should have more packets than we need - meaning the latest one to arrive is also the best.

jdtsmith · 2025-01-30T00:48:57Z

I was asking how complicated it would be; I understand it wouldn't be of interest to everyone. This would obviously be optional, and only for people for whom the increased playback fidelity was worth the extra latency.

troyhacks · 2025-01-30T11:43:37Z

I was asking how complicated it would be; I understand it wouldn't be of interest to everyone. This would obviously be optional, and only for people for whom the increased playback fidelity was worth the extra latency.

Yeah, get that - and it's a great idea - but I'm struggling to see where "playback fidelity" would actually be noticeable when we're rendering frames of an audio effect.

Also to be fair to the average person, during testing I didn't realize I was seeing the buffer effect (up to 2 seconds delay) and I felt rather stupid as "accurate audio reactivity" is something I'm working on every day. It wasn't until I paused the music one day while looking up and saw GEQ continue to display almost 2 seconds of frames... oh, that's not good. 🤣

Then again, more of the music I listen to and test with is "boots-and-cats" so... pretty it's repetitive. 🥾🥾&😺😺&🥾🥾&😺😺..

troyhacks · 2025-01-30T12:01:41Z

Last.Packet.Test.mp4

This is the "last packet" function running.

Video sound is direct input to my phone from before WLED, so this is "as heard" with no delays, at the same time WLED is processing it.

The sampling WLED instance is on WiFi using an ESP32 with a WM8978 for I2S line-in. Dynamics are 1ms rise and 250ms fall.

The audio board is then sending the sync packets over WiFi to an ESP32 Ethernet board (Dynamics are turned off here), which is rendering the GEQ effect and then sending Art-Net output over Ethernet to a WiFi connected ESP32 which is displaying the pixels., which is what I captured on video. I'm using a line-in from my audio mixer to my phone to keep everything synced so you're hearing and seeing an accurate representation of the performance.

So kinda almost the worst case of audio and pixel data moving around 3 boards, but I think it's extremely accurate to the music being played - and accurate to my general use cases of audio sync and my device topology.

troyhacks · 2025-01-30T12:04:37Z

I picked that song as the mix isn't too cluttered - I think the snare hits are represented in the high end quite nicely, and you can see the piano notes with their corresponding sustains in the mids very accurately.

I was also pausing the music randomly a few times to show there's no buffer effect and it picks back up immediately.

jdtsmith · 2025-01-30T12:43:38Z

Probably beat detection effects would be easiest to spot. A 60Hz beat that plays back at 40-80Hz (randomly in judders and spurts) would be noticeable. The out of order packet rejection probably helps here some, but in my testing there can be 1/2 or even 1s "packet droughts" followed by a bunch of UDP packets in quick succession.

Network audio playback protocols solve this by agreeing on a delay, buffering and adding an exact "play at this time" timestamp with device clocks synchronized with NTP or similar. That's overkill for WLED, but a simple buffer + smooth algorithm is easy and would help a lot.

It would be specific to network sources.

jdtsmith added the enhancement New feature or request label Jan 12, 2025

jdtsmith mentioned this issue Jan 19, 2025

Great start, but not 100% protocol compatible chrisgott/feed_my_wled#1

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

AudioSync Realtime Smoothing? #214

AudioSync Realtime Smoothing? #214

jdtsmith commented Jan 12, 2025 •

edited

Loading

troyhacks commented Jan 30, 2025

jdtsmith commented Jan 30, 2025

troyhacks commented Jan 30, 2025

troyhacks commented Jan 30, 2025

troyhacks commented Jan 30, 2025 •

edited

Loading

jdtsmith commented Jan 30, 2025

AudioSync Realtime Smoothing? #214

AudioSync Realtime Smoothing? #214

Comments

jdtsmith commented Jan 12, 2025 • edited Loading

troyhacks commented Jan 30, 2025

jdtsmith commented Jan 30, 2025

troyhacks commented Jan 30, 2025

troyhacks commented Jan 30, 2025

troyhacks commented Jan 30, 2025 • edited Loading

jdtsmith commented Jan 30, 2025

jdtsmith commented Jan 12, 2025 •

edited

Loading

troyhacks commented Jan 30, 2025 •

edited

Loading