Some processing steps maybe not pipelined #157

Open
yump opened this issue Aug 29, 2024 · 1 comment
Labels
enhancement New feature or request

Comments

yump commented Aug 29, 2024

When transcribing an hour of Opus audio with either WhisperCPP Tiny or FasterWhisper Tiny, my CPU utilization looks like this:

[image: CPU utilization graph]

There is a lot of idle CPU time there. According to the inserted statistics, FasterWhisper runs at roughly 25x real-time speed (61500 ms / 2414 ms). Is there some inherently serial part of the process that is slower than 25x?

`ffmpeg -i file.opus -f null -` reports that the audio can be decoded at 430x speed, so decoding doesn't seem like it should be the bottleneck.
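To put the two figures side by side, here is a rough back-of-the-envelope sketch. It assumes that 61500 ms is the audio duration covered by the statistics and 2414 ms is the processing time, and that both rates hold across the full hour of audio. The function and variable names are made up for illustration.

```python
def speedup(audio_ms: float, processing_ms: float) -> float:
    """Real-time speed factor: audio duration divided by processing time."""
    return audio_ms / processing_ms


# Figures quoted above (interpretation of the two numbers is an assumption).
stt_speedup = speedup(61500, 2414)  # roughly 25x
decode_speedup = 430.0              # as reported by ffmpeg

audio_seconds = 3600.0              # one hour of audio

# Wall-clock time each stage would need if it ran at its quoted rate.
stt_seconds = audio_seconds / stt_speedup
decode_seconds = audio_seconds / decode_speedup

print(f"STT speed-up:   {stt_speedup:.1f}x")
print(f"STT time:       {stt_seconds:.1f} s")
print(f"Decode time:    {decode_seconds:.1f} s")
print(f"Decode share:   {decode_seconds / stt_seconds:.1%} of STT time")
```

Under these assumptions decoding accounts for only a few percent of the transcription time, so long idle stretches would have to come from some other stage.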

Owner

mkiol commented Aug 30, 2024

Hi, thanks for noticing this.

Periods of low CPU usage are most likely related to VAD (voice activity detection) processing. Currently, the STT decoder is fed audio data only when voice is detected. The slowdown comes from my implementation of the data transfer from the file reader to the VAD processor, which is very slow and inefficient. It needs to be rewritten.
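For illustration only, the reader → VAD → decoder hand-off described above could be pipelined with bounded queues, so that each stage works on its own chunk while the others are busy. This is a generic producer/consumer sketch, not the project's actual code: `run_pipeline`, `is_voice`, `decode` and the chunk format are all hypothetical placeholders.

```python
import queue
import threading

_DONE = object()  # sentinel marking the end of the stream


def reader(chunks, out_q):
    """Stage 1: push audio chunks downstream as soon as they are read."""
    for chunk in chunks:
        out_q.put(chunk)
    out_q.put(_DONE)


def vad_stage(in_q, out_q, is_voice):
    """Stage 2: forward only the chunks in which voice is detected."""
    while True:
        chunk = in_q.get()
        if chunk is _DONE:
            out_q.put(_DONE)
            return
        if is_voice(chunk):
            out_q.put(chunk)


def decoder_stage(in_q, decode, results):
    """Stage 3: run the (placeholder) STT decoder on voiced chunks."""
    while True:
        chunk = in_q.get()
        if chunk is _DONE:
            return
        results.append(decode(chunk))


def run_pipeline(chunks, is_voice, decode, max_buffered=8):
    """Run the three stages concurrently, connected by bounded queues."""
    q_read = queue.Queue(maxsize=max_buffered)
    q_voiced = queue.Queue(maxsize=max_buffered)
    results = []
    threads = [
        threading.Thread(target=reader, args=(chunks, q_read)),
        threading.Thread(target=vad_stage, args=(q_read, q_voiced, is_voice)),
        threading.Thread(target=decoder_stage, args=(q_voiced, decode, results)),
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return results


if __name__ == "__main__":
    # Toy data: a chunk counts as "voiced" if any sample exceeds a threshold.
    toy_chunks = [[0.0, 0.0], [0.9, -0.8], [0.0, 0.01], [0.5, 0.5]]
    voiced = run_pipeline(
        toy_chunks,
        is_voice=lambda c: max(abs(x) for x in c) > 0.1,
        decode=lambda c: f"decoded {len(c)} samples",
    )
    print(voiced)
```

The bounded queues keep memory use flat and let a fast reader run ahead of the VAD by at most `max_buffered` chunks. In Python the stages only overlap in practice when the heavy work happens in native code that releases the GIL, so this shows the structure rather than a measured speed-up.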

Let's keep this issue open. I will try to solve this problem in future releases.

mkiol added the enhancement label Aug 30, 2024