streamlit-webrtc doesn't work with faster-whisper model #252

aqiao opened this issue Jul 21, 2024 · 0 comments

Hi, I'm working on a voice translator app and trying streamlit-webrtc with the faster-whisper model.
I noticed this example: https://github.com/whitphx/streamlit-webrtc/blob/main/pages/10_sendonly_audio.py
and found that it handles the audio data in a while loop like the one below:

while True:
    if webrtc_ctx.audio_receiver:
        try:
            audio_frames = webrtc_ctx.audio_receiver.get_frames(timeout=1)
        except queue.Empty:
            logger.warning("Queue is empty. Abort.")
            break

        sound_chunk = pydub.AudioSegment.empty()
        for audio_frame in audio_frames:
            sound = pydub.AudioSegment(
                data=audio_frame.to_ndarray().tobytes(),
                sample_width=audio_frame.format.bytes,
                frame_rate=audio_frame.sample_rate,
                channels=len(audio_frame.layout.channels),
...


Doesn't this block the UI thread, since the while loop runs in the UI thread (as far as I can tell)? Once I use code like this, the page just hangs.
Below is my code; please help me figure out how to fix it:

import queue
import time

import numpy as np
import pydub
import streamlit as st
from faster_whisper import WhisperModel
from streamlit_webrtc import WebRtcMode, webrtc_streamer

webrtc_ctx = webrtc_streamer(
    key="speech-to-text",
    mode=WebRtcMode.SENDONLY,
    audio_receiver_size=10240,
    rtc_configuration={"iceServers": [{"urls": ["stun:stun.l.google.com:19302"]}]},
    media_stream_constraints={"video": False, "audio": True},
)

status_indicator = st.empty()
text_output = st.empty()
stream = None

while True:
    if webrtc_ctx.audio_receiver:
        sound_chunk = pydub.AudioSegment.empty()
        try:
            audio_frames = webrtc_ctx.audio_receiver.get_frames(timeout=1)
        except queue.Empty:
            time.sleep(0.1)
            status_indicator.write("No frame arrived.")
            continue  # without this, audio_frames is undefined or stale below

        status_indicator.write("Running. Say something!")

        for audio_frame in audio_frames:
            sound = pydub.AudioSegment(
                data=audio_frame.to_ndarray().tobytes(),
                sample_width=audio_frame.format.bytes,
                frame_rate=audio_frame.sample_rate,
                channels=len(audio_frame.layout.channels),
            )
            sound_chunk += sound
        print(sound_chunk)
        if len(sound_chunk) > 0:
            sound_chunk = sound_chunk.set_channels(1).set_frame_rate(44100)
            buffer = np.array(sound_chunk.get_array_of_samples())

            segments, _ = WhisperModel(
                model_size,
                device="cuda" if supports_gpu else "cpu",
                compute_type=compute_type,
            ).transcribe(buffer)
            text = " ".join(segment.text for segment in segments)
            print(text)
            text_output.markdown(f"**Text:** {text}")
    else:
        status_indicator.write("AudioReceiver is not set. Abort.")
