Skip to content

A utility based on neural networks for generating srt subtitles from common video formats

License

Notifications You must be signed in to change notification settings

jgurakuqi/auto-subtitles-generator

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Project Title

Subtitle Generation with Faster Whisper

Description

This project utilizes Faster Whisper to generate subtitles from audio files. It leverages ffmpeg for audio extraction and pre-processing, and Faster Whisper for generating .srt subtitle files.

Table of Contents

Important Notes

  • IMPORTANT: This project is based on Faster Whisper 1.1.1, and currently i cannot ensure the compatibility with future versions.
  • Even if available, I largely discourage using the CHUNKED-BATCHED processing of videos, as there seems to be a bug related to Whisper itself which causes a large degradation of the accuracy in correspondence of the chunks' boundaries, leading to shifted, repeated or missing subtitles. Even disabling VAD or using VAD with the same settings across BATCHED and non-BATCHED processing, the issue persists.
  • By default i keep enabled the VAD (Voice Activity Detection) filter, otherwise long silences cause evident hallucinations, with generation of completely unrelated subtitles. Unfortunately, the VAD filter decreases a bit the accuracy over the sections immediately adjacent to the silences, so you are presented with a tradeoff:
    • VAD filter disabled: hallucinations, but no loss of subtitles near silences.
    • VAD filter enabled: no hallucinations, but loss of subtitles near silences. I opted for the second option, as overall produces more accurate results. Also I used custom settings for VAD filter, with very permissive threshold and minimum silence duration, to avoid as much as possible hallucinations and at the same time avoid missing subtitles.

Installation

Prerequisites

Steps

  1. Clone the repository:
git clone https://github.com/jgurakuqi/auto-subtitles-generator
cd auto-subtitles-generator
  1. Install the required Python packages:
pip install -r requirements.txt
  1. Ensure ffmpeg is installed and accessible from the command line. You can download it from FFmpeg's official website.

  2. Follow the requirements for Faster Whisper.

Usage

  1. Work in progress: currently the whole execution is handled through the main.py file. You can run it as follows:
python main.py
  1. The script will use the parameters in the main to determine the audio extraction and subtitle generation process.

  2. At the end, the srt files will be generated in the same folder of the video files.

Planned Improvements:

The project is quite recent, and I plan of:

  • Introducing the possiblity of chosing the parameters through command line and/or yaml file.
  • Make the code more modular.
  • Provide more user-friendly output: e.g., path of the generated subtitles, ....
  • POSSIBLY integrate Audio-tracks separator model for better audio quality. The model will entirely remove the background noise and music, leaving only the voices for the model to focus on. This, in conjunction with accurate checks over the source audio, could lead to much better subtitle quality, also allowing to disable the VAD filter (e.g., check in the voices only track if there are voices, and hence remove hallucations when no voice are detected).
  • POSSIBLE IDEA: Introduce a GUI for a more user-friendly experience.

Features

  • Extract audio from video files using ffmpeg.
  • Generate subtitles using Faster Whisper.

Contributing

  • Work in progress

# Print the list of currently installed packages
pip freeze

License

This project is licensed under the MIT License. See the LICENSE file for details.

About

A utility based on neural networks for generating srt subtitles from common video formats

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages