FrameTextExtractor

FrameText Extractor is an open-source tool for optimized text extraction (OCR) from videos. It combines OpenCV, Pillow, and Tesseract to extract text from individual video frames, using multithreading to improve processing performance. It also includes a feature for text correction using a language model.

Features

Text Recognition (OCR): Extracts text from video frames using Tesseract OCR.
Optimized Video Processing: Processes frames at regular intervals (e.g., 1 frame per second) and uses multithreading for better performance.
Motion Detection: Detects changes between frames to avoid unnecessary text extraction on static frames.
Scalable Processing: Utilizes all available CPU cores for faster execution.
Flexible Customization: Allows for dynamic adjustment of frame interval, frame size, and motion detection sensitivity.
Text Correction with LLM: Corrects extracted text using the DeepSeek API a language model.

Requirements

To use this project, you'll need:

Python 3.x
OpenCV (cv2)
Pillow
Tesseract OCR (installed and available in the system path)
pytesseract
Numpy
OpenAI (DeepSeek API for text correction)

If you are using Windows, ensure that Tesseract is installed and the path is set correctly.

Installing Dependencies

You can install the necessary Python libraries with:

pip install opencv-python pillow pytesseract numpy

Install Tesseract OCR:

Windows: Tesseract Download
macOS: Install via Homebrew:
```
brew install tesseract
```
Install additional language packages:
```
brew install tesseract-lang
```
Linux: Install via your system’s package manager (e.g., apt on Ubuntu):
```
sudo apt install tesseract-ocr
```
Install additional language packages:
```
sudo apt install tesseract-ocr-[language-code]
```
Replace [language-code] with the specific code for the language you need (e.g., deu for German).

Usage

Set Tesseract Path (Windows only): Update the path in the set_tesseract_path() function if necessary:
```
pytesseract.pytesseract.tesseract_cmd = r'C:\Program Files\Tesseract-OCR\tesseract.exe'
```
Set API Key: Obtain an API key from DeepSeek and set it in the api_key variable:
```
api_key = "<DeepSeek API Key>"
```
Process Video: Place your video in the same directory or specify the path in the video_path variable.
Run the Script:
```
python frametext_extractor.py
```
The extracted and corrected text will be saved to the output file specified in output_text.

Example Code

if __name__ == "__main__":
    video_path = "video.mp4"
    api_key = "<DeepSeek API Key>"
    output_text = "corrected_extracted_text.txt"
    
    final_text = process_and_correct_text(video_path, api_key)
    
    with open(output_text, 'w', encoding='utf-8') as f:
        f.write(final_text)
    
    logging.info(f"Processing complete. Corrected text saved to {output_text}.")

How it Works

Load Video: The video is loaded, and frames are processed at regular intervals (e.g., 1 frame per second).
Resize Frames: Frames are resized to speed up processing.
Motion Detection: The script checks if the current frame differs significantly from the previous frame to avoid unnecessary OCR operations.
Text Extraction: If motion is detected, text is extracted using Tesseract OCR.
Text Correction: Extracted text is processed and corrected using the DeepSeek API and a language model.
Save Results: The corrected text is saved to a text file.

Customization

You can customize the following parameters to suit your needs:

Frame Interval: Process more or fewer frames by adjusting the interval between frames. This is done by setting the frame_interval parameter when calling the process_video_optimized function:
```
process_video_optimized(video_path, output_text, frame_interval=2)
```
This example processes one frame every two seconds (if fps = 1).
Frame Size: Adjust the scaling of the frames to influence processing time. Use the scale_factor parameter to resize frames. For example:
```
process_video_optimized(video_path, output_text, scale_factor=3)
```
Motion Threshold: Adjust the sensitivity of motion detection by changing the motion_threshold parameter. A higher threshold reduces sensitivity (i.e., fewer movements are detected), while a lower threshold increases sensitivity:
```
process_video_optimized(video_path, output_text, motion_threshold=0.1)
```

Full Example:

You can combine all these options to fine-tune your video processing:

process_video_optimized(video_path, output_text, frame_interval=1, scale_factor=2, motion_threshold=0.05)

Contributing

Contributions are welcome! Please submit a pull request or open an issue if you have improvements or find bugs.

License

This project is licensed under the MIT License – see the LICENSE file for details.

Name		Name	Last commit message	Last commit date
Latest commit History 27 Commits
LICENSE		LICENSE
README.md		README.md
frametext_extractor.py		frametext_extractor.py
requirements.txt		requirements.txt

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

FrameTextExtractor

Features

Requirements

Installing Dependencies

Usage

Example Code

How it Works

Customization

Full Example:

Contributing

License

About

Releases

Packages

Languages

License

zeynelacikgoez/FrameTextExtractor

Folders and files

Latest commit

History

Repository files navigation

FrameTextExtractor

Features

Requirements

Installing Dependencies

Usage

Example Code

How it Works

Customization

Full Example:

Contributing

License

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages