A deep learning project that combines a Vision Transformer (ViT) with GPT-2 for image-to-text generation, designed specifically for OCR (Optical Character Recognition) tasks. The project uses a vision-language model architecture to generate text from images.
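To make the image side of such an architecture concrete, here is a minimal sketch of how a ViT-style model turns an image into a token sequence, assuming ViT acts as the image encoder whose output a text decoder such as GPT-2 attends to. The function names, the 224-pixel image size, and the 16-pixel patch size are illustrative assumptions (they match the common ViT-Base setup) and are not taken from this repository.

```python
from typing import List


def split_into_patches(image: List[List[int]], patch: int) -> List[List[int]]:
    """Split an H x W grayscale image into flattened, non-overlapping
    patch x patch tiles, returned in row-major order (hypothetical helper)."""
    height, width = len(image), len(image[0])
    if height % patch or width % patch:
        raise ValueError("image sides must be divisible by the patch size")
    patches = []
    for top in range(0, height, patch):
        for left in range(0, width, patch):
            # Flatten one tile row by row into a single vector.
            tile = [
                image[r][c]
                for r in range(top, top + patch)
                for c in range(left, left + patch)
            ]
            patches.append(tile)
    return patches


def encoder_sequence_length(image_size: int, patch: int, cls_token: bool = True) -> int:
    """Number of tokens a ViT-style encoder sees for a square image:
    one per patch, plus an optional classification token."""
    per_side = image_size // patch
    return per_side * per_side + (1 if cls_token else 0)


# Toy 4x4 "image" with pixel values 0..15, split into 2x2 patches.
toy_image = [[row * 4 + col for col in range(4)] for row in range(4)]
toy_patches = split_into_patches(toy_image, patch=2)

# Assumed ViT-Base-like setting: 224x224 input, 16x16 patches.
sequence_length = encoder_sequence_length(224, 16)
```

Each flattened patch would then be linearly projected to an embedding before entering the transformer; that projection and the decoder are omitted here to keep the sketch dependency-free.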
gmission-official/OCR-Transformers