ocr_yolo_tesseract

I made some changes to the repo. You can now clone it and follow the steps below to run OCR -

Install Tesseract from https://github.com/tesseract-ocr/tesseract/wiki
Change pytesseract.pytesseract.tesseract_cmd = '/app/.apt/usr/bin/tesseract' in app.py so that it points to the location where Tesseract is installed on your machine.
Change your current directory to the repo using a terminal, Git Bash or Anaconda Prompt.
Run pip install -r requirements.txt
Run python app.py
Open localhost in your web browser.
Drag and drop an image following the on-page instructions, and voila, you'll get the text output.
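
The path edit in app.py can also be done by looking up the executable instead of typing the path by hand. The following is a minimal sketch, not the code that ships in app.py: the helper name resolve_tesseract_cmd and its fallback argument are hypothetical, and it assumes the tesseract executable is on your PATH after installation.

```python
import shutil

# Path currently hard-coded in app.py, as quoted in the steps above.
DEFAULT_TESSERACT_CMD = "/app/.apt/usr/bin/tesseract"


def resolve_tesseract_cmd(fallback=DEFAULT_TESSERACT_CMD):
    """Return the tesseract executable found on PATH, or `fallback` if none is found.

    Hypothetical helper for illustration; it is not part of the repo.
    """
    found = shutil.which("tesseract")
    return found if found is not None else fallback


# In app.py the result would then be assigned like this:
#     pytesseract.pytesseract.tesseract_cmd = resolve_tesseract_cmd()
```

If Tesseract is installed somewhere that is not on PATH, pass that location as the fallback instead.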

OCR (optical character recognition) is a technology that recognizes text within a digital image. It is commonly used to recognize text in scanned documents, but it serves many other purposes as well. OCR software processes a digital image by locating and recognizing characters such as letters, numbers, and symbols.

I acquired the data for this task from https://dataturks.com/projects/devika.mishra/Indian_Number_plates

Images in this dataset look like -

Steps - Instructions - Please change the input locations in the code wherever required, mainly for images.

Use the trained model to generate the text region with the file generate_text_region.py. First change the location of the input image in that file, then run this in your terminal - python generate_text_region.py

Output would look like - In your Python file's directory, look for licence0000.jpg

Now use this image as input to check for skew, using this command in the terminal -

deskew input.png

If skew is present, use deskew --output output.png input.png
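
The two deskew commands above can also be driven from Python. This is a sketch under assumptions: build_deskew_cmd and run_deskew are hypothetical helper names, and only the two command forms quoted above are used (deskew input.png and deskew --output output.png input.png).

```python
import shutil
import subprocess


def build_deskew_cmd(input_path, output_path=None):
    """Build the argument list for the deskew command.

    Without output_path this is the check form (deskew input.png);
    with it, the corrected image is written (deskew --output output.png input.png).
    """
    cmd = ["deskew"]
    if output_path is not None:
        cmd += ["--output", output_path]
    cmd.append(input_path)
    return cmd


def run_deskew(input_path, output_path=None):
    """Run deskew if it is installed; return its stdout, or None if the command is missing."""
    if shutil.which("deskew") is None:
        return None
    result = subprocess.run(
        build_deskew_cmd(input_path, output_path),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()
```

For example, run_deskew("input.png") performs the check, and run_deskew("input.png", "output.png") writes the deskewed image.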

After deskewing, the image would look like -

Use the command below to generate text output from the final deskewed image -

python read_text.py

Output would look like this -

You can also save the output to a text file by appending > 'name.txt' to the Python command above.

For more information, check out the Google Colab notebook -

https://colab.research.google.com/drive/1O43GwR5VFz7-TslFiCegL0ixn00zbAl4#scrollTo=ppTaAffN-lCW
