Tesseract within the app #508

Open
CakeonCake111 opened this issue Jan 3, 2025 · 3 comments
Labels
bug Something isn't working

Comments

@CakeonCake111

I think the implementation of Tesseract within the app is great, although I'm not sure whether it uses my version or an untrained version.

What I mean is that when I turn on Tesseract for OCR, it seems to keep using what I assume is the Windows OCR, and not Tesseract (or at least not the one that I've trained).

TwoTest is the image with the text, and aaaa is the folder on my desktop that includes said image.

[Image: TwoTest]

When performing a Tesseract OCR read through cmd, I get the following:

THANK YOU FOR THE PEN, YI.
I JUST THOUGHT OF SOME NEW
IDEAS, AND I WANT TO WRITE

THEM DOWN QUICKLY SO I DON'T
FORGET.

Using your program, with the tesseract option ticked, it gives me these results:

THANK you FOR THE PEN, YI.
1 OUST THOUGHT OF SOME NEW
IPEAS, ANV 1 WANT TO WRITE
THEM DOWN QUICKLY SO 1 PONTI
FORGET.

Turning off Tesseract gives me the same pasted results.
I've tried restarting the app.
I'm not sure if the language packs area is relevant, but it stays blank as well.
The path I have set in the app is "C:\Program Files\Tesseract-OCR\tesseract.exe", just to confirm that I'm supposed to point to the executable and not the folder.
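
For reference, the command-line run described above can be approximated from a script. This is a minimal sketch, not how Text Grab invokes Tesseract: the executable path is the one quoted in the report, while the image file name `TwoTest.png` (the extension is not given in the report), the helper names, and the optional `--tessdata-dir` override are assumptions for illustration.

```python
import subprocess

# Path quoted in the report above.
TESSERACT_EXE = r"C:\Program Files\Tesseract-OCR\tesseract.exe"


def build_ocr_command(exe, image, lang="eng", tessdata_dir=None):
    """Build the argument list for a one-shot OCR run that prints to stdout.

    `stdout` as the output base tells the Tesseract CLI to print the
    recognized text instead of writing a .txt file.
    """
    cmd = [str(exe), str(image), "stdout", "-l", lang]
    if tessdata_dir is not None:
        # Optional: point Tesseract at a specific folder of .traineddata files.
        cmd += ["--tessdata-dir", str(tessdata_dir)]
    return cmd


def run_ocr(exe, image, lang="eng", tessdata_dir=None):
    """Run Tesseract and return the recognized text (raises if the run fails)."""
    result = subprocess.run(
        build_ocr_command(exe, image, lang, tessdata_dir),
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# Hypothetical usage; the image name and extension are assumed:
# print(run_ocr(TESSERACT_EXE, r"aaaa\TwoTest.png"))
```

Running this once with and once without an explicit `tessdata_dir` is one way to check whether the trained `eng.traineddata` is the file actually being loaded.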

CakeonCake111 added the bug label Jan 3, 2025
@CakeonCake111
Author

Just to update: it looks like only vanilla Tesseract works with the program. Anything modified removes the Tesseract option and turns it into a blank dropdown.

@TheJoeFin
Owner

Can you be more explicit on what you mean by vanilla Tesseract and "anything modified"? You should be able to point to the Tesseract EXE and Text Grab will send Tesseract CLI arguments to that EXE.
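
One CLI argument that may help narrow this down is `--list-langs`, which asks a given Tesseract EXE to report the models it can find. Whether Text Grab fills its language list this way is an assumption, not something confirmed in this thread, and the helper names below are hypothetical. A minimal sketch:

```python
import subprocess


def parse_list_langs(output):
    """Extract language codes from `tesseract --list-langs` output.

    The first line is a header such as
    'List of available languages in "<tessdata path>" (2):'
    and each following non-empty line is one language code.
    """
    langs = []
    for line in output.splitlines():
        line = line.strip()
        if not line or line.lower().startswith("list of available languages"):
            continue
        langs.append(line)
    return langs


def list_languages(exe):
    """Ask the given Tesseract executable which languages it can load."""
    result = subprocess.run(
        [str(exe), "--list-langs"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_list_langs(result.stdout)


# Hypothetical usage with the path from the report:
# print(list_languages(r"C:\Program Files\Tesseract-OCR\tesseract.exe"))
```

If this returns an empty list after the `eng.traineddata` swap, that would be consistent with the blank language area described in the report, though it would not by itself explain the cause.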

@CakeonCake111
Author

What I meant by "vanilla Tesseract" is that the results I'm currently seeing come from a freshly downloaded version. If you modify the Tesseract-OCR folder (I'm not sure whether it matters which part), which in my case means replacing the eng.traineddata in the tessdata folder with a different eng.traineddata model from their GitHub (e.g. tessdata_best or tessdata_fast), it breaks your app. I think this is a bug because people who have previously trained models of their own would be unable to use your app in tandem with their version of Tesseract. I hope that helps 👍
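
To confirm which `eng.traineddata` is actually sitting in the tessdata folder after a swap, the installed file can be compared against the intended model by hash. This is a sketch under assumptions: the function names are hypothetical, and the example paths in the trailing comment assume the default folder layout next to the EXE quoted earlier in the thread.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def same_model(installed, expected):
    """True if the two .traineddata files are byte-for-byte identical."""
    return file_sha256(installed) == file_sha256(expected)


# Hypothetical usage; both paths are assumptions about the local setup:
# installed = Path(r"C:\Program Files\Tesseract-OCR\tessdata\eng.traineddata")
# expected = Path(r"C:\path\to\my-trained\eng.traineddata")
# print(same_model(installed, expected))
```

A matching hash only shows that the right file is in place; it does not show which file Text Grab ends up asking Tesseract to load.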
