In this section, as the IEEE Firat Computer Society Image Processing Team, We have learned image processing and deep learning. We used them in the project.
The repository is basically about detection of letters and digits in an image with deep learning. We Learned how to do, and we applied what is learned.
The project have two steps.
- Step One
- In this step, main purpose is to classify single letter or digit in the input image.
- The model is trained in this step.
- Step Two
- In this step, main purpose is detect multiple letter and digit in the input image.
- The model, that was trained previous step, is used for classifying the detections
This step is main structure of the project. Because, The model was trained in this step, and it uses step two.
First of all, The model was needed data for training. Data collection and preparing are time consuming process, and
require different skills. For those reasons, we as a team have decided to use a ready dataset. We used balanced emnist
data set. There are digits and letters in the dataset.
There are some problems in the dataset. If the dataset is checked, the problems can be seen.
Some of them;
- Too smiler different labeled data
- Incorrectly labeled data
- Out of label images
Those problems affect the model accuracy in a bad way.
The model can be observed with a confusion matrix.
Prediction bar plots are helped us for more clear observation.
If we want to classify multiple areas in an image, we should find the areas. Because we want to use the model.
We used image processing technique for find the areas. You can use just a model that is trained for this purpose without image proccesing techniques,
but you cannot use only image classification model for this. For that reason, we used image processing techniques in this step.
Every detected area passing the model, and the model predict the input. For every area, a rectangle and the prediction letter is drawn.
We trained two model. One of them is using 1D convolution layers(1dmodel), other is using 2D convolution layers(2dmodel).
Accuracy of 1dmodel is worse then 2dmodel. Probobly, 1dmodel can preform better with better hyperparameter tuning. The project is using 2dmodel because of accuracy issue, and lesser image process steps.
You can find two of them in models folder.
First of all, you should clone the repository. After that, you should install requirements. You can use the CLI commend.
pip install -r requirements.txt
Now you can run the project
python main.py -s <image_path>
python main.py -m <image_path>
Developed by the Computer Society Image Processing Team.