BoostCamp AI Tech 7th CV-01 Object Detection Project

Team Members

동준 경윤 영석 태영 태성 세린

Contribute

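| Member | Roles |
|---|---|
| 동준 | Manual data review, K-fold, Dataset Add, Remove dash, Ensemble |
| 영석 | EDA, manual data review, result visualization with Streamlit, exploration of image processing techniques available in cv2 |
| 경윤 | EDA, manual data review, applying augmentation techniques (salt and pepper, binarization, normalize) |
| 태영 | EDA, manual data review, super resolution, Ensemble |
| 세린 | Manual data review, survey of augmentation techniques |
| 태성 | Template code, manual data review, Dataset Add |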

Overview

Deep learning work typically follows one of two main approaches: one focuses on the model, the other on the data. In this project, we adopted a data-centric approach to an OCR task on receipts.

Dataset

This project uses a dataset designed for OCR tasks on receipts. It contains labeled images of various receipt elements, divided into training and test sets.

  • Training Images: 400
  • Training bboxes: 34623
  • Test Images: 120
  • Languages: 4
    • Chinese: 100
    • Japanese: 100
    • Thai: 100
    • Vietnamese: 100

Development Environment

| Category | Details | Category | Details |
|---|---|---|---|
| Hardware | GPU: V100 32GB × 4 | Python | 3.10 |
| CUDA | 12.1 | PyTorch | 2.1.0 |
| PyTorch Lightning | 1.8.0 | Libraries | opencv-python (4.10.0.84), numpy (1.24.4) |
| Collaboration Tools | Notion, WandB | | |

Results

Final Results (Public, Private)

Ensemble result (WBF, IoU = 0.3) over the following configurations:

  • Super resolution (x4) + Normalize (base) + Remove dash + 3 folds (9:1 train-valid split)
  • Super resolution (x4) + Normalize (custom) + Remove dash
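
To illustrate what the WBF (Weighted Boxes Fusion) ensemble does with an IoU threshold of 0.3, here is a minimal pure-Python sketch. It is not the project's actual ensemble code, which the README does not show. The function names (`iou`, `fuse`, `simple_wbf`) and the sample boxes are hypothetical, and the sketch assumes a single class (text regions), axis-aligned boxes in `(x1, y1, x2, y2)` form, and strictly positive confidence scores.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def fuse(cluster: List[Tuple[Box, float]]) -> Box:
    """Confidence-weighted average of the box coordinates in one cluster."""
    total = sum(score for _, score in cluster)
    return tuple(sum(box[i] * score for box, score in cluster) / total
                 for i in range(4))


def simple_wbf(boxes_per_model: List[List[Box]],
               scores_per_model: List[List[float]],
               iou_thr: float = 0.3) -> List[Tuple[Box, float]]:
    """Simplified single-class Weighted Boxes Fusion (illustrative only)."""
    n_models = len(boxes_per_model)
    # Pool every prediction and process the most confident ones first.
    candidates = sorted(
        ((box, score)
         for boxes, scores in zip(boxes_per_model, scores_per_model)
         for box, score in zip(boxes, scores)
         if score > 0),
        key=lambda item: item[1],
        reverse=True,
    )
    clusters: List[List[Tuple[Box, float]]] = []
    fused_boxes: List[Box] = []
    for box, score in candidates:
        # Find the fused box that overlaps this prediction the most.
        best_idx, best_iou = -1, iou_thr
        for idx, fused_box in enumerate(fused_boxes):
            overlap = iou(box, fused_box)
            if overlap > best_iou:
                best_idx, best_iou = idx, overlap
        if best_idx < 0:
            # No fused box overlaps enough: start a new cluster.
            clusters.append([(box, score)])
            fused_boxes.append(box)
        else:
            # Join the cluster and recompute its fused box.
            clusters[best_idx].append((box, score))
            fused_boxes[best_idx] = fuse(clusters[best_idx])
    results = []
    for fused_box, cluster in zip(fused_boxes, clusters):
        mean_score = sum(score for _, score in cluster) / len(cluster)
        # Down-weight boxes that only some of the models agree on.
        confidence = mean_score * min(len(cluster), n_models) / n_models
        results.append((fused_box, confidence))
    return results


# Hypothetical predictions from two models on the same image.
model_a_boxes = [(0.0, 0.0, 10.0, 10.0)]
model_a_scores = [0.9]
model_b_boxes = [(1.0, 1.0, 11.0, 11.0), (50.0, 50.0, 60.0, 60.0)]
model_b_scores = [0.7, 0.4]

merged = simple_wbf([model_a_boxes, model_b_boxes],
                    [model_a_scores, model_b_scores],
                    iou_thr=0.3)
for fused_box, confidence in merged:
    print(fused_box, round(confidence, 3))
```

In this example the two overlapping boxes are merged into one box whose coordinates lean toward the higher-scoring prediction, while the box found by only one model is kept with a reduced confidence. In practice, a maintained implementation such as `weighted_boxes_fusion` from the `ensemble_boxes` package is likely a better choice than a hand-rolled version.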

Data augmentation