📅 Date: May 5, 2024
This repository contains the code, report, and other resources for the final challenge of the IMA205 course at Télécom Paris. The project focuses on classifying dermoscopic images of skin lesions into eight diagnostic categories using machine learning techniques, including Convolutional Neural Networks (CNNs) and traditional machine learning models. The dataset used for this challenge is derived from the ISIC (International Skin Imaging Collaboration) dataset.
For detailed methodology, results, and analysis, refer to the Final Report.
The dataset provided for this challenge includes 25,331 dermoscopic images, divided into training, validation, and test sets. The training-validation set contains 18,998 images, while the remaining 6,333 images make up the test set.
Classes in the dataset:
- Melanoma
- Melanocytic nevus
- Basal cell carcinoma
- Actinic keratosis
- Benign keratosis
- Dermatofibroma
- Vascular lesion
- Squamous cell carcinoma
Additional metadata (e.g., age, gender, lesion location) was also provided and used for further data analysis and preprocessing.
Two approaches were employed to classify the skin lesions:
- Feature-based Classification:
  - The lesion was segmented, and features were extracted based on the ABCD criteria (Asymmetry, Border irregularity, Color variation, and Diameter).
  - Machine learning models (Random Forest, SVM, and MLP) were trained on these features for classification.
- CNN-based Classification (ResNet):
  - A ResNet101 model, a 101-layer residual network widely used for image classification, was used.
  - Images were resized to 224×224 pixels, and data augmentation techniques were applied to improve model performance.
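To make the feature-based approach more concrete, here is a minimal sketch of how ABCD-style descriptors could be computed from an already segmented lesion. It is not the repository's implementation: the function name `abcd_features`, the input layout (nested lists for a binary mask and a grayscale image), and the specific formulas (mirror mismatch for asymmetry, compactness for border irregularity, intensity standard deviation for color, bounding-box size for diameter) are all simplifying assumptions chosen for illustration. It uses only the standard library, whereas a real pipeline would more likely rely on NumPy or scikit-image.

```python
import math


def abcd_features(mask, gray):
    """Compute simplified ABCD-style descriptors (illustrative only).

    mask: list of rows, 1 for lesion pixels and 0 for background.
    gray: list of rows of pixel intensities, same shape as mask.
    """
    pixels = [(r, c) for r, row in enumerate(mask) for c, v in enumerate(row) if v]
    area = len(pixels)
    if area == 0:
        raise ValueError("mask contains no lesion pixels")
    lesion = set(pixels)

    rows = [r for r, _ in pixels]
    cols = [c for _, c in pixels]
    r0, r1, c0, c1 = min(rows), max(rows), min(cols), max(cols)

    # Asymmetry: fraction of lesion pixels that have no counterpart when the
    # lesion is mirrored left-to-right inside its bounding box.
    unmatched = sum((r, c0 + c1 - c) not in lesion for r, c in pixels)
    asymmetry = unmatched / area

    # Border irregularity: compactness P^2 / (4 * pi * A), where P counts the
    # pixel edges that face the background (4-connectivity).
    perimeter = sum(
        (r + dr, c + dc) not in lesion
        for r, c in pixels
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1))
    )
    border = perimeter ** 2 / (4 * math.pi * area)

    # Color variation: standard deviation of intensities inside the lesion.
    values = [gray[r][c] for r, c in pixels]
    mean = sum(values) / area
    color = math.sqrt(sum((v - mean) ** 2 for v in values) / area)

    # Diameter: longer side of the bounding box, in pixels.
    diameter = max(r1 - r0 + 1, c1 - c0 + 1)

    return {
        "asymmetry": asymmetry,
        "border": border,
        "color": color,
        "diameter": diameter,
    }


# Toy example: a small staircase-shaped lesion (hypothetical data).
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 1, 1, 0, 0],
    [0, 1, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
gray = [
    [0, 0, 0, 0, 0],
    [0, 10, 20, 30, 0],
    [0, 10, 20, 0, 0],
    [0, 10, 0, 0, 0],
    [0, 0, 0, 0, 0],
]
features = abcd_features(mask, gray)
print(features)
```

Each image would yield one such feature vector, which could then be passed to a classifier such as Random Forest, SVM, or MLP.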
| Method | Validation Accuracy | Public Test Accuracy | Private Test Accuracy |
|---|---|---|---|
| SVM | 0.64 | 0.38 | 0.36 |
| Random Forest | 0.62 | 0.40 | 0.38 |
| MLP | 0.57 | 0.38 | 0.36 |
| ResNet101 | 0.76 | 0.67 | 0.63 |
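ResNet101 outperformed the feature-based models on every split, reaching a private test accuracy of 0.63 against 0.36 to 0.38 for SVM, Random Forest, and MLP. The feature-based models also lost considerably more accuracy between validation and test, which suggests that the hand-crafted ABCD features generalized less well than the representation learned by the CNN.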
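The difference between validation and private test accuracy for each method can be worked out directly from the results table. The snippet below is a small sketch of that arithmetic; the dictionary layout and the helper name `generalization_gap` are illustrative choices, and the numbers are copied from the table.

```python
# Accuracies copied from the results table.
results = {
    "SVM": {"val": 0.64, "public": 0.38, "private": 0.36},
    "Random Forest": {"val": 0.62, "public": 0.40, "private": 0.38},
    "MLP": {"val": 0.57, "public": 0.38, "private": 0.36},
    "ResNet101": {"val": 0.76, "public": 0.67, "private": 0.63},
}


def generalization_gap(scores):
    """Return validation accuracy minus private test accuracy (hypothetical helper)."""
    return round(scores["val"] - scores["private"], 2)


gaps = {name: generalization_gap(scores) for name, scores in results.items()}

# List methods from the smallest to the largest drop.
for name, gap in sorted(gaps.items(), key=lambda item: item[1]):
    print(f"{name}: {gap:.2f}")
```

A smaller gap indicates that validation accuracy was a better predictor of test performance for that method.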