
Comparative analysis of various machine learning algorithms for detecting Parkinson's Disease (PD) from an audio dataset. The project explores the use of K-Nearest Neighbors (KNN), Logistic Regression, Random Forest, and Gradient Boosting algorithms to classify audio features associated with PD.


thanushree267/Comparative-Analysis-of-ML-models-for-detection-of-Parkinson-s-Disease-from-audios.

 
 


Comparative-Analysis-of-ML-models-for-detection-of-Parkinson-s-Disease-from-audios.

This project focuses on detecting Parkinson's disease using machine learning models trained on audio features. The study compares the performance of Logistic Regression, Gradient Boosting, Random Forest, and K-Nearest Neighbors (KNN) classifiers to determine the most effective approach for accurate detection.

Features

  • Preprocessing Audio Data: Extract relevant features from audio recordings.
  • Training and Evaluation: Compare four ML models on the same dataset.
  • Performance Metrics: Analyze models based on accuracy, precision, recall, and F1-score.
  • Flowchart Visualization: A detailed flowchart to outline the process.

Project Structure

parkinsons-disease-detection/
├── codes/                        # Folder containing code for each ML model
│   ├── logistic_regression.py
│   ├── gradient_boosting.py
│   ├── random_forest.py
│   └── knn.py
├── data/                         # Dataset folder (add your audio dataset here)
│   └── parkinsons_audio_data.csv
├── flowchart.png                 # Flowchart illustrating the methodology
├── README.md                     # Project documentation
└── results/                      # Folder for saving model results and graphs
    ├── performance_metrics.csv
    └── model_comparison.png

Dataset

The dataset contains audio recordings labeled as Parkinson's and non-Parkinson's. Audio features include jitter, shimmer, HNR (Harmonic-to-Noise Ratio), and other parameters essential for detecting vocal impairment.

Models Used

  • Logistic Regression: A simple yet effective classifier for binary problems.
  • Gradient Boosting: An ensemble method that optimizes classification accuracy through iterative boosting.
  • Random Forest: A robust ensemble learning method using decision trees.
  • K-Nearest Neighbors (KNN): A non-parametric algorithm based on proximity.

Workflow

  1. Preprocessing: Extract features from the audio recordings using libraries such as Librosa or Praat. Normalize the features and split the dataset into training and testing sets.

  2. Training: Train each model on the same dataset and perform hyperparameter tuning for optimal performance.

  3. Evaluation: Compare the models on accuracy, precision, recall, and F1-score. Use a confusion matrix to visualize predictions.

  4. Results: Summarize results in a tabular format for easy comparison. Visualize performance metrics using bar graphs.

How to Run

Clone the repository:

git clone https://github.com/your-username/parkinsons-disease-detection.git
cd parkinsons-disease-detection
Install the necessary libraries:

Run the individual model scripts:

python codes/lr.py
python codes/final.py
python codes/rf.py
python codes/knn.py
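
For orientation, the preprocessing step of the workflow and the proximity idea behind KNN can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the code in the scripts above: the function names (`train_test_split`, `fit_min_max`, `apply_min_max`, `knn_predict`), the toy feature values, and the choice of min-max scaling with Euclidean distance are all made up for the example.

```python
import math
import random
from collections import Counter


def train_test_split(features, labels, test_ratio=0.25, seed=0):
    """Shuffle sample indices and hold out a fraction of them for testing."""
    indices = list(range(len(features)))
    random.Random(seed).shuffle(indices)
    n_test = max(1, int(len(indices) * test_ratio))
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return (
        [features[i] for i in train_idx],
        [features[i] for i in test_idx],
        [labels[i] for i in train_idx],
        [labels[i] for i in test_idx],
    )


def fit_min_max(rows):
    """Record each feature column's minimum and maximum on the training set."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(rows, lows, highs):
    """Rescale features using the training-set ranges (constant columns map to 0)."""
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0
         for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def knn_predict(train_x, train_y, sample, k=3):
    """Label a sample by majority vote among its k nearest training points."""
    nearest = sorted(
        zip(train_x, train_y),
        key=lambda pair: math.dist(pair[0], sample),
    )[:k]
    votes = Counter(label for _, label in nearest)
    return votes.most_common(1)[0][0]


# Made-up (jitter, shimmer, HNR) rows for illustration only.
# Assumed label encoding: 1 = Parkinson's, 0 = non-Parkinson's.
features = [
    [0.004, 0.020, 25.0], [0.003, 0.018, 26.5],
    [0.005, 0.022, 24.0], [0.004, 0.019, 25.5],
    [0.012, 0.060, 15.0], [0.015, 0.070, 13.5],
    [0.011, 0.055, 16.0], [0.014, 0.065, 14.0],
]
labels = [0, 0, 0, 0, 1, 1, 1, 1]

x_train, x_test, y_train, y_test = train_test_split(features, labels)
lows, highs = fit_min_max(x_train)           # ranges come from training data only
x_train = apply_min_max(x_train, lows, highs)
x_test = apply_min_max(x_test, lows, highs)  # reuse the same ranges on test data

predictions = [knn_predict(x_train, y_train, s, k=3) for s in x_test]
print(list(zip(y_test, predictions)))        # (true label, predicted label) pairs
```

The sketch splits before scaling so that the test rows do not influence the normalization ranges. Without scaling, HNR would dominate the distance simply because its values are numerically much larger than jitter and shimmer. In practice a library such as scikit-learn provides better-tested equivalents of each of these pieces.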

Flowchart: image
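
The evaluation step compares models on accuracy, precision, recall, and F1-score, all of which follow from the four cells of a binary confusion matrix. The sketch below shows that relationship in plain Python. The helper names (`confusion_counts`, `classification_metrics`) and the label vectors are hypothetical and do not come from this project's results; the label encoding (1 = Parkinson's) is likewise an assumption.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Return (TP, FP, FN, TN) for a binary classification task."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1
        elif truth == positive:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def classification_metrics(y_true, y_pred, positive=1):
    """Derive accuracy, precision, recall, and F1 from the confusion counts."""
    tp, fp, fn, tn = confusion_counts(y_true, y_pred, positive)
    total = tp + fp + fn + tn
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative labels only (assumed encoding: 1 = Parkinson's, 0 = non-Parkinson's).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 0, 1, 0]

metrics = classification_metrics(y_true, y_pred)
for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

Running the same function on each model's test-set predictions would give directly comparable rows for a results table. For a screening task like this one, recall on the Parkinson's class is often worth reading alongside accuracy, since a missed case (a false negative) and a false alarm carry different costs.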
