Music-Genre-Classification

50.039 Theory and Application of Deep Learning: Project

Usage

Clone this repository. To predict the genre of an audio file, run:

$ python predict_audio.py --music_path [music-path] # defaults to use our final model

The script classifies each predetermined chunk of the music file, then aggregates the chunk labels into normalised value counts, i.e. the fraction of chunks assigned to each genre. For example,

Predictions for test.mp3:
classical    0.989474
jazz         0.010526

This means that about 98.9% of the chunks were classified as classical, so the file is most likely of the classical genre.
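The aggregation step can be sketched in plain Python. This is a minimal illustration, not the code in predict_audio.py: the chunk labels are made up (94 of 95 chunks labelled classical, which happens to reproduce the figures shown), and `aggregate_chunk_predictions` is a hypothetical helper name. It computes the same quantity as pandas' `Series.value_counts(normalize=True)`.

```python
from collections import Counter


def aggregate_chunk_predictions(chunk_labels):
    """Return the fraction of chunks assigned to each genre, most common first."""
    counts = Counter(chunk_labels)
    total = sum(counts.values())
    return {genre: n / total for genre, n in counts.most_common()}


# Hypothetical per-chunk labels for one audio file (assumption, not real output).
chunk_labels = ["classical"] * 94 + ["jazz"]

proportions = aggregate_chunk_predictions(chunk_labels)
for genre, fraction in proportions.items():
    print(f"{genre:<12} {fraction:.6f}")
```

Because the result is a share of chunks rather than a model probability, a file that changes style partway through would be expected to show a more even split between genres.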

Setup

Packages required

  • Python 3.9 and above
  • torch
  • scikit-learn (imported as sklearn)
  • pandas
  • numpy
  • librosa
  • tqdm
  • seaborn
  • scikit-plot
  • matplotlib
  • ffmpeg (for mp3 input; if installing it with pip still yields an audio backend error, install it as a standalone application from the ffmpeg website or your preferred package manager)

Alternatively, install the dependencies from the requirements file exported from our virtual environment:

$ pip install -r requirements.txt

Process the data

$ python [script-name] -h # for help

Model

The final model was created with the following config, from cnn_2d_parallels.yaml:

n_epochs: 15
batch_size: 32
optimiser_cfg:
  lr: 0.001

We initially trained the model with a 20% test split; the results below are from that run. The final model is trained on the full dataset.

From Models/2022-04-24_21-38-09_CNN_2D_Split_True.pt:

              precision    recall  f1-score   support

       blues       0.89      0.73      0.80       208
   classical       0.88      0.97      0.92       202
     country       0.64      0.75      0.69       192
       disco       0.90      0.61      0.72       201
      hiphop       0.83      0.78      0.80       209
        jazz       0.95      0.76      0.85       186
       metal       0.81      0.93      0.87       211
         pop       0.72      0.88      0.79       204
      reggae       0.88      0.73      0.80       212
        rock       0.52      0.70      0.60       175

    accuracy                           0.79      2000
   macro avg       0.80      0.78      0.78      2000
weighted avg       0.81      0.79      0.79      2000
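To make the two summary rows easier to read, the sketch below recomputes them from the per-class rows of the report. The macro average is the plain mean over the ten genres, while the weighted average weights each genre by its support. The inputs are the rounded values printed above, so the results should only be expected to agree to two decimals; the helper names are illustrative.

```python
# Per-class (precision, recall, support) copied from the report above.
report = {
    "blues": (0.89, 0.73, 208),
    "classical": (0.88, 0.97, 202),
    "country": (0.64, 0.75, 192),
    "disco": (0.90, 0.61, 201),
    "hiphop": (0.83, 0.78, 209),
    "jazz": (0.95, 0.76, 186),
    "metal": (0.81, 0.93, 211),
    "pop": (0.72, 0.88, 204),
    "reggae": (0.88, 0.73, 212),
    "rock": (0.52, 0.70, 175),
}


def macro_avg(values):
    """Unweighted mean: every genre counts equally."""
    return sum(values) / len(values)


def weighted_avg(values, weights):
    """Mean weighted by support: larger classes count more."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


precision = [p for p, _, _ in report.values()]
recall = [r for _, r, _ in report.values()]
support = [s for _, _, s in report.values()]

print(f"macro avg    precision {macro_avg(precision):.2f}  recall {macro_avg(recall):.2f}")
print(
    f"weighted avg precision {weighted_avg(precision, support):.2f}"
    f"  recall {weighted_avg(recall, support):.2f}"
)
```

The support-weighted recall is the same quantity as overall accuracy, which is why both appear as 0.79 in the report.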

[Figures: training figure; confusion matrix]
