Skip to content

Latest commit

 

History

History
114 lines (96 loc) · 4.74 KB

README.md

File metadata and controls

114 lines (96 loc) · 4.74 KB
                                                            UrbanSound8K Audio Classification

This project focuses on audio classification using the UrbanSound8K dataset. The objective is to classify audio samples into different sound classes such as dog bark, siren, car horn, and more, using a deep learning model built with TensorFlow and Keras.

                                                             Table of Contents

Project Overview Dataset Preprocessing Feature Extraction Model Architecture Training Results Installation Usage Contributing License Project Overview This project applies machine learning techniques to classify urban sound samples. The UrbanSound8K dataset contains 8,732 labeled sound excerpts (≤ 4s) of urban sounds, categorized into 10 classes such as air conditioner, dog bark, children playing, and others.

                                                         Dataset

The dataset used is the UrbanSound8K dataset, which contains audio samples classified into 10 different urban sound categories. You can download the dataset and place it in the Downloads/UrbanSound8k directory.

                                                          Classes in the dataset:

Air Conditioner Car Horn Children Playing Dog Bark Drilling Engine Idling Gun Shot Jackhammer Siren Street Music Preprocessing The audio samples are preprocessed using the librosa library:

Sample rate: 22,050 Hz Audio features are extracted using Mel-Frequency Cepstral Coefficients (MFCCs), which are popular in speech and audio processing tasks. Feature Extraction For each audio sample, 40 MFCC features are extracted. These features are then averaged to obtain a fixed-length feature vector for every sample.

                                                    python
                                                    Copy code

def features_extractor(file): audio, sample_rate = librosa.load(file, res_type='kaiser_fast') mfccs_features = librosa.feature.mfcc(y=audio, sr=sample_rate, n_mfcc=40) mfccs_scaled_features = np.mean(mfccs_features.T, axis=0) return mfccs_scaled_features Model Architecture A neural network model is built using TensorFlow and Keras. The architecture consists of:

Input layer: 40 features (MFCCs) Hidden layers: Dense layers with ReLU activation and dropout for regularization Output layer: 10 neurons (softmax activation) for 10 sound classes python Copy code model = Sequential() model.add(Dense(100, input_dim=40, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(200, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(100, activation='relu')) model.add(Dropout(0.5)) model.add(Dense(10, activation='softmax')) Training The model is trained for 100 epochs using the Adam optimizer and categorical crossentropy loss.

                                            python

Copy code model.compile(loss='categorical_crossentropy', metrics=['accuracy'], optimizer='adam') num_epochs = 100 num_batch_size = 32 checkpointer = ModelCheckpoint(filepath='saved_models/audio_classification.keras', verbose=1, save_best_only=True)

start = datetime.now() model.fit(X_train, y_train, batch_size=num_batch_size, epochs=num_epochs, validation_data=(X_test, y_test), callbacks=[checkpointer], verbose=1) duration = datetime.now() - start print("Training completed in time: ", duration) Results After training, the model achieves an accuracy of around 75% on the test dataset.

                                        python
                                        Copy code

test_accuracy = model.evaluate(X_test, y_test, verbose=0) print(f"Test Accuracy: {test_accuracy[1]}") Installation To set up the project locally, follow these steps:

Clone the repository:

                                         bash
                                         Copy code

git clone https://github.com/yourusername/urban-sound-classification.git Install dependencies:

                                          bash
                                          Copy code

pip install -r requirements.txt Download and extract the UrbanSound8K dataset to the Downloads/UrbanSound8k/ directory.

                                         Usage

Run the script to extract features and train the model:

                                        bash
                                        Copy code

python train_model.py Once the model is trained, it will be saved in the saved_models/ directory. You can use the trained model to classify new audio files.

                                      Contributing

If you want to contribute to this project, feel free to submit a pull request or open an issue.

                                      License

This project is licensed under the MIT License. See the LICENSE file for more details.