biRNN

Bidirectional RNN [GRU + LSTM] for language identification using SSC

First, run "python setup.py install" to install the MFCC and other DCT-based feature extractors.

Convert your audio files to 16-bit PCM WAV format using Audacity or any other sound-editing tool.
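A quick way to confirm that a converted file really is 16-bit PCM is to inspect its header with Python's standard wave module. This is only a minimal sketch; the helper name is_16bit_pcm is hypothetical and is not part of this repository.

```python
import wave


def is_16bit_pcm(path):
    """Return True if the WAV file at `path` holds uncompressed 16-bit samples.

    Hypothetical helper for illustration; not part of this repository.
    """
    try:
        with wave.open(path, "rb") as wav:
            # A sample width of 2 bytes means 16 bits per sample, and a
            # compression type of "NONE" means plain PCM.
            return wav.getsampwidth() == 2 and wav.getcomptype() == "NONE"
    except wave.Error:
        # Raised by the wave module for compressed or malformed WAV data.
        return False
```

For example, is_16bit_pcm("splateroyyo.wav") should return True for a file exported as described above.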

Prefix each filename with "e" or "s" depending on whether the audio is in English or Spanish, e.g. Platero y Yo (Spanish) -> "splateroyyo.wav"
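The naming convention above means a label can be recovered from the first character of the filename. The sketch below shows one way to do that; the names LANGUAGE_BY_PREFIX and language_from_filename are hypothetical and may not match how gen_features.py reads the prefix.

```python
import os

# Prefix-to-language mapping following the convention described above.
LANGUAGE_BY_PREFIX = {"e": "english", "s": "spanish"}


def language_from_filename(path):
    """Return the language encoded in the first character of the file name.

    Hypothetical helper for illustration; not part of this repository.
    """
    name = os.path.basename(path)
    prefix = name[:1]
    if prefix not in LANGUAGE_BY_PREFIX:
        raise ValueError(f"{name!r} does not start with 'e' or 's'")
    return LANGUAGE_BY_PREFIX[prefix]
```

With this helper, language_from_filename("splateroyyo.wav") gives "spanish", and a file without a valid prefix raises a ValueError instead of being silently mislabeled.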

Run gen_features.py

This will generate two sets of features:
1. Signal features in the frequency domain, divided into blocks
2. MFCC features (or whichever you want from the 'features' folder)
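To illustrate the first feature set, one possible reading of "frequency domain divided into blocks" is to split the signal into fixed-size blocks and take the magnitude spectrum of each block. The sketch below assumes NumPy is installed; the function name block_spectra and the default block size of 1024 samples are hypothetical and are not taken from gen_features.py.

```python
import numpy as np


def block_spectra(signal, block_size=1024):
    """Split a 1-D signal into non-overlapping blocks and return the
    magnitude spectrum of each block as a 2-D array.

    Hypothetical helper for illustration; the real gen_features.py may
    compute its signal features differently.
    """
    signal = np.asarray(signal, dtype=np.float64)
    n_blocks = len(signal) // block_size
    # Drop trailing samples that do not fill a whole block.
    blocks = signal[: n_blocks * block_size].reshape(n_blocks, block_size)
    # rfft keeps only the non-negative frequencies of a real-valued signal,
    # so each row has block_size // 2 + 1 entries.
    return np.abs(np.fft.rfft(blocks, axis=1))
```

The result has one row per block, which is a shape that can be treated as a 2-D input to a convolutional network.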

Give these as input to your NN architecture.

For best results, feed the signal features to a ConvNet and the MFCC features directly to an LSTM or GRU.
