End-to-End Speech Recognition Models

This repository contains end-to-end automatic speech recognition models. It does not include training code or audio and text preprocessing code. If you want to see the code other than the models, please refer to here.
Many open-source speech recognition projects bundle all of the training-related code, which makes it hard to see the model structure on its own. So I created a repository that holds only the models I have implemented and made it public.
I will keep adding the speech recognition models that I implement.

Implementation List

Deep Speech 2
  Dario Amodei et al. Deep Speech 2: End-to-End Speech Recognition in English and Mandarin
  SeanNaren. deepspeech.pytorch
Listen, Attend and Spell (modified version)
  William Chan et al. Listen, Attend and Spell
  Takaaki Hori et al. Advances in Joint CTC-Attention based E2E ASR with a Deep CNN Encoder and RNN-LM
  IBM. pytorch-seq2seq
  clovaai. ClovaCall
Speech Transformer
  Ashish Vaswani et al. Attention Is All You Need
  Yuanyuan Zhao et al. The SpeechTransformer for Large-scale Mandarin Chinese Speech Recognition
  kaituoxu. Speech-Transformer
Jasper
  Jason Li et al. Jasper: An End-to-End Convolutional Neural Acoustic Model
  NVIDIA. DeepLearningExamples
Voice Activity Detection (1-dimensional ResNet model)
  filippogiruzzi. voice_activity_detection

Troubleshooting and Contributing

If you have any questions, bug reports, or feature requests, please open an issue on GitHub.

I appreciate any kind of feedback or contribution. Feel free to go ahead with small changes such as bug fixes and documentation improvements. For major contributions and new features, please discuss them with the collaborators in the corresponding issues.

Code Style

I follow PEP 8 for code style. Docstring style is especially important, because the documentation is generated from the docstrings.
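As a rough illustration of what a PEP 8 compliant function with a documentation-friendly docstring might look like, here is a minimal sketch. The helper `conv_output_length` is hypothetical and not part of this repository, and the Google-style `Args:` / `Returns:` layout is an assumption, since the exact docstring convention is not stated here.

```python
def conv_output_length(
    input_length: int,
    kernel_size: int,
    stride: int = 1,
    padding: int = 0,
) -> int:
    """Compute the number of output frames of a 1-D convolution.

    Hypothetical helper, shown only to illustrate docstring layout.
    Assumes a dilation of 1.

    Args:
        input_length (int): Number of input frames.
        kernel_size (int): Width of the convolution kernel.
        stride (int): Step between kernel positions. Default: 1.
        padding (int): Zero-padding added to both sides. Default: 0.

    Returns:
        int: Number of frames produced by the convolution.

    Raises:
        ValueError: If ``stride`` is not a positive integer.
    """
    if stride <= 0:
        raise ValueError("stride must be a positive integer")
    # Standard output-length formula for a convolution with dilation 1.
    return (input_length + 2 * padding - kernel_size) // stride + 1


if __name__ == "__main__":
    # 100 input frames, kernel 3, stride 2, padding 1
    print(conv_output_length(100, kernel_size=3, stride=2, padding=1))
```

Keeping the summary on the first line and the argument descriptions in a regular layout is what lets a documentation generator pick them up without extra markup.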

License

This project is licensed under the Apache-2.0 License. See the LICENSE.md file for details.

Author

Soohwan Kim @sooftware
Contacts: [email protected]
