Neural Turing Machines |
arXiv 2014 |
Couples neural networks to external memory resources, which they interact with through attentional processes. |
Neural Machine Translation by Jointly Learning to Align and Translate |
ICLR 2015 |
Conjectures that the use of a fixed-length vector is a bottleneck in improving the performance of the basic encoder-decoder architecture, and proposes to extend it by allowing the model to automatically (soft-)search for the parts of a source sentence that are relevant to predicting a target word, without having to form these parts explicitly as hard segments. |
Effective Approaches to Attention-based Neural Machine Translation |
EMNLP 2015 |
Examines two simple and effective classes of attentional mechanism: a global approach that always attends to all source words, and a local one that only looks at a subset of source words at a time. |
Modeling Coverage for Neural Machine Translation |
ACL 2016 |
Proposes a coverage vector that keeps track of the attention history. The coverage vector is fed to the attention model to help adjust future attention, which lets the NMT system give more consideration to untranslated source words. |
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation |
arXiv 2016 |
Presents GNMT, Google's Neural Machine Translation system, which aims to improve parallelism, accelerate final translation speed, improve the handling of rare words, and encourage generation of an output sentence that is most likely to cover all the words in the source sentence. |
Attention Is All You Need |
NIPS 2017 |
1. Proposes a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. 2. Reports that the model is superior in quality while being more parallelizable and requiring significantly less time to train. |
OpenNMT: Open-Source Toolkit for Neural Machine Translation |
ACL 2017 |
Describes an open-source toolkit for neural machine translation (NMT). The toolkit prioritizes efficiency, modularity, and extensibility, with the goal of supporting NMT research into model architectures, feature representations, and source modalities while maintaining competitive performance and reasonable training requirements. |
Neural Machine Translation and Sequence-to-sequence Models: A Tutorial |
arXiv 2017 |
1. Introduces a new and powerful set of techniques variously called "neural machine translation" or "neural sequence-to-sequence models". 2. Explains the intuition behind the various methods covered, then delves into them with enough mathematical detail to understand them concretely, and culminates with a suggestion for an implementation exercise, where readers can test that they have understood the content in practice. |
Improving the Transformer Translation Model with Document-Level Context |
EMNLP 2018 |
1. Extends the Transformer model with a new context encoder to represent document-level context, which is then incorporated into the original encoder and decoder. 2. Introduces a two-step training method to take full advantage of abundant sentence-level parallel corpora and limited document-level parallel corpora. |