BabbleBot

Text synthesis bot that learns from a given text corpus using an LSTM-based deep neural network and produces new text given a "seed" prompt.

We treat the text as one long list of tokens, break it up into <batch_size> contiguous sequences, divide this stack of long sequences into batches, and process the batches in order using a deep neural network with a stateful LSTM. The learning task is to predict the next token in each sequence given the token at the current timestep and the current state of the LSTM layer.
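The batching scheme described above can be sketched as follows. This is only an illustration under stated assumptions, not the code in this repository: the function name `make_stateful_batches` and the `num_steps` parameter (tokens per sequence per batch) are hypothetical, and the tokens are assumed to be already encoded as integers.

```python
import numpy as np

def make_stateful_batches(token_ids, batch_size, num_steps):
    """Split one long token list into batches suited to a stateful LSTM.

    Hypothetical helper: row k of each batch continues row k of the
    previous batch, so state carried across batches stays meaningful.
    """
    ids = np.asarray(token_ids)
    # One token is held back so every input has a "next token" target.
    row_len = (len(ids) - 1) // batch_size
    n_batches = row_len // num_steps
    if n_batches == 0:
        raise ValueError("corpus too short for this batch_size/num_steps")
    usable = n_batches * num_steps

    # Stack of <batch_size> long contiguous sequences (inputs) and the
    # same sequences shifted by one token (targets).
    x = ids[:batch_size * row_len].reshape(batch_size, row_len)[:, :usable]
    y = ids[1:batch_size * row_len + 1].reshape(batch_size, row_len)[:, :usable]

    # Cut the stack along the time axis into consecutive batches.
    return [
        (x[:, i * num_steps:(i + 1) * num_steps],
         y[:, i * num_steps:(i + 1) * num_steps])
        for i in range(n_batches)
    ]

# Toy corpus of 21 integer tokens, 2 parallel sequences, 5 steps per batch.
batches = make_stateful_batches(range(21), batch_size=2, num_steps=5)
```

Because each row picks up exactly where it left off in the previous batch, the batches must be fed in order and without shuffling; the LSTM state would then typically be reset only at the start of each epoch.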

The example below uses the "Sample European Parliament Proceedings Parallel Corpus", downloaded directly with the NLTK package:

    import babble

    bot = babble.BabbleBot()
    
    # Load the text corpus as a single list of words
    text = load_text_corpus()
    
    # Parse the data and generate the encodings
    # required for training
    bot.parse_text(text)
    
    # Instantiate the model and train
    bot.learn(epochs=100)
    
    # Give the bot a seed string to get it babbling!
    seed = "Today"
    for _ in range(20):
        print(bot.babble(seed, reset=False))
