This program is an experiment in syncing an audio book to its text, so that the text of the book can be shown on a screen with the current word highlighted. For this purpose we recognize the audio with CMU Sphinx:
pocketsphinx_continuous -infile $name.wav -hmm ./pocketsphinx-0.8/model/hmm/en_US/hub4wsj_sc_8k -lm ./pocketsphinx-0.8/model/lm/en_US/hub4.5000.DMP -backtrace yes -beam 1e-20 -pbeam 1e-20 -lw 2.0 -dict ./pocketsphinx-0.8/model/lm/en_US/hub4.5000.dic -time 1 > $name-parsed.txt
After running the command above we have the recognized text in this format:
the 1540.700000 1540.840000 0.509371
eyes 1540.850000 1541.390000 0.185112
to(2) 1541.400000 1541.590000 0.684926
double 1541.600000 1541.990000 0.963190
blow 1542.000000 1542.520000 0.255136
where(2) 1542.530000 1543.020000 0.024443
...
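Each line appears to hold the word, its start and end times in seconds, and a confidence score. A minimal parsing sketch (parse_recognized is a hypothetical helper, not part of the project):

    # Minimal sketch (not the project's code): each output line is assumed to be
    # "word start_time end_time score"; variant markers like "to(2)" are stripped.
    def parse_recognized(path):
        words = []
        with open(path) as f:
            for line in f:
                parts = line.split()
                if len(parts) != 4:
                    continue                      # skip diagnostic / non-word lines
                word, start, end, score = parts
                word = word.split("(")[0]         # "to(2)" -> "to"
                words.append((word, float(start), float(end), float(score)))
        return words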
Of course the recognized text contains many mistakes, so we run this program to filter them out and get the result.
We have text A (the original book text) and text B (the recognized text). First we take a sequence of words from text A and look for it in text B. Usually we find several occurrences of the sequence, so we remember all of them, then take the next sequence from text A, and so on.
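That search step might look roughly like this (a brute-force sketch; find_occurrences, the fixed window size and the word-list inputs are assumptions, not the program's actual code):

    # Sketch: slide a fixed-size window over text A and record every position in
    # text B where exactly the same words occur.  Window size and the brute-force
    # search are illustrative only.
    def find_occurrences(a_words, b_words, window=5):
        candidates = []                           # one entry per window of text A
        for i in range(0, len(a_words) - window + 1, window):
            seq = a_words[i:i + window]
            positions = [j for j in range(len(b_words) - window + 1)
                         if b_words[j:j + window] == seq]
            candidates.append((i, positions))
        return candidates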
When we are done, we build a directed graph of these text fragments (see the sketch after this list). We link sequence 1 to sequence 2 if:
- sequence 1 is to the left of sequence 2
- the sequences do not cover each other
There are so many links that we usually do not have enough memory to keep all of them, which is why the algorithm is optimized.
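For illustration, the linking rule could be written roughly like this (build_links is a hypothetical helper; I assume each candidate carries its order in text A and its span in text B, which may differ from the real criteria):

    # Sketch of the linking rule.  Each candidate fragment is assumed to carry its
    # order in text A and a (start, end) span in text B.
    def build_links(candidates):
        links = []
        for u, (ua, us, ue) in enumerate(candidates):
            for v, (va, vs, ve) in enumerate(candidates):
                left_of = ua < va                 # sequence u is to the left of sequence v
                no_cover = ue < vs or ve < us     # their spans do not cover each other
                if left_of and no_cover:
                    links.append((u, v))
        return links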
After creating the links we use the Bellman-Ford algorithm to find the longest path in the graph; this path is our result.
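Since every link points from a left fragment to a right one, the graph has no cycles, so a Bellman-Ford-style relaxation that maximises instead of minimises converges. A rough sketch (longest_path, weight and the scoring are illustrative, not the program's actual code):

    # Sketch: Bellman-Ford-style relaxation, maximising instead of minimising.
    # weight[v] is how much fragment v contributes (e.g. its length), so the best
    # path covers as much matching text as possible.
    def longest_path(n_nodes, links, weight):
        best = [float(w) for w in weight]         # best path score ending at each node
        parent = [None] * n_nodes
        for _ in range(n_nodes - 1):              # classic |V| - 1 relaxation rounds
            changed = False
            for u, v in links:
                if best[u] + weight[v] > best[v]:
                    best[v] = best[u] + weight[v]
                    parent[v] = u
                    changed = True
            if not changed:
                break
        node = max(range(n_nodes), key=best.__getitem__)
        path = []                                 # reconstruct the path backwards
        while node is not None:
            path.append(node)
            node = parent[node]
        return list(reversed(path))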