This project parses the FDA Food Code and converts it into plain text and JSON for easier use.
Currently, it doesn't do anything beyond simply pulling the text out of each page and save it to flat files by page number. Better parsing may be done in the future; contributions welcome!
- Ruby (tested against 2.1.2, but 1.9+ should work)
- Bundler (
gem install bundler
) - Rake (
gem install rake
) - pdftohtml
On OS X, use Homebrew to install pdftohtml. The pdftohtml package doesn't work with the 2013 food code
document, so install Poppler instead: brew install poppler
.
On Debian-based Linux distros, the pdftohtml
package works fine: apt-get install pdftohtml
.
Before your first use, you'll need to make sure you have the necessary gems installed. From the directory the code is
in, run bundle install
. (If you aren't using RVM or something similar, you may need to run bundler
as root/through sudo.)
Once the gems are installed, simply run rake
to download the 2013 PDF, convert it into an XML document, and generate
the flat files separated by page.