README

Dockerized HTTP service for parsing Kenya Hansard transcripts. Receives a Hansard PDF or TXT file as input.

plenaryparser implements the HTTP service for parsing Hansard plenary transcripts (e.g. http://parliament.go.ke/the-national-assembly/house-business/hansard).

committeeparser parses Hansard committee transcripts. This is not implemented yet.

Setup

pip install git+https://github.com/bnjmacdonald/hansardparser

Examples

Run Docker network

TODO: ...

Parse a transcript via POST request

import requests
import json
import io

# Text transcript
f = io.StringIO('''<Header>ORAL ANSWERS TO QUESTIONS </Header>\n
Question No.799\n
<Newspeech>MR. SPEAKER: Mr. Ekidoronot in? Next Question.</Newspeech>\n
Question No.780\n
DR. MANTO asked the Minister for Agriculture :-\n
(a)	 whether he is aware that the demand for sugar will be greater than its .production by 1990; and\n
(b) whether he will, therefore, reconsider\n
the suspended plan to establish an additional sugar factory in Busia District.
''')

# POST request
url = "http://localhost:8000"
res = requests.post(
    url,
    files={'file': f},
    params={'filetype': 'txt', 'line_type_labeler': 'supervised', 'line_speaker_span_labeler': 'hybrid'},
)
assert res.status_code == 200
data = json.loads(res.text)
print(data)
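
When posting files from disk, it can help to derive the query parameters from the file name rather than typing them by hand. The following is a minimal sketch, not part of the package: `build_params` and `SUPPORTED_FILETYPES` are hypothetical names, and the value `'pdf'` for `filetype` is an assumption based on the service accepting PDF or TXT input (only `'txt'` appears in the example above). The labeler defaults simply repeat the values used in that example.

```python
from pathlib import Path

# Assumption: the service accepts 'pdf' and 'txt' as filetype values.
# Only 'txt' is confirmed by the example above.
SUPPORTED_FILETYPES = {"pdf", "txt"}


def build_params(path, line_type_labeler="supervised",
                 line_speaker_span_labeler="hybrid"):
    """Build the query parameters for a parse request (hypothetical helper).

    The filetype is inferred from the file extension, case-insensitively.
    """
    filetype = Path(path).suffix.lstrip(".").lower()
    if filetype not in SUPPORTED_FILETYPES:
        raise ValueError(
            f"unsupported file type {filetype!r}; "
            f"expected one of {sorted(SUPPORTED_FILETYPES)}"
        )
    return {
        "filetype": filetype,
        "line_type_labeler": line_type_labeler,
        "line_speaker_span_labeler": line_speaker_span_labeler,
    }
```

The returned dictionary can be passed as `params=build_params(path)` to `requests.post`, with the file opened in binary mode and passed as `files={'file': fh}`.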

Contributing

If you are interested in contributing to this project, contact @bnjmacdonald.
