Repository for the paper "Unsupervised Simplification of Legal Texts".
We have gathered a new dataset for legal text simplification. To that end, we selected 1000 random legal sentences from the Caselaw Access Project of Harvard Law School. Then, in collaboration with the faculty and students of Bilkent Law School, we produced 3 different simplified reference files for these 1000 sentences. We hope that this dataset can serve as a benchmark for future legal text simplification studies.
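As a rough sketch of how the benchmark could be consumed, the snippet below reads one file of original sentences and any number of reference files (three in this dataset) and pairs them line by line. The one-sentence-per-line layout, the function names, and the file names in the usage comment are assumptions made for illustration, not the repository's documented layout.

```python
from pathlib import Path
from typing import Dict, List


def read_lines(path: Path) -> List[str]:
    """Read a UTF-8 text file as a list of lines (assumed: one sentence per line)."""
    return path.read_text(encoding="utf-8").splitlines()


def load_benchmark(source: Path, references: List[Path]) -> List[Dict[str, object]]:
    """Pair each original sentence with its reference simplifications.

    The pairing is purely positional, so a ValueError is raised if any
    reference file has a different number of lines than the source file.
    """
    originals = read_lines(source)
    ref_columns = [read_lines(p) for p in references]
    for path, column in zip(references, ref_columns):
        if len(column) != len(originals):
            raise ValueError(
                f"{path} has {len(column)} lines, expected {len(originals)}"
            )
    return [
        {"original": orig, "references": [col[i] for col in ref_columns]}
        for i, orig in enumerate(originals)
    ]


# Hypothetical usage; replace the file names with the actual dataset files:
# pairs = load_benchmark(Path("original.txt"),
#                        [Path("ref1.txt"), Path("ref2.txt"), Path("ref3.txt")])
```

Failing early on a line-count mismatch is a deliberate choice here: a misaligned reference file would otherwise silently pair sentences with the wrong simplifications.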
To run the algorithm proposed in the paper, run the following commands. Python 3.6 or above is required.
conda create -n uslt python=3.10
conda activate uslt
git clone
cd lex-simple
pip install -r requirements.txt
python -m spacy download en_core_web_sm
python -m spacy download en
cd scripts
After running the code above, you will have a .txt file with lexical simplifications. To apply structural simplification on top of lexical simplification, follow the steps in the DiscourseSimplification repository. In particular, run
cd .. #make sure you are in the main directory
git clone
cd DiscourseSimplification
mvn clean install -DskipTests
First, create a directory under DiscourseSimplification at edu/stanford/nlp/models/pos-tagger/english-left3words, and move the Stanford NLP taggers from this drive link into that folder. Then, create an empty file called 'input.txt' inside this directory and paste in the lexically simplified document generated by the code. Then, run
mvn clean compile exec:java
cd ..
You have now generated the final .txt file!
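The manual preparation for DiscourseSimplification described above (creating the tagger directory and filling input.txt) could also be scripted. The following is a minimal sketch under assumptions: `repo_dir`, `tagger_source`, and `lexical_output` are hypothetical names for the cloned DiscourseSimplification directory, the folder where the downloaded Stanford taggers were saved, and the .txt file produced by the lexical step. It also assumes input.txt sits directly under the DiscourseSimplification directory, and it copies the taggers rather than moving them.

```python
import shutil
from pathlib import Path

# Relative path named in the instructions above.
TAGGER_SUBDIR = Path("edu/stanford/nlp/models/pos-tagger/english-left3words")


def prepare_discourse_input(repo_dir: Path, tagger_source: Path, lexical_output: Path) -> Path:
    """Create the tagger directory, copy the taggers, and write input.txt.

    All three arguments are hypothetical names chosen for this sketch.
    Returns the path of the input.txt file that was written.
    """
    tagger_dir = repo_dir / TAGGER_SUBDIR
    tagger_dir.mkdir(parents=True, exist_ok=True)

    # Copy every downloaded tagger file into the expected folder.
    for item in tagger_source.iterdir():
        if item.is_file():
            shutil.copy2(item, tagger_dir / item.name)

    # Assumption: input.txt lives directly under the repository directory.
    input_file = repo_dir / "input.txt"
    shutil.copyfile(lexical_output, input_file)
    return input_file
```

Running this before `mvn clean compile exec:java` would leave the repository in the state the manual steps describe, provided the assumed locations match your setup.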
You need to install EASSE. To do so, please follow the guides in
After gathering the text outputs, run