This repo contains the code for reproducing the experiments from our ACL 2021 paper "HateCheck: Functional Tests for Hate Speech Detection Models".
- "Data" contains raw and clean training data as well as the HateCheck test suite along with the model predictions discussed in the paper.
- Notebooks 1-4 cover the experimental pipeline: 1) loading data, 2) training BERT, 3) applying trained BERT to HateCheck, 4) evaluating BERT and commercial model results on HateCheck.
- Notebook 5 provides simple descriptive stats on the test cases in HateCheck.
- "conda-spec-file.txt" specifies all packages in the conda virtual environment in which the experiments were run.
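
The evaluation step in the notebook pipeline (applying a model to HateCheck and scoring the results) comes down to comparing gold labels with model predictions for each functional test. The following is only a minimal sketch of that idea, assuming pandas is installed. The toy rows, the column names (`functionality`, `label_gold`, `pred_model`) and the helper `accuracy_by_functionality` are hypothetical illustrations, not the notebooks' actual code or the data files' confirmed schema.

```python
import pandas as pd

# Toy stand-in for HateCheck test cases joined with one model's predictions.
# ASSUMPTION: column names and values are illustrative, not the repo's schema.
cases = pd.DataFrame(
    {
        "functionality": ["slur_h", "slur_h", "negate_neg_nh", "negate_neg_nh"],
        "label_gold": ["hateful", "hateful", "non-hateful", "non-hateful"],
        "pred_model": ["hateful", "non-hateful", "hateful", "non-hateful"],
    }
)


def accuracy_by_functionality(df, gold_col="label_gold", pred_col="pred_model"):
    """Return the share of correctly classified cases per functional test."""
    # Boolean Series: True where the prediction matches the gold label.
    correct = df[gold_col] == df[pred_col]
    # Group the boolean Series by functional test and average it.
    return correct.groupby(df["functionality"]).mean()


per_test = accuracy_by_functionality(cases)
overall = (cases["label_gold"] == cases["pred_model"]).mean()

print(per_test)
print(f"overall accuracy: {overall:.2f}")
```

Reporting accuracy per functional test rather than a single aggregate score is what lets a test suite like HateCheck point to specific model weaknesses. To try this on real data, replace the toy frame with the test suite and predictions from "Data", adjusting the column names to match the actual files.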