PolDeepNer2 is an improved version of PolDeepNer. The tool recognizes and categorizes named entities using neural networks and transformer-based language models.
The tool ships with a set of pre-trained models for Polish and other languages.
It contains a pre-trained model trained on the NKJP corpus which recognizes nested annotations.
Authors:
- Michał Marcińczuk
- Jarema Radom
notebooks/pdn2_cpu.py | This notebook shows how to install the module and use its API to process raw text on a CPU. |
PolDeepNer2 achieves state-of-the-art results on the PolEval 2018 dataset.
Model | Score | F1 Overlap | F1 Exact | Score main | Time CPU | Time GPU | Source |
---|---|---|---|---|---|---|---|
PolDeepNer2 | | | | | | | |
HerBERT large, spacy-ext, sq | 92.1 | 92.7 | 89.9 | | | ~2m 24s | |
Polish RoBERTa base, spacy-ext, sq | 91.4 | 91.9 | 89.1 | | ~1.5 h | ~2m 8s | |
Polish RoBERTa base, toki | 90.0 | 90.5 | 87.7 | 92.40 | ~6h 30m | ~6m 30s | |
Polish RoBERTa base, spacy-ext | 89.8 | 90.4 | 87.4 | 92.20 | | ~8m 2s | |
Systems published after PolEval 2018 | | | | | | | |
Dadas et al. 2020 [1] | 88.6 | 87.0 | 89.0 | - | - | - | link |
Polish RoBERTa (large) [1] | - | - | - | 89.98 | - | - | link |
Polish RoBERTa (base) [1] | - | - | - | 87.94 | - | - | link |
spaCy (pl_spacy_model) | - | - | - | 87.50 | ~3m | - | link |
Top 3 systems from PolEval 2018 | | | | | | | |
Applica.ai | 86.6 | 87.7 | 82.6 | - | - | - | link |
PolDeepNer | 85.1 | 85.9 | 82.2 | - | - | ~9m | link |
Liner2 | 81.0 | 81.8 | 77.8 | - | ~3m | - | link |
[1] The model is not available. Only the evaluation results were published.
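The Score column is consistent with the PolEval 2018 NER weighting, in which the final score is 0.8 × F1 Overlap plus 0.2 × F1 Exact. The sketch below assumes that weighting and recomputes the score for the PolDeepNer2 rows. The function name `poleval_score` is illustrative and not part of PolDeepNer2. Small last-digit differences are expected because the published F1 values are already rounded.

```python
def poleval_score(f1_overlap: float, f1_exact: float) -> float:
    """Weighted score, assuming the PolEval 2018 NER weights (0.8 / 0.2)."""
    return 0.8 * f1_overlap + 0.2 * f1_exact


# (F1 Overlap, F1 Exact, reported Score) copied from the table above.
ROWS = {
    "HerBERT large, spacy-ext, sq": (92.7, 89.9, 92.1),
    "Polish RoBERTa base, spacy-ext, sq": (91.9, 89.1, 91.4),
    "Polish RoBERTa base, toki": (90.5, 87.7, 90.0),
    "Polish RoBERTa base, spacy-ext": (90.4, 87.4, 89.8),
}

for name, (overlap, exact, reported) in ROWS.items():
    computed = poleval_score(overlap, exact)
    print(f"{name}: computed {computed:.2f}, reported {reported}")
```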
Model | Library | Tokenizer | Model loading [s] | Preprocessing [s] | NE recognition [s] | Total [s] |
---|---|---|---|---|---|---|
Polish RoBERTa base | fairseq | - | 12.28 | 50.90 | 65.23 | 128.4 |
HerBERT large | HuggingFace | HerbertTokenizerFast | 18.44 | 50.83 | 103.70 | 173.0 |
HerBERT large | HuggingFace | XLMTokenizer | 18.33 | 51.42 | 177.50 | 247.3 |
- Dataset size: 1828 documents (3M characters).
- GPU: RTX Titan (24 GB, 4608 CUDA cores).
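As a quick sanity check on the timing table, the three stage times should add up to the reported total, and the NE recognition time can be turned into a rough throughput figure. The sketch below is illustrative only. It assumes the dataset size of 3.072M characters reported by the processing script, and the helper names are hypothetical.

```python
# Stage timings in seconds, copied from the table above:
# (model loading, preprocessing, NE recognition, reported total).
TIMINGS = {
    "Polish RoBERTa base (fairseq)": (12.28, 50.90, 65.23, 128.4),
    "HerBERT large (HerbertTokenizerFast)": (18.44, 50.83, 103.70, 173.0),
    "HerBERT large (XLMTokenizer)": (18.33, 51.42, 177.50, 247.3),
}

# Assumption: dataset size as printed by the processing script.
DATASET_CHARS = 3_072_000


def stage_sum(stages):
    """Sum of the three measured stages, in seconds."""
    return sum(stages[:3])


def ne_throughput(stages, chars=DATASET_CHARS):
    """Characters per second during the NE recognition stage only."""
    return chars / stages[2]


for name, stages in TIMINGS.items():
    print(
        f"{name}: stages sum to {stage_sum(stages):.2f} s "
        f"(reported {stages[3]} s), "
        f"{ne_throughput(stages):,.0f} chars/s during NE recognition"
    )
```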
Dataset | Size [million chars] | NER time [minutes] |
---|---|---|
PolEval 2018 NER test dataset | 3 | 2.6 |
Monthly volume of news from Polish news portals [70 sources] | 160 | 136.9 |
Polish Wikipedia (2013 dump) | 1000 | 855.6 |
Annual volume of news from Polish news portals [70 sources] | 1920 | 1642.7 |
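The estimates in this table look like a linear extrapolation from a fixed per-character cost. The sketch below makes that assumption explicit and derives the rate from the Polish Wikipedia row (855.6 minutes for 1000 million characters). The function name is hypothetical, and real runs may deviate from linear scaling because of document length, batching and hardware.

```python
# Assumption: NER time scales linearly with the number of characters.
# Rate taken from the Polish Wikipedia row of the table above.
MINUTES_PER_MILLION_CHARS = 855.6 / 1000


def estimate_ner_minutes(size_million_chars: float) -> float:
    """Estimated NER time in minutes for a corpus of the given size."""
    return size_million_chars * MINUTES_PER_MILLION_CHARS


for label, size in [
    ("PolEval 2018 NER test dataset", 3),
    ("Monthly news volume", 160),
    ("Annual news volume", 1920),
]:
    minutes = estimate_ner_minutes(size)
    print(f"{label}: ~{minutes:.1f} min (~{minutes / 60:.1f} h)")
```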
Inner-corpora evaluation
Model | Eval | Precision | Recall | F-measure | Support | Source |
---|---|---|---|---|---|---|
PolDeepNer2 (kpwr_n82_base) | KPWr | 75.02 | 77.67 | 76.32 | 4430 | |
PolDeepNer2 (kpwr_n82_large) | KPWr | 77.05 | 78.79 | 77.91 | 4430 | |
PolDeepNer (n82-elmo-kgr10) | KPWr | 73.97 | 75.49 | 74.72 | 4430 | link |
--- | | | | | | |
PolDeepNer2 (cen_n82_base) | CEN | 84.64 | 85.95 | 85.29 | 1423 | |
PolDeepNer2 (cen_n82_large) | CEN | 86.94 | 88.40 | 87.67 | 1423 | |
Cross-corpora evaluation
Model | Eval | Precision | Recall | F-measure | Support |
---|---|---|---|---|---|
PolDeepNer2 (kpwr_n82_base) | CEN | 80.90 | 81.87 | 81.38 | 1423 |
PolDeepNer2 (kpwr_n82_large) | CEN | 80.16 | 82.08 | 81.11 | 1423 |
--- | | | | | |
PolDeepNer2 (cen_n82_base) | KPWr | 58.58 | 64.79 | 61.53 | 4430 |
PolDeepNer2 (cen_n82_large) | KPWr | 61.38 | 66.66 | 63.91 | 4430 |
Create and activate conda environment:
conda create -n pdn2 python=3.6
conda activate pdn2
Install CUDA, CuDNN and Torch:
conda install -c anaconda cudatoolkit=10.1
conda install -c anaconda cudnn
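Once the remaining installation steps are done, it can be useful to confirm that PyTorch actually sees the GPU before passing a CUDA device to the scripts. The sketch below is a minimal check and not part of PolDeepNer2. The helper name `pick_device` is hypothetical, and it falls back to `cpu` when PyTorch is not importable or no CUDA device is visible.

```python
def pick_device() -> str:
    """Return "cuda:0" if PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch  # imported lazily so the check also runs without PyTorch
    except ImportError:
        return "cpu"
    return "cuda:0" if torch.cuda.is_available() else "cpu"


print(f"Suggested device: {pick_device()}")
```

The returned string has the same form as the value passed to `--device` in the processing commands.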
Install PolDeepNer2:
pip install https://pypi.clarin-pl.eu/packages/poldeepner2-0.5.0-py3-none-any.whl#md5=6a6131d1b3d104f0bbed87ec6969a841
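The `#md5=` fragment in the URL above carries the expected checksum of the wheel. If the wheel is downloaded manually, for example for an offline installation, its integrity can be checked against that value. The sketch below uses only the standard library. The helper names and the local file name are assumptions for illustration.

```python
import hashlib
from pathlib import Path

# Checksum copied from the "#md5=" fragment of the URL above.
EXPECTED_MD5 = "6a6131d1b3d104f0bbed87ec6969a841"


def md5_of_file(path, chunk_size=1 << 20) -> str:
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def wheel_is_intact(path, expected=EXPECTED_MD5) -> bool:
    """True if the file's MD5 digest matches the expected value."""
    return md5_of_file(path) == expected.lower()


# Assumed local file name; adjust it to wherever the wheel was saved.
wheel = Path("poldeepner2-0.5.0-py3-none-any.whl")
if wheel.exists():
    print("checksum OK" if wheel_is_intact(wheel) else "checksum mismatch")
```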
Install the spaCy model:
python -m spacy download pl_core_news_sm
Download the evaluation dataset:
wget http://mozart.ipipan.waw.pl/~axw/poleval2018/POLEVAL-NER_GOLD.json -O POLEVAL-NER_GOLD.json
Process the dataset with the Polish RoBERTa base model (nkjp-base-sq):
python process_poleval.py \
--input POLEVAL-NER_GOLD.json \
--output pdn2_nkjp_roberta_base_sq.json \
--model nkjp-base-sq \
--device cuda:0
Output:
Model loading time : 12.28 second(s)
Data preprocessing time : 50.9 second(s)
Data NE recognition time : 65.23 second(s)
Total time : 128.4 second(s)
Data size: : 3.072M characters
Evaluate:
python poleval_ner_test.py \
--goldfile POLEVAL-NER_GOLD.json \
--userfile pdn2_nkjp_roberta_base_sq.json
Output:
OVERLAP precision: 0.927 recall: 0.912 F1: 0.919
EXACT precision: 0.899 recall: 0.884 F1: 0.891
Final score: 0.914
Exact TP=32971 ; FP=3709; FN=4335
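The EXACT precision, recall and F1 can be recomputed from the TP, FP and FN counts in the last line of the output. The sketch below uses the standard definitions; it is not the code of the evaluation script, and the function name is illustrative.

```python
def precision_recall_f1(tp: int, fp: int, fn: int):
    """Standard precision, recall and F1 from raw match counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts copied from the "Exact TP=...; FP=...; FN=..." line above.
p, r, f1 = precision_recall_f1(tp=32971, fp=3709, fn=4335)
print(f"EXACT precision: {p:.3f} recall: {r:.3f} F1: {f1:.3f}")
```

Rounded to three decimals, these values agree with the EXACT line reported by the evaluation script (0.899, 0.884, 0.891).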
Process the dataset with the HerBERT large model (nkjp-herbert-large-sq):
python process_poleval.py \
--input POLEVAL-NER_GOLD.json \
--output pdn2_nkjp_herbert_large_sq.json \
--model nkjp-herbert-large-sq \
--device cuda:0
Output:
Model loading time : 18.44 second(s)
Data preprocessing time : 50.83 second(s)
Data NE recognition time : 103.7 second(s)
Total time : 173.0 second(s)
Data size: : 3.072M characters
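When several models are compared, the processing command can be assembled programmatically. The sketch below only builds and prints the command lines, using the script name and flags shown in this README. The helper `build_process_cmd` is hypothetical, and the actual call is left commented out.

```python
import shlex


def build_process_cmd(model, output,
                      gold="POLEVAL-NER_GOLD.json", device="cuda:0"):
    """Build the argument list for process_poleval.py (flags as used above)."""
    return [
        "python", "process_poleval.py",
        "--input", gold,
        "--output", output,
        "--model", model,
        "--device", device,
    ]


# Model names and output files as used in the commands in this README.
RUNS = {
    "nkjp-base-sq": "pdn2_nkjp_roberta_base_sq.json",
    "nkjp-herbert-large-sq": "pdn2_nkjp_herbert_large_sq.json",
}

for model, output in RUNS.items():
    cmd = build_process_cmd(model, output)
    print(" ".join(shlex.quote(arg) for arg in cmd))
    # To execute: import subprocess; subprocess.run(cmd, check=True)
```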
Evaluate:
python poleval_ner_test.py \
--goldfile POLEVAL-NER_GOLD.json \
--userfile pdn2_nkjp_herbert_large_sq.json
Output:
OVERLAP precision: 0.929 recall: 0.922 F1: 0.926
EXACT precision: 0.903 recall: 0.896 F1: 0.900
Final score: 0.921
Exact TP=33433 ; FP=3596; FN=3873
- This code is based on xlm-roberta-ner by mohammadKhalifa.
- Language models for Polish: