alfred-ls/NER-prodigy-logWandB

🪐 Prodigy / spaCy / wandb pipeline based on 🪐 spaCy Projects

Use Case 01: NER on BBC & NG news

  • Annotation process via Prodigy annotation tool
  • Weights & Biases for logging of training experiments

NER project for spaCy v3. The project data comes from Kaggle.

Label scheme:

Component     Label
NER           PERSON
ENTITY_RULER  EMAIL

📋 project.yml

The project.yml defines the data assets required by the project, as well as the available commands and workflows.

⏯ Commands

The following commands are defined by the project. They can be executed using spacy project run [name]. Commands are only re-run if their inputs have changed.

Command          Description
data-to-spacy    Merge your annotations and create data in spaCy's binary format
train_spacy      Train a named entity recognition model with spaCy and log the results via wandb
train_prodigy    Train a named entity recognition model with Prodigy
train_curve      Train the model with Prodigy on different portions of the training examples to evaluate whether more annotations could improve performance
evaluate         Evaluate the model and export metrics via spaCy
visualize-model  Visualize the model's output interactively using Streamlit
visualize-data   Visualize the data interactively using Streamlit
package          Package the trained model so it can be installed
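
The idea behind train_curve can be illustrated with a short sketch: train on growing portions of the data and compare scores. The helper below only cuts the portions. `portions` and its parameters are hypothetical names chosen for illustration, the four evenly spaced steps are an assumption, and this is not Prodigy's implementation.

```python
import random


def portions(examples, n_steps=4, seed=0):
    """Yield (fraction, subset) pairs of increasing size.

    Hypothetical helper for illustration; not Prodigy's implementation.
    """
    shuffled = list(examples)
    # A fixed seed keeps the subsets reproducible; slicing one shuffled
    # list makes each subset contain the previous one.
    random.Random(seed).shuffle(shuffled)
    for step in range(1, n_steps + 1):
        fraction = step / n_steps
        cutoff = int(len(shuffled) * fraction)
        yield fraction, shuffled[:cutoff]


if __name__ == "__main__":
    docs = [f"doc-{i}" for i in range(1778)]  # same size as the training asset
    for fraction, subset in portions(docs):
        print(f"{fraction:.0%}: {len(subset)} examples")
```

If the score is still rising between the last two portions, collecting more annotations is likely to help; if it has flattened, more data of the same kind probably will not.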

⏭ Workflows

The following workflows are defined by the project. They can be executed using spacy project run [name] and will run the specified commands in order. Commands are only re-run if their inputs have changed.

Workflow     Steps
all          data-to-spacy → train_spacy → evaluate
all_prodigy  train_prodigy → train_curve
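
The "only re-run if inputs have changed" behaviour can be pictured as a checksum comparison. The following is a minimal sketch under stated assumptions: the JSON lock file, `needs_rerun`, and `record_run` are hypothetical and only illustrate the principle, not how spaCy implements it.

```python
import hashlib
import json
from pathlib import Path


def file_checksum(path):
    """Return the MD5 hex digest of a file's contents."""
    return hashlib.md5(Path(path).read_bytes()).hexdigest()


def _load_lock(lock_path):
    """Read the (hypothetical) JSON lock file, or return an empty dict."""
    lock_path = Path(lock_path)
    return json.loads(lock_path.read_text()) if lock_path.exists() else {}


def needs_rerun(name, inputs, lock_path):
    """Return True if any input of command `name` changed since its last recorded run."""
    current = {str(p): file_checksum(p) for p in inputs}
    return _load_lock(lock_path).get(name) != current


def record_run(name, inputs, lock_path):
    """Store the current input checksums for command `name`."""
    lock = _load_lock(lock_path)
    lock[name] = {str(p): file_checksum(p) for p in inputs}
    Path(lock_path).write_text(json.dumps(lock, indent=2))
```

A workflow runner built on this would walk the steps in order, call `needs_rerun` before each one, and call `record_run` after a successful run.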

🗂 Assets

The following raw assets are defined by the project.

File                             Source  Description
assets/raw/UC1_train_meta.jsonl  Local   JSONL-formatted raw training data (1778 docs)
assets/raw/UC1_eval_meta.jsonl   Local   JSONL-formatted raw development data (593 docs)
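
To check the document counts of the raw assets, a small JSONL reader is enough. This sketch assumes one JSON object per line, which is what the JSONL format implies; `read_jsonl` is a hypothetical helper and nothing is assumed about the fields inside each record.

```python
import json
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed object per non-empty line of a JSONL file."""
    with Path(path).open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines
                yield json.loads(line)


if __name__ == "__main__":
    for name in ("assets/raw/UC1_train_meta.jsonl", "assets/raw/UC1_eval_meta.jsonl"):
        if Path(name).exists():
            print(name, sum(1 for _ in read_jsonl(name)))
```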

💯 Insights

Overall annotation count:

Dataset             # Annotations  # PERSON  # EMAIL
correct_UC01_train  3500           1011      272
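
Counts like these could be derived from exported annotation records. The sketch below assumes Prodigy-style dicts with an "answer" key and a "spans" list whose items carry a "label". The README does not say how its numbers were produced, so `count_labels` is a hypothetical helper and the sample records are made up.

```python
from collections import Counter


def count_labels(records):
    """Count accepted records and the span labels they contain.

    Assumes Prodigy-style dicts with an "answer" key and a "spans" list of
    {"start", "end", "label"} dicts; the real dataset may differ.
    """
    totals = Counter()
    for record in records:
        if record.get("answer", "accept") != "accept":
            continue  # skip rejected or ignored examples
        totals["annotations"] += 1
        for span in record.get("spans", []):
            totals[span["label"]] += 1
    return totals


if __name__ == "__main__":
    # Made-up records, for illustration only.
    sample = [
        {"text": "Ask Jane", "answer": "accept",
         "spans": [{"start": 4, "end": 8, "label": "PERSON"}]},
        {"text": "No entities here", "answer": "accept", "spans": []},
    ]
    print(dict(count_labels(sample)))
```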

Annotation details:

W&B Prodigy report

Results:

[Two result images]
