Recall and F1-score are pretty low (~0.65) for unseen instances of a custom entity on a transfer-learned model #9283
-
I have a dataset annotated with a custom entity. Each data point is a long text (not a single sentence), possibly containing multiple entities. The corpus size is around 1200 texts, divided into train, validation, and test sets as follows:
I'm using transfer learning with the pretrained en_core_web_sm model. When I train the model, precision, recall, and F1-score all reach 1.0 for seen instances of the entity in the validation set, but recall is very poor on unseen instances. When predictions are made on the test set, the model performs poorly overall, and especially on unseen instances (~0.55 recall and ~0.68 F1-score). Are there any recommendations to improve the performance of the model, especially on unseen instances?
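For reference, the training setup is roughly along these lines (a simplified sketch, not my exact code; the training data and the label name are placeholders):

```python
import random
import spacy
from spacy.training import Example

# Placeholder data: (text, {"entities": [(start_char, end_char, label)]})
TRAIN_DATA = [
    ("Acme Corp shipped the order on Monday.", {"entities": [(0, 9, "MY_ENTITY")]}),
]

# Start from the pretrained pipeline and fine-tune only its NER component.
nlp = spacy.load("en_core_web_sm")
ner = nlp.get_pipe("ner")
ner.add_label("MY_ENTITY")  # register the custom label (placeholder name)

# Freeze everything except NER during the update loop.
other_pipes = [p for p in nlp.pipe_names if p != "ner"]
with nlp.select_pipes(disable=other_pipes):
    optimizer = nlp.resume_training()
    for epoch in range(10):
        random.shuffle(TRAIN_DATA)
        losses = {}
        for text, annotations in TRAIN_DATA:
            example = Example.from_dict(nlp.make_doc(text), annotations)
            nlp.update([example], sgd=optimizer, losses=losses)
        print(epoch, losses)
```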
-
Can you give a specific example of what kind of entities you're trying to recognize?

Typical solutions for poor generalization are to get more training data or to use data augmentation to make your model more robust. If your model is failing to generalize, it's usually because it doesn't have enough data to find patterns.

Your train, validation, and test sets shouldn't have much overlap - a little is OK, but the whole point of separate datasets is to measure performance on data the model hasn't seen. The fact that the model can perfectly recall instances it saw during training isn't informative.

Also, depending on what you're trying to recognize, an F1 of ~0.70 isn't that bad. There are lots of NER applications where that would be low, but maybe you have a hard case. Hard to say without more details.
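As an example of augmentation, one simple technique is entity substitution: swap each annotated mention for another plausible surface form so the model can't just memorize the training strings. A rough sketch of that idea (this is a hand-rolled approach, not an official spaCy API; the replacement pool and example data are made up):

```python
import random

# Made-up pool of alternative surface forms for the entity type.
ENTITY_POOL = ["Globex", "Initech", "Umbrella Corp"]

def augment(text, entities, n_variants=3):
    """Replace each annotated span with a random alternative surface form,
    shifting the character offsets of later spans to stay consistent."""
    variants = []
    for _ in range(n_variants):
        new_text, new_ents, shift = text, [], 0
        for start, end, label in sorted(entities):
            repl = random.choice(ENTITY_POOL)
            s, e = start + shift, end + shift
            new_text = new_text[:s] + repl + new_text[e:]
            new_ents.append((s, s + len(repl), label))
            shift += len(repl) - (end - start)
        variants.append((new_text, {"entities": new_ents}))
    return variants

print(augment("Acme Corp shipped the order.", [(0, 9, "MY_ENTITY")]))
```

Each variant keeps its character offsets aligned with the rewritten text, so the output can be fed straight into the same training format as the rest of your data.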