POS Tagging is Broken for Sliced Pipelines #13225

Hi!

Sorry that this has been confusing. You also need to enable the tok2vec component:

import spacy

nlp = spacy.load("en_core_web_sm", enable=["tok2vec", "lemmatizer", "tagger", "parser", "attribute_ruler"])

If you look at the en_core_web_sm package that's installed in your venv, you can open the config.cfg and find something like this:

[components.tagger.model.tok2vec]
@architectures = "spacy.Tok2VecListener.v1"
width = ${components.tok2vec.model.encode:width}
upstream = "tok2vec"

This means that the tagger model relies on the tok2vec component in the pipeline: it "listens" to it to obtain word embeddings. The parser does, too. If tok2vec is not enabled, the listeners have nothing to listen to, so make sure to enable tok2vec together with the tagger and parser.
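To see which components depend on tok2vec before slicing a pipeline, something like the following sketch may help. It works on a plain nested dict shaped like the `[components]` section of config.cfg; the helper names `find_listeners` and `missing_upstreams` and the `SAMPLE_COMPONENTS` dict are made up for illustration and are not part of spaCy. The sample values are hand-written from the excerpt above, not read from an installed package.

```python
from typing import Any, Dict, Iterable, List

# Hand-written stand-in for the [components] section of config.cfg.
# Only the keys relevant to listeners are included (assumption).
SAMPLE_COMPONENTS: Dict[str, Dict[str, Any]] = {
    "tok2vec": {"model": {"@architectures": "spacy.Tok2Vec.v2"}},
    "tagger": {
        "model": {
            "tok2vec": {
                "@architectures": "spacy.Tok2VecListener.v1",
                "upstream": "tok2vec",
            }
        }
    },
    "parser": {
        "model": {
            "tok2vec": {
                "@architectures": "spacy.Tok2VecListener.v1",
                "upstream": "tok2vec",
            }
        }
    },
    "attribute_ruler": {},
    "lemmatizer": {},
}


def find_listeners(components: Dict[str, Dict[str, Any]]) -> Dict[str, str]:
    """Map each listening component to the upstream component it names."""
    listeners: Dict[str, str] = {}
    for name, cfg in components.items():
        tok2vec_cfg = cfg.get("model", {}).get("tok2vec", {})
        arch = tok2vec_cfg.get("@architectures", "")
        if arch.startswith("spacy.Tok2VecListener"):
            # "*" is treated here as "any upstream tok2vec" (assumption).
            listeners[name] = tok2vec_cfg.get("upstream", "*")
    return listeners


def missing_upstreams(
    components: Dict[str, Dict[str, Any]], enabled: Iterable[str]
) -> List[str]:
    """Return upstream components that enabled listeners need but lack."""
    enabled_set = set(enabled)
    needed = {
        upstream
        for name, upstream in find_listeners(components).items()
        if name in enabled_set and upstream != "*"
    }
    return sorted(needed - enabled_set)


if __name__ == "__main__":
    print(find_listeners(SAMPLE_COMPONENTS))
    print(missing_upstreams(SAMPLE_COMPONENTS, ["tagger", "parser"]))
```

With a loaded pipeline, `nlp.config["components"]` should provide a mapping of the same shape that can be passed in place of the sample dict, though that is worth checking against your installed spaCy version.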

Answer selected by svlandeg
Labels: lang / en (English language data and models), feat / tagger (Feature: Part-of-speech tagger), feat / pipeline (Feature: Processing pipeline and components)
This discussion was converted from issue #13222 on January 08, 2024 15:38.