No proper data to initialize the Model of component 'tok2vec' #11428

Hi @lorenzo82,

My guess is that something went wrong when the JSON file was converted to the .spacy file. Where did the JSON file come from? (Is it from spaCy v2.x?) Assuming you converted it using convert, a reasonable next step is to load the .spacy files manually and check for empty docs.

```python
import spacy
from spacy.tokens import DocBin

nlp = spacy.blank("xx")  # or a language code, e.g. `en`
doc_bin = DocBin().from_disk("path/to/file.spacy")
docs = list(doc_bin.get_docs(nlp.vocab))
```

There could be several reasons why this happens. The JSON file may be formatted incorrectly, or an error may have occurred during conversion.
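To make the "check for empty docs" step concrete, here is a minimal sketch that builds on the snippet above. The helper names `find_empty_docs` and `load_docs` are hypothetical (they are not part of spaCy), and the sketch assumes that "empty" means a doc with zero tokens; your data may have other problems, such as docs that have tokens but no annotations.

```python
from typing import Iterable, List, Sized


def find_empty_docs(docs: Iterable[Sized]) -> List[int]:
    """Return the positions of docs that contain no tokens.

    Works on anything that supports len(), including spaCy Doc objects.
    """
    return [i for i, doc in enumerate(docs) if len(doc) == 0]


def load_docs(path: str, lang: str = "xx") -> list:
    """Load every Doc stored in a .spacy file (hypothetical helper)."""
    # Imported lazily so find_empty_docs can be used without spaCy installed.
    import spacy
    from spacy.tokens import DocBin

    nlp = spacy.blank(lang)  # only the vocab is needed to deserialize
    doc_bin = DocBin().from_disk(path)
    return list(doc_bin.get_docs(nlp.vocab))
```

For example, `docs = load_docs("path/to/file.spacy")` followed by `find_empty_docs(docs)` gives the positions of any empty docs. If `docs` itself is empty, or most positions come back, the conversion step is the likely culprit.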

Answer selected by lorenzo82
Labels
feat / tok2vec Feature: Token-to-vector layer and pretraining