-
I implemented question answering using the "deepset/bert-base-cased-squad2" model from Hugging Face.
However, if I change the Hugging Face model to "mrm8488/longformer-base-4096-finetuned-squadv2", a Longformer model that accepts input paragraphs of up to 4096 tokens, I get the following error:
How can I solve this problem? Any help is welcome.
-
I think the problem is:
The model doesn't take `token_type_ids` as input. If you convert the model with
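For illustration only (this is not the original conversion script): `torch.jit.trace` freezes the traced module's positional inputs, so a model traced with only `input_ids` and `attention_mask` will reject a third `token_type_ids` tensor at inference time. A minimal stand-in module shows the effect:

```python
import torch

class TwoInputQA(torch.nn.Module):
    """Stand-in for a QA model that takes only input_ids and attention_mask."""
    def forward(self, input_ids, attention_mask):
        # Dummy "logits": masked ids cast to float.
        return (input_ids * attention_mask).float()

model = TwoInputQA().eval()
ids = torch.tensor([[1, 2, 3]])
mask = torch.tensor([[1, 1, 0]])

# Tracing fixes the signature at exactly these two inputs.
traced = torch.jit.trace(model, (ids, mask))

print(tuple(traced(ids, mask).shape))  # (1, 3)

try:
    # A translator that also sends token_type_ids triggers this failure.
    traced(ids, mask, torch.zeros_like(ids))
except (RuntimeError, TypeError):
    print("extra token_type_ids-style input rejected")
```

The same mismatch happens in reverse: a BERT model traced with three inputs fails when the Longformer tokenizer supplies only two.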
-
Thanks @frankfliu for your answer. Correct, I converted the model with
-
This is the code to reproduce the error:
```java
import ai.djl.ModelException;
import ai.djl.huggingface.translator.QuestionAnsweringTranslatorFactory;
import ai.djl.inference.Predictor;
import ai.djl.modality.nlp.qa.QAInput;
import ai.djl.repository.zoo.Criteria;
import ai.djl.repository.zoo.ZooModel;
import ai.djl.training.util.ProgressBar;
import ai.djl.translate.TranslateException;

import java.io.IOException;
import java.nio.file.Paths;

public class HuggingFaceLongformQaInference {

    public static void main(String[] args) throws IOException, TranslateException, ModelException {
        String question = "Where is my house?";
        String paragraph = "My house is in London.";
        QAInput input = new QAInput(question, paragraph);
        String answer = HuggingFaceLongformQaInference.qa_predict(input);
        System.out.println(answer); // --> London
    }

    public static String qa_predict(QAInput input) throws IOException, TranslateException, ModelException {
        Criteria<QAInput, String> criteria = Criteria.builder()
                .setTypes(QAInput.class, String.class)
                .optModelPath(Paths.get("./model/longformer-base-4096-finetuned-squadv2/longformer-base-4096-finetuned-squadv2.pt"))
                .optTranslatorFactory(new QuestionAnsweringTranslatorFactory())
                .optEngine("PyTorch")
                .optProgress(new ProgressBar())
                .build();
        // Close both the model and the predictor when done.
        try (ZooModel<QAInput, String> model = criteria.loadModel();
                Predictor<QAInput, String> predictor = model.newPredictor()) {
            return predictor.predict(input);
        }
    }
}
```

gradle dependencies:
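The original dependency list was cut off. For reference, a typical Gradle setup for this code would pull in DJL's core API, the HuggingFace tokenizers extension (which contains `QuestionAnsweringTranslatorFactory`), and the PyTorch engine; the version below is a placeholder, not necessarily the one the poster used:

```groovy
dependencies {
    // DJL core API (Criteria, ZooModel, Predictor, ...)
    implementation "ai.djl:api:0.23.0"                    // version is a placeholder
    // HuggingFace tokenizer/translator support
    implementation "ai.djl.huggingface:tokenizers:0.23.0"
    // PyTorch engine, resolved at runtime
    runtimeOnly "ai.djl.pytorch:pytorch-engine:0.23.0"
}
```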
-
@xxx24xxx You are right, padding the input doesn't really work.
I think you have to modify the model code in `modeling_longformer.py` at line 678. Change:

to:

You don't need padding any more; the existing `djl-convert` and Java code should work.