
[Feature] Long context benchmark enhancement #201

Open
joshuayao opened this issue Nov 12, 2024 · 2 comments
joshuayao commented Nov 12, 2024

Add support for the OPEA LLM endpoint in HELMET.

@joshuayao joshuayao added the feature New feature or request label Nov 12, 2024
@joshuayao joshuayao added this to the v1.2 milestone Nov 12, 2024
@joshuayao joshuayao added this to OPEA Nov 12, 2024
minmin-intel (Collaborator) commented:
I have been trying to reproduce the HELMET results. I implemented a model class to interact with vllm-gaudi: https://github.com/minmin-intel/GenAIEval/blob/test-helmet/evals/evaluation/HELMET/model_utils.py#L204

So far, I have tested the kilt_nq dataset in the RAG category. The accuracy numbers are close to the results published in the paper at an input length of 8k, but the 64k results differ significantly when using the vLLM endpoint. Using the transformers pipeline, however, gives results similar to the paper at both 8k and 64k lengths. Debugging is in progress.
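For reference, a model class for an OpenAI-compatible vLLM endpoint typically just builds a completion payload and POSTs it to the server. The sketch below is illustrative only (class name, endpoint URL, model name, and defaults are assumptions, not the actual model_utils.py implementation):

```python
import json
import urllib.request


class VLLMEndpointModel:
    """Minimal sketch of a client for an OpenAI-compatible vLLM
    /v1/completions endpoint. Names and defaults are hypothetical."""

    def __init__(self, endpoint_url, model_name,
                 max_new_tokens=256, temperature=0.0):
        self.endpoint_url = endpoint_url.rstrip("/") + "/v1/completions"
        self.model_name = model_name
        self.max_new_tokens = max_new_tokens
        self.temperature = temperature

    def build_payload(self, prompt):
        # Temperature 0.0 gives greedy decoding, which keeps
        # benchmark runs deterministic and comparable across backends.
        return {
            "model": self.model_name,
            "prompt": prompt,
            "max_tokens": self.max_new_tokens,
            "temperature": self.temperature,
        }

    def generate(self, prompt):
        # POST the JSON payload and return the generated text.
        data = json.dumps(self.build_payload(prompt)).encode("utf-8")
        req = urllib.request.Request(
            self.endpoint_url,
            data=data,
            headers={"Content-Type": "application/json"},
        )
        with urllib.request.urlopen(req) as resp:
            body = json.load(resp)
        return body["choices"][0]["text"]


# Example payload construction (no server required for this part):
model = VLLMEndpointModel("http://localhost:8000", "meta-llama/Llama-3.1-8B")
payload = model.build_payload("What is the capital of France?")
print(payload["max_tokens"])
```

One thing worth checking when 8k results match but 64k results diverge is whether the endpoint silently truncates prompts that exceed the server's configured `--max-model-len`, since the transformers pipeline and the server may truncate differently.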

joshuayao (Author) replied:


@minmin-intel, have you made any progress on this?

@joshuayao joshuayao moved this to In progress in OPEA Jan 7, 2025