Accept token_ids in Shortfin LLM Server #862

stbaione · 2025-01-24T00:00:45Z

The MLPerf harness that I'm building sends input_ids instead of text. A request already had an option to carry input_ids, but the server never actually implemented it. With this change, you can send either a string prompt or a pre-tokenized prompt.
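As a rough illustration of the two request shapes described above, here is a minimal client-side sketch. It assumes the request body is a JSON object with a `text` field for string prompts and an `input_ids` field for pre-tokenized prompts. The helper name `build_generate_payload` is hypothetical and not part of shortfin, and the server's actual schema may differ.

```python
from typing import Union


def build_generate_payload(prompt: Union[str, list[int]]) -> dict:
    """Build a request body from a string prompt or a pre-tokenized prompt.

    Hypothetical helper for illustration only. The field names `text` and
    `input_ids` are assumptions about the request schema.
    """
    if isinstance(prompt, str):
        # Plain string prompt: the server is expected to tokenize it.
        return {"text": prompt}
    if isinstance(prompt, list) and all(isinstance(t, int) for t in prompt):
        # Pre-tokenized prompt: token ids are passed through as-is.
        return {"input_ids": prompt}
    raise TypeError("prompt must be a str or a list of int token ids")


# Usage: the same helper covers both kinds of prompt.
text_payload = build_generate_payload("What is the capital of France?")
ids_payload = build_generate_payload([101, 2054, 2003, 102])
```

The token ids in the usage lines are arbitrary placeholders, not the output of any particular tokenizer.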

Add test to cpu integration tests that uses input_ids

Update test ids

stbaione and others added 6 commits January 23, 2025 23:58

Allow shortfin LLM server to accept input_ids along with text,

086b757

Add test to cpu integration tests that uses input_ids

Fix fixture,

7e9bb31

Update test ids

Fix encoded_prompt fixture

d06c9b9

Update path for local irpa file

488bc64

Merge branch 'main' into llm-fix-input-ids

5777acc

Fix path to local Llama file

7024793

stbaione marked this pull request as ready for review January 24, 2025 14:30

stbaione requested a review from rsuderman January 24, 2025 14:52

Remove ipa file xfails

3ba1927
