Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

vllm + api方式调用失败 #1144

Closed
xiaohaiqing opened this issue Mar 13, 2024 · 1 comment
Closed

vllm + api方式调用失败 #1144

xiaohaiqing opened this issue Mar 13, 2024 · 1 comment
Labels
question Further information is requested

Comments

@xiaohaiqing
Copy link

起始日期 | Start Date

No response

实现PR | Implementation PR

启动方式:

python -m fastchat.serve.controller > controller.log 2>&1 &
python -m fastchat.serve.vllm_worker --model-path /home/qwen/lora/Qwen-7B-Chat-Int4/ --tensor-parallel-size 1 --trust-remote-code --dtype float16  > model_worker.log 2>&1 &
python -m fastchat.serve.openai_api_server --host 0.0.0.0 --port 8001 > api_server.log 2>&1 &

调用:
image

相关Issues | Reference Issues

No response

摘要 | Summary

基本示例 | Basic Example

缺陷 | Drawbacks

worker报错:
image
api服务报错:
image

未解决问题 | Unresolved questions

请问这是什么原因导致的呢?

@xiaohaiqing xiaohaiqing added the question Further information is requested label Mar 13, 2024
@jklj077
Copy link
Contributor

jklj077 commented Mar 13, 2024

For the error raised by the worker, you may need to downgrade fschat<0.2.36 and vllm<0.2.7. Unfortunately, FastChat 0.2.36 adopts a quick but dirty way to realize compatibility with vLLM 0.2.7 (https://github.com/lm-sys/FastChat/blob/b21d0f780ca4472a13714262a0790f2ee1ade659/fastchat/serve/vllm_worker.py#L60). As QwenTokenizer uses custom code, the change in FastChat introduces an unexpected behaviour (seemingly only) for Qwen.

For the error raised by the api server, it is similar to #1062

@jklj077 jklj077 closed this as completed Mar 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants