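Hello. Because the Reranker uses bidirectional attention and has no KV cache mechanism, deploying it with vLLM will not bring a significant speedup. You could try converting it to ONNX instead.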
Originally posted by @Kaguya-19 in #258 (comment)
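To make the point about bidirectional attention concrete, here is a toy sketch (not the Reranker's actual code). It assumes a single attention head over scalar tokens with q = k = v = x, and the helper name `attention` is hypothetical. With a causal mask, the outputs for a prefix stay the same when a token is appended, so they can be reused; with bidirectional attention they change. In a multi-layer model these outputs feed the keys and values of the next layer, which is why cached state cannot simply be reused.

```python
import math

def attention(xs, causal):
    """Toy single-head self-attention over scalar tokens (q = k = v = x)."""
    n = len(xs)
    out = []
    for i in range(n):
        # Causal: token i attends to positions 0..i only.
        # Bidirectional: token i attends to every position.
        visible = range(i + 1) if causal else range(n)
        scores = [xs[i] * xs[j] for j in visible]
        m = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - m) for s in scores]
        total = sum(weights)
        out.append(sum(w * xs[j] for w, j in zip(weights, visible)) / total)
    return out

prefix = [0.5, -1.0, 2.0]   # arbitrary illustrative values
extended = prefix + [1.5]   # same sequence with one token appended

# Compare the outputs at the prefix positions before and after appending.
causal_before = attention(prefix, causal=True)
causal_after = attention(extended, causal=True)[:len(prefix)]
bidir_before = attention(prefix, causal=False)
bidir_after = attention(extended, causal=False)[:len(prefix)]

causal_stable = all(math.isclose(a, b) for a, b in zip(causal_before, causal_after))
bidir_stable = all(math.isclose(a, b) for a, b in zip(bidir_before, bidir_after))

print(causal_stable, bidir_stable)  # prints: True False
```

A reranker also scores each query-document pair in one forward pass rather than generating tokens step by step, so there is no decoding loop for a cache to accelerate in the first place.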
Same request here. The Optimum SDK does not seem to support ONNX conversion for this model either o(╥﹏╥)o
You could first try converting it as a llama-architecture model. I will also look into this next week.