Could you provide a guide for converting the model to ONNX? Thanks #259

Open
IeohMingChan opened this issue Nov 7, 2024 · 2 comments

Comments

@IeohMingChan

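Hello. Because the Reranker uses bidirectional attention and has no KV cache mechanism, deploying it with vLLM will not bring a significant improvement. You could try converting it to ONNX.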

Originally posted by @Kaguya-19 in #258 (comment)
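To illustrate the point about bidirectional attention and the KV cache, here is a minimal pure-Python sketch. It is a toy single-head attention with identity Q/K/V projections, not the Reranker's actual implementation; the names `attention` and `prefix_is_reusable` and the token vectors are made up for this example. It checks whether the outputs computed for a prefix stay valid after one more token is appended, which is the property a KV cache depends on.

```python
import math

def attention(xs, causal):
    """Toy single-head self-attention with identity Q/K/V projections.

    xs is a list of token vectors; returns one output vector per token.
    """
    dim = len(xs[0])
    outputs = []
    for i, q in enumerate(xs):
        # Causal: token i attends to tokens 0..i only.
        # Bidirectional: token i attends to every token in the sequence.
        visible = xs[: i + 1] if causal else xs
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in visible]
        peak = max(scores)  # subtract the max for a numerically stable softmax
        weights = [math.exp(s - peak) for s in scores]
        total = sum(weights)
        outputs.append([
            sum(w * v[d] for w, v in zip(weights, visible)) / total
            for d in range(dim)
        ])
    return outputs

def prefix_is_reusable(tokens, causal, tol=1e-12):
    """True if outputs for tokens[:-1] are unchanged once the last token is appended."""
    before = attention(tokens[:-1], causal)
    after = attention(tokens, causal)[:-1]
    return all(
        abs(a - b) <= tol
        for row_a, row_b in zip(before, after)
        for a, b in zip(row_a, row_b)
    )

# Hypothetical 2-dimensional token vectors, for illustration only.
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
print("causal prefix reusable:", prefix_is_reusable(tokens, causal=True))
print("bidirectional prefix reusable:", prefix_is_reusable(tokens, causal=False))
```

With the causal mask the prefix outputs are unchanged, so they can be cached and reused. With bidirectional attention every earlier token also attends to the new one, so its output changes. In a multi-layer model those outputs feed the next layer's keys and values, which is why cached entries would go stale.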

@a20185

a20185 commented Jan 13, 2025

Same request here. The Optimum SDK doesn't seem to support ONNX conversion for this model either o(╥﹏╥)o

@Kaguya-19
Collaborator

You could try converting it as a llama architecture first. I'll look into it next week as well.

3 participants