I am planning to host the Qwen2-VL-7B-Instruct model as a server on an EC2 instance. What system specifications would you recommend for running this model efficiently? I am also exploring whether vLLM is the best approach for deploying it in a production environment. I plan to use the model primarily for OCR tasks and multimodal inference, processing both images and text in real time.
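As a rough sanity check on hardware sizing (a back-of-envelope sketch, not an official recommendation): a model with roughly 7 billion parameters served in BF16 needs about 14 GB of GPU memory for the weights alone, before KV cache, activations, and the vision encoder's overhead. The parameter count and dtype below are assumptions.

```python
# Back-of-envelope GPU memory estimate for a ~7B-parameter model in BF16.
# Assumption: ~7e9 parameters; real serving adds KV cache, activations,
# and vision-encoder overhead on top of the raw weight footprint.
NUM_PARAMS_B = 7      # billions of parameters (approximate)
BYTES_PER_PARAM = 2   # BF16 = 2 bytes per parameter

weights_gb = NUM_PARAMS_B * BYTES_PER_PARAM
print(f"weights alone: ~{weights_gb} GB")  # ~14 GB before KV cache
```

By this estimate, a GPU with 24 GB of memory (for example, the A10G on EC2 g5 instances) is a plausible starting point for a single-GPU deployment, with the remaining ~10 GB available for KV cache and image processing; long contexts or many concurrent image-heavy requests would push toward larger GPUs.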
Also, would you recommend vLLM for deploying Qwen2-VL-7B-Instruct as a server for inference?
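For concreteness, the deployment I have in mind is vLLM's OpenAI-compatible server, roughly along these lines (a sketch assuming a recent vLLM build with Qwen2-VL support; the flag values here are illustrative, not recommendations):

```shell
# Hypothetical launch command for vLLM's OpenAI-compatible server.
# Serves on port 8000 by default; clients can then send chat-completion
# requests containing both image and text content.
vllm serve Qwen/Qwen2-VL-7B-Instruct \
  --dtype bfloat16 \
  --max-model-len 8192
```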