[Feature] Support minicpmv v2.6 #2785
Conversation
Questions
@mickqian Looks nice! Is there anything left to do?
It comes from an assertion that checks the length of `input_ids`. The token limit (from my understanding) is calculated from the GPU memory:

```python
if len(req.origin_input_ids) >= self.max_req_input_len:
    logger.error(
        "Multimodal prompt is too long after expanding multimodal tokens. "
        f"After expanding {len(req.origin_input_ids_unpadded)=} => {len(req.origin_input_ids)} >= {self.max_req_input_len}. "
    )
```
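A rough, hypothetical illustration of why this assertion can fire (not the actual scheduler code; the token counts and the 2048 limit below are made-up numbers): each image placeholder is expanded into many per-patch feature tokens, so the prompt can grow well beyond its original text length.

```python
def expanded_length(num_text_tokens: int, num_images: int, tokens_per_image: int) -> int:
    """Prompt length after replacing each image placeholder with its feature tokens."""
    return num_text_tokens + num_images * tokens_per_image

# Made-up numbers: a 2048-token request limit is easily exceeded by a few
# high-resolution images once they are expanded into feature tokens.
max_req_input_len = 2048
length = expanded_length(num_text_tokens=300, num_images=4, tokens_per_image=640)
if length >= max_req_input_len:
    print(f"Multimodal prompt too long after expansion: {length} >= {max_req_input_len}")
```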
@hnyls2002 Could you help review? Thanks!
@mickqian Thanks. We will update the API key as soon as possible. Before that, could you fix the conflicts?
LGTM cc @merrymercy
- Can you add some unit tests like this one to compare the logits against the HF implementation? https://github.com/sgl-project/sglang/pull/2365/files#diff-6e52783df34170e0b3d9aadbd7c338d9c16f0303c18846afc1d298c32d4a4eb2R1 (a rough sketch of the comparison step follows below)
- Can you help update the docs here for VLM?
## How to Support a New Model
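This is not the project's test harness, just a minimal sketch of the comparison step such a unit test would perform. How the two tensors are produced (an HF forward pass versus the SGLang runner, as in the linked test) is left out, and the tolerances are assumptions.

```python
import torch

def assert_logits_close(hf_logits: torch.Tensor, srt_logits: torch.Tensor,
                        rtol: float = 2e-2, atol: float = 2e-2) -> None:
    """Compare prefill logits from the HF reference and the SGLang implementation.

    Both tensors are expected to have shape (seq_len, vocab_size) for the same
    prompt and image. fp16/bf16 kernels rarely match bit-for-bit, so a loose
    tolerance is used and the worst-case difference is reported on failure.
    """
    assert hf_logits.shape == srt_logits.shape, (hf_logits.shape, srt_logits.shape)
    ref, got = hf_logits.float(), srt_logits.float()
    diff = (ref - got).abs()
    assert torch.allclose(ref, got, rtol=rtol, atol=atol), (
        f"logits mismatch: max abs diff {diff.max().item():.4f}, "
        f"mean abs diff {diff.mean().item():.4f}"
    )
```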
```python
from vllm.distributed import divide, get_tensor_model_parallel_world_size
from vllm.model_executor.layers.resampler import get_2d_sincos_pos_embed
from vllm.model_executor.layers.sampler import SamplerOutput, get_sampler
from vllm.model_executor.models.module_mapping import MultiModelKeys
from vllm.model_executor.sampling_metadata import SamplingMetadata
```
Do not import anything from vllm:
- `SamplingMetadata` and `SamplerOutput` are not used.
- Use `sglang.srt.distributed` instead of `vllm.distributed`.
- Copy over small utility functions (a rough sketch of the substitution follows below).
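A sketch of the requested substitution; the exact sglang module paths are assumptions and should be checked against the current package layout.

```python
# Instead of:
#   from vllm.distributed import divide, get_tensor_model_parallel_world_size
# import the sglang-native parallel state (assumed module path):
from sglang.srt.distributed import get_tensor_model_parallel_world_size

# Small helpers such as `divide` (or `get_2d_sincos_pos_embed`) are copied into
# the model file rather than imported from vllm:
def divide(numerator: int, denominator: int) -> int:
    """Integer division that asserts exact divisibility (copied-over utility)."""
    assert numerator % denominator == 0, f"{numerator} is not divisible by {denominator}"
    return numerator // denominator
```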
@mickqian Also, please remove the Chinese comments in the PR. Thanks so much. Could we try not to import any vllm dependencies and instead rewrite them ourselves?
Under the model files, we prefer not to import anything from vllm. We will remove all such imports later.
Trying to address these in #2977
Motivation
Addressing #2461
Modifications
- Support the `MiniCPMV` model and its corresponding processor `MiniCPMVImageProcessor` (see the usage sketch below).
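A hypothetical client-side usage sketch once the PR is merged, assuming an sglang server with the OpenAI-compatible API is already running on localhost:30000 and serving openbmb/MiniCPM-V-2_6; the port, model path, and image URL are placeholders, not taken from this PR.

```python
import openai

# Assumes a running sglang server with the OpenAI-compatible API (placeholder endpoint).
client = openai.Client(base_url="http://127.0.0.1:30000/v1", api_key="EMPTY")
response = client.chat.completions.create(
    model="default",
    messages=[{
        "role": "user",
        "content": [
            {"type": "image_url", "image_url": {"url": "https://example.com/cat.png"}},
            {"type": "text", "text": "What is in this image?"},
        ],
    }],
    max_tokens=64,
)
print(response.choices[0].message.content)
```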
Checklist