Add the option to select the openclip model #284

Closed
wants to merge 3 commits into from

@barinov274 barinov274 commented Jun 11, 2023

There are quite a few openclip models, but I specifically need laion/CLIP-ViT-L-14-DataComp.XL-s13B-b90K.
I looked at the load_model function: it parses the --clip_model argument and, if the string starts with open_clip:, strips that prefix and calls load_open_clip, which actually loads the openclip model.

        clip_model = clip_model[len("open_clip:") :]
        model, preprocess = load_open_clip(clip_model, use_jit, device, clip_cache_path)

That's all fine, but I would then expect some parsing of whatever follows open_clip:, so that after, for example, ViT-L-14 I could specify which checkpoint I want to download.
Instead, I saw this:

    pretrained = dict(open_clip.list_pretrained())
    checkpoint = pretrained[clip_model]
    model, _, preprocess = open_clip.create_model_and_transforms(
        clip_model, pretrained=checkpoint, device=device, jit=use_jit, cache_dir=clip_cache_path
    )

So an arbitrary checkpoint that the user knows nothing about gets downloaded to their computer, and they have no way to choose the one they actually want.
And you can see the wide variety of models that the open_clip library offers:

 ('RN50', 'yfcc15m'),
 ('RN50', 'cc12m'),
 ('RN50-quickgelu', 'openai'),
 ('RN50-quickgelu', 'yfcc15m'),
 ('RN50-quickgelu', 'cc12m'),
 ('RN101', 'openai'),
 ('RN101', 'yfcc15m'),
 ('RN101-quickgelu', 'openai'),
 ('RN101-quickgelu', 'yfcc15m'),
 ('RN50x4', 'openai'),
 ('RN50x16', 'openai'),
 ('RN50x64', 'openai'),
 ('ViT-B-32', 'openai'),
 ('ViT-B-32', 'laion400m_e31'),
 ('ViT-B-32', 'laion400m_e32'),
 ('ViT-B-32', 'laion2b_e16'),
 ('ViT-B-32', 'laion2b_s34b_b79k'),
 ('ViT-B-32', 'datacomp_m_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_clip_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_laion_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_image_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_text_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_basic_s128m_b4k'),
 ('ViT-B-32', 'commonpool_m_s128m_b4k'),
 ('ViT-B-32', 'datacomp_s_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_clip_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_laion_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_image_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_text_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_basic_s13m_b4k'),
 ('ViT-B-32', 'commonpool_s_s13m_b4k'),
 ('ViT-B-32-quickgelu', 'openai'),
 ('ViT-B-32-quickgelu', 'laion400m_e31'),
 ('ViT-B-32-quickgelu', 'laion400m_e32'),
 ('ViT-B-16', 'openai'),
 ('ViT-B-16', 'laion400m_e31'),
 ('ViT-B-16', 'laion400m_e32'),
 ('ViT-B-16', 'laion2b_s34b_b88k'),
 ('ViT-B-16', 'datacomp_l_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_clip_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_laion_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_image_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_text_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_basic_s1b_b8k'),
 ('ViT-B-16', 'commonpool_l_s1b_b8k'),
 ('ViT-B-16-plus-240', 'laion400m_e31'),
 ('ViT-B-16-plus-240', 'laion400m_e32'),
 ('ViT-L-14', 'openai'),
 ('ViT-L-14', 'laion400m_e31'),
 ('ViT-L-14', 'laion400m_e32'),
 ('ViT-L-14', 'laion2b_s32b_b82k'),
 ('ViT-L-14', 'datacomp_xl_s13b_b90k'),
 ('ViT-L-14', 'commonpool_xl_clip_s13b_b90k'),
 ('ViT-L-14', 'commonpool_xl_laion_s13b_b90k'),
 ('ViT-L-14', 'commonpool_xl_s13b_b90k'),
 ('ViT-L-14-336', 'openai'),
 ('ViT-H-14', 'laion2b_s32b_b79k'),
 ('ViT-g-14', 'laion2b_s12b_b42k'),
 ('ViT-g-14', 'laion2b_s34b_b88k'),
 ('ViT-bigG-14', 'laion2b_s39b_b160k'),
 ('roberta-ViT-B-32', 'laion2b_s12b_b32k'),
 ('xlm-roberta-base-ViT-B-32', 'laion5b_s13b_b90k'),
 ('xlm-roberta-large-ViT-H-14', 'frozen_laion5b_s13b_b90k'),
 ('convnext_base', 'laion400m_s13b_b51k'),
 ('convnext_base_w', 'laion2b_s13b_b82k'),
 ('convnext_base_w', 'laion2b_s13b_b82k_augreg'),
 ('convnext_base_w', 'laion_aesthetic_s13b_b82k'),
 ('convnext_base_w_320', 'laion_aesthetic_s13b_b82k'),
 ('convnext_base_w_320', 'laion_aesthetic_s13b_b82k_augreg'),
 ('convnext_large_d', 'laion2b_s26b_b102k_augreg'),
 ('convnext_large_d_320', 'laion2b_s29b_b131k_ft'),
 ('convnext_large_d_320', 'laion2b_s29b_b131k_ft_soup'),
 ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg'),
 ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg_rewind'),
 ('convnext_xxlarge', 'laion2b_s34b_b82k_augreg_soup'),
 ('coca_ViT-B-32', 'laion2b_s13b_b90k'),
 ('coca_ViT-B-32', 'mscoco_finetuned_laion2b_s13b_b90k'),
 ('coca_ViT-L-14', 'laion2b_s13b_b90k'),
 ('coca_ViT-L-14', 'mscoco_finetuned_laion2b_s13b_b90k'),
 ('EVA01-g-14', 'laion400m_s11b_b41k'),
 ('EVA01-g-14-plus', 'merged2b_s11b_b114k'),
 ('EVA02-B-16', 'merged2b_s8b_b131k'),
 ('EVA02-L-14', 'merged2b_s4b_b131k'),
 ('EVA02-L-14-336', 'merged2b_s6b_b61k'),
 ('EVA02-E-14', 'laion2b_s4b_b115k'),
 ('EVA02-E-14-plus', 'laion2b_s9b_b144k')]
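
To make the problem concrete, here is a minimal sketch of what dict() does to a list of (model, checkpoint) pairs like the one above. The pairs are a hand-copied subset of the listing, not a call into open_clip, and the actual order depends on the installed open_clip version. Because dict() keeps only the last value seen for each key, every checkpoint but the last one listed for a model becomes unreachable.

```python
from collections import defaultdict

# A few (model, checkpoint) pairs copied by hand from the listing above.
# This is an illustrative subset, not the output of open_clip itself.
pairs = [
    ("ViT-L-14", "openai"),
    ("ViT-L-14", "laion400m_e31"),
    ("ViT-L-14", "laion2b_s32b_b82k"),
    ("ViT-L-14", "datacomp_xl_s13b_b90k"),
    ("ViT-L-14", "commonpool_xl_s13b_b90k"),
    ("ViT-H-14", "laion2b_s32b_b79k"),
]

# dict() over pairs overwrites earlier values with later ones for the same key,
# so only the last checkpoint listed for each model survives.
collapsed = dict(pairs)
print(collapsed["ViT-L-14"])  # commonpool_xl_s13b_b90k

# Grouping the pairs instead keeps every checkpoint selectable.
by_model = defaultdict(list)
for model_name, checkpoint in pairs:
    by_model[model_name].append(checkpoint)

print("datacomp_xl_s13b_b90k" in by_model["ViT-L-14"])  # True
```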

I propose this commit, which lets you choose a model by typing
clip-retrieval inference --clip_model "open_clip:ViT-L-14 | datacomp_xl_s13b_b90k" ...
Even if you don't set a checkpoint after "|", a line is printed telling you which model and checkpoint are being loaded:
print(f"Loading OpenClip model{model} with {checkpoint} checkpoint")
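
For illustration, the parsing could look roughly like the sketch below. This is not the code from the commit: the function name parse_open_clip_spec, the PREFIX constant and the sep parameter are hypothetical, and returning None for a missing checkpoint is an assumption about how the caller would fall back to a default.

```python
from typing import Optional, Tuple

PREFIX = "open_clip:"  # prefix that load_model checks for, per the description above


def parse_open_clip_spec(clip_model: str, sep: str = "|") -> Tuple[str, Optional[str]]:
    """Split 'open_clip:<model> | <checkpoint>' into (model, checkpoint).

    Hypothetical helper: the checkpoint is None when no separator is given,
    leaving the choice of a default to the caller.
    """
    if not clip_model.startswith(PREFIX):
        raise ValueError(f"expected a value starting with {PREFIX!r}, got {clip_model!r}")
    spec = clip_model[len(PREFIX):]
    # partition() splits on the first separator only and tolerates its absence.
    model_name, found, checkpoint = spec.partition(sep)
    model_name = model_name.strip()
    checkpoint = checkpoint.strip() if found else ""
    return model_name, (checkpoint or None)


model_name, checkpoint = parse_open_clip_spec("open_clip:ViT-L-14 | datacomp_xl_s13b_b90k")
print(f"Loading OpenClip model {model_name} with {checkpoint} checkpoint")
```

Stripping whitespace around both parts means "ViT-L-14|datacomp_xl_s13b_b90k" and "ViT-L-14 | datacomp_xl_s13b_b90k" are treated the same.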

ability to select specific version of openclip

add option to select the opencli model
Owner

@rom1504 rom1504 left a comment

Please update the docs (in the README) accordingly.

Owner

rom1504 commented Jun 14, 2023

Can you fix the lint?

And maybe add a test case here: https://github.com/barinov274/clip-retrieval/blob/patch-1/tests/test_clip_inference/test_mapper.py#L9

@raunakdoesdev

pls merge

Owner

rom1504 commented Jan 6, 2024

This seems important, but there is no test.

Also, I am not convinced by the " | " syntax; ";" may be better.

Spaces have bad properties in shell arguments.
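
A small sketch of the whitespace concern, using Python's shlex.split to approximate POSIX word splitting. It only models whitespace and quoting; a real shell would additionally treat an unquoted | as a pipe. The command strings are the example from this PR, with and without the surrounding quotes.

```python
import shlex

# The example command from this PR, without and with quotes around the value.
unquoted = "clip-retrieval inference --clip_model open_clip:ViT-L-14 | datacomp_xl_s13b_b90k"
quoted = 'clip-retrieval inference --clip_model "open_clip:ViT-L-14 | datacomp_xl_s13b_b90k"'

# Without quotes, the spaces split the value into three separate words,
# so --clip_model would only receive "open_clip:ViT-L-14".
unquoted_tokens = shlex.split(unquoted)
print(unquoted_tokens)

# With quotes, the whole specification arrives as a single argument.
quoted_tokens = shlex.split(quoted)
print(quoted_tokens)
```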

Owner

rom1504 commented Jan 6, 2024

thanks, I merged this into #314

@rom1504 rom1504 closed this Jan 6, 2024