1-Click installers for Windows, RunPod, Massed Compute and ultra advanced CogVLM 2 Batch Processing Gradio APP with 4-bit quantization #212

FurkanGozukara · 2025-01-17T15:06:59Z

Great work, THUDM team. Thank you for this amazing model.

See the screenshots below for how to use it.

Currently the app works very well and runs very fast with 4-bit quantization.
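As a rough illustration of why 4-bit quantization helps, the memory needed for the weights alone scales with the number of bits per parameter. This is only a back-of-the-envelope sketch: the 19 billion parameter count is an assumed figure for illustration (it is not stated in this post), `weight_memory_gib` is a hypothetical helper, and the estimate ignores activations, quantization metadata, and any layers kept in higher precision.

```python
def weight_memory_gib(n_params: float, bits_per_param: int) -> float:
    """Approximate memory for the model weights only, in GiB."""
    bytes_total = n_params * bits_per_param / 8
    return bytes_total / 1024**3


# Assumed parameter count, for illustration only.
N_PARAMS = 19e9

for bits in (16, 8, 4):
    print(f"{bits:>2}-bit weights: ~{weight_memory_gib(N_PARAMS, bits):.1f} GiB")
```

Real VRAM usage will be higher than this weights-only figure, but the ratio holds: 4-bit weights take about a quarter of the space of 16-bit weights.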

I am looking into lowering VRAM usage even further, for example by adding CPU offloading and other techniques, if possible.

Previously we were lacking Triton, but it now works perfectly.

My installer installs everything into a completely isolated, clean Python 3.10 venv.

You can see the entire app and installer source code.

If you get a Triton error, make sure to delete your Triton cache after installing the app. On my machine, for example, it is at:

C:\Users\Furkan\.triton
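Deleting the cache folder can also be scripted. A minimal sketch, assuming the cache lives in a `.triton` folder directly under the user's home directory, as in the path above. `clear_triton_cache` is a hypothetical helper name and is not part of the installer.

```python
import shutil
from pathlib import Path
from typing import Optional


def clear_triton_cache(home: Optional[Path] = None) -> bool:
    """Delete the .triton folder under `home`; return True if it was removed."""
    # Assumption: the cache sits in <home>/.triton, matching the path above.
    base = Path(home) if home is not None else Path.home()
    cache_dir = base / ".triton"
    if not cache_dir.is_dir():
        return False  # nothing to delete
    shutil.rmtree(cache_dir)
    return True
```

Calling `clear_triton_cache()` with no argument targets the current user's home directory. Triton should rebuild its cache the next time the app runs.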
