#pragma unroll
LLM/VLM | Diffusion | CUDA | AI Infra
- Statistics Department of JNU
- Guangzhou, China
- UTC +08:00
- https://github.com/DefTruth
- https://www.zhihu.com/people/qyjdef
Pinned
- lite.ai.toolkit (Public): A lite C++ toolkit of 100+ Awesome AI models, support ORT, MNN, NCNN, TNN and TensorRT.
- vllm-project/vllm (Public): A high-throughput and memory-efficient inference and serving engine for LLMs
- Awesome-LLM-Inference (Public): A curated list of Awesome LLM/VLM Inference Papers with codes, such as FlashAttention, PagedAttention, Parallelism, etc.
- CUDA-Learn-Notes (Public): 150+ Tensor/CUDA Cores Kernels, flash-attn-mma, hgemm with WMMA, MMA and CuTe (98%~100% TFLOPS of cuBLAS/FA2).
- Awesome-Diffusion-Inference (Public): A curated list of Awesome Diffusion Inference Papers with codes, such as Sampling, Caching, Multi-GPUs, etc.