[Feature] adapt fused sigmoid gate for MoE model #2739

Open
2 tasks
zhyncs opened this issue Jan 5, 2025 · 2 comments
zhyncs (Member) commented Jan 5, 2025

Checklist

Motivation

ref https://github.com/NVIDIA/TensorRT-LLM/blob/be1788106245496872d18e702978e59b6bfd50e0/cpp/tensorrt_llm/kernels/mixtureOfExperts/moe_kernels.cu#L232

Related resources

No response
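For context, the referenced TensorRT-LLM kernel fuses the MoE routing math (per-expert sigmoid scoring, top-k expert selection, and renormalization of the selected weights) into a single CUDA kernel instead of separate ops. The following is only an unfused Python sketch of that routing math to illustrate what the fused kernel computes; the function name and shapes are illustrative, not taken from TensorRT-LLM or SGLang.

```python
import math

def sigmoid_gate_topk(logits, k):
    """Sketch of sigmoid-gated MoE routing for one token.

    Unlike softmax gating, each expert's score is an independent
    sigmoid of its logit; the top-k scores are then renormalized so
    the selected experts' weights sum to 1.
    """
    # per-expert sigmoid score (independent, no softmax coupling)
    scores = [1.0 / (1.0 + math.exp(-x)) for x in logits]
    # indices of the k highest-scoring experts
    topk = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)[:k]
    # renormalize the selected scores into gate weights summing to 1
    total = sum(scores[i] for i in topk)
    return {i: scores[i] / total for i in topk}

# one token routed over 4 experts, top-2: experts 0 and 3 win
weights = sigmoid_gate_topk([2.0, -1.0, 0.5, 1.5], k=2)
print(weights)
```

A fused implementation would do all three steps in one kernel launch over the batch, avoiding intermediate global-memory round trips between scoring, selection, and normalization.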

zhaochenyang20 (Collaborator) commented:

@NovTi

sitabulaixizawaluduo commented:


LMDeploy should also apply the same optimization; it should significantly improve MoE performance.
