[Bug] Recent change to params "block_size_x/y, unroll" in dlight/gpu/matmul.py significantly decreases q4f16_1 prefill speed on Android 8gen3 device #1078
Triggered via issue: October 28, 2024 18:17
Status: Skipped
Total duration: 4s
Artifacts: –