fix(hybrid optim): fp32_grad not scaled when use offload_cpu #1261


Annotations

2 warnings

training_16GPU_4DP2TP2PP_FSP (t_cluster)

succeeded Jan 1, 2025 in 1m 31s