Actions: microsoft/DeepSpeed

nv-lightning-v100

5,010 workflow runs

Use ds-specific module id to avoid conflicts
nv-lightning-v100 #13912: Pull request #6847 synchronize by loadams
January 6, 2025 16:12 3m 57s olruwase/pr_6772
nv-lightning-v100
nv-lightning-v100 #13911: Scheduled
January 6, 2025 00:22 3m 57s master
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-lightning-v100 #13910: Pull request #6909 synchronize by hj-wei
January 5, 2025 10:52 6m 59s hj-wei:dev_hjwei
nv-lightning-v100
nv-lightning-v100 #13909: Scheduled
January 5, 2025 00:23 3m 56s master
nv-lightning-v100
nv-lightning-v100 #13908: Merge group checks requested
January 4, 2025 05:58 6m 54s
Fix: forbid repeated deepspeed.initialize on training objects
nv-lightning-v100 #13907: Pull request #6874 synchronize by loadams
January 4, 2025 04:39 Action required traincheck-team:fix-6848-forbid-repeated-init
nv-lightning-v100
nv-lightning-v100 #13906: Scheduled
January 4, 2025 00:20 4m 2s master
Use ds-specific module id to avoid conflicts
nv-lightning-v100 #13905: Pull request #6847 synchronize by loadams
January 3, 2025 22:04 4m 25s olruwase/pr_6772
Add the missing view operations from sequence parallel(async).
nv-lightning-v100 #13904: Pull request #6750 synchronize by loadams
January 3, 2025 19:32 19m 28s inkcherry:ds_overlap_fix
Add fp8_gemm fallback for non-triton systems
nv-lightning-v100 #13902: Pull request #6916 synchronize by loadams
January 3, 2025 16:54 1h 47m 30s oelayan7:fp8_gemm_no_triton
[BUG FIX]:fix get torch.version.cuda error when cuda is None in rocm
nv-lightning-v100 #13901: Pull request #6909 synchronize by loadams
January 3, 2025 16:28 1h 1m 15s hj-wei:dev_hjwei
Fix checkpointable_layers Logic
nv-lightning-v100 #13900: Pull request #6881 synchronize by loadams
January 3, 2025 16:28 16m 59s Quentin-Anthony:qanthony/fix-act-recomp
nv-lightning-v100
nv-lightning-v100 #13899: Merge group checks requested
January 3, 2025 15:38 5m 35s
Support pure meta model lm_head tp
nv-lightning-v100 #13898: Pull request #6812 synchronize by delock
January 3, 2025 02:56 Action required Yejing-Lai:lyj/lm_head_replace
nv-lightning-v100
nv-lightning-v100 #13897: Scheduled
January 3, 2025 00:21 43m 47s master
Cleanup ops/transformer/inference tests
nv-lightning-v100 #13895: Pull request #6830 synchronize by loadams
January 2, 2025 18:47 2h 3m 36s loadams/transformers-inference
Autotp training
nv-lightning-v100 #13893: Pull request #6922 synchronize by inkcherry
January 2, 2025 03:54 6m 34s inkcherry:autotp_training
nv-lightning-v100
nv-lightning-v100 #13892: Scheduled
January 2, 2025 00:20 5m 31s master
nv-lightning-v100
nv-lightning-v100 #13891: Scheduled
January 1, 2025 00:23 6m 51s master
Add fp8_gemm fallback for non-triton systems
nv-lightning-v100 #13890: Pull request #6916 synchronize by oelayan7
December 31, 2024 12:01 3m 3s oelayan7:fp8_gemm_no_triton
[inf] Add config var to enable keeping module on host
nv-lightning-v100 #13889: Pull request #6846 synchronize by oelayan7
December 31, 2024 07:32 6m 32s oelayan7:keep_module_on_host