feat(checkpoint): support universal checkpoint #1246

Triggered via pull request December 23, 2024 08:20
Status Failure
Total duration 33m 19s
Artifacts

e2e_test.yaml

on: pull_request
training_4GPU: 47s
training_8GPU_ISP: 37s
training_8GPU_ISP_CKPT: 37s
training_8GPU_4DP2PP_ZB: 42s
Matrix: training_16GPU_4DP2TP2PP_FSP
Matrix: training_16GPU_4DP2TP2PP_MSP
Matrix: training_16GPU_4DP2TP2PP_MTP
Matrix: training_8GPU_4DP2PP
Matrix: training_8GPU_4DP2TP
Matrix: training_8GPU_4DP2TPSP
Matrix: training_llama2

Annotations

11 errors and 33 warnings
training_16GPU_4DP2TP2PP_FSP (t_cluster)
Process completed with exit code 143.
training_16GPU_4DP2TP2PP_MTP (t_cluster)
Process completed with exit code 143.
training_4GPU
Process completed with exit code 2.
training_8GPU_4DP2PP (t_cluster)
Process completed with exit code 2.
training_8GPU_4DP2PP_ZB
Process completed with exit code 143.
training_8GPU_ISP_CKPT
Process completed with exit code 143.
training_8GPU_ISP
Process completed with exit code 143.
training_8GPU_4DP2TPSP (t_cluster)
Process completed with exit code 143.
training_16GPU_4DP2TP2PP_MSP (t_cluster)
Process completed with exit code 143.
training_llama2 (t_cluster)
Process completed with exit code 2.
training_8GPU_4DP2TP (t_cluster)
Process completed with exit code 143.
training_16GPU_4DP2TP2PP_FSP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_16GPU_4DP2TP2PP_FSP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_16GPU_4DP2TP2PP_FSP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_16GPU_4DP2TP2PP_MTP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_16GPU_4DP2TP2PP_MTP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_16GPU_4DP2TP2PP_MTP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_4GPU
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_4GPU
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_4GPU
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_4DP2PP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_4DP2PP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_4DP2PP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_4DP2PP_ZB
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_4DP2PP_ZB
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_4DP2PP_ZB
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_ISP_CKPT
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_ISP_CKPT
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_ISP_CKPT
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_ISP
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_ISP
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_ISP
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_4DP2TPSP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_4DP2TPSP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_4DP2TPSP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_16GPU_4DP2TP2PP_MSP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_16GPU_4DP2TP2PP_MSP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_16GPU_4DP2TP2PP_MSP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_llama2 (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_llama2 (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_llama2 (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/
training_8GPU_4DP2TP (t_cluster)
This job failure may be caused by using an out of date self-hosted runner. You are currently using runner version 2.320.0. Please update to the latest version 2.321.0
training_8GPU_4DP2TP (t_cluster)
Node.js 16 actions are deprecated. Please update the following actions to use Node.js 20: actions/checkout@v3. For more information see: https://github.blog/changelog/2023-09-22-github-actions-transitioning-from-node-16-to-node-20/.
training_8GPU_4DP2TP (t_cluster)
The Actions runner will no longer support your OS version on November 1, 2024. Please upgrade to a supported version. For information, refer https://github.blog/changelog/2024-08-19-notice-of-upcoming-deprecations-and-breaking-changes-in-github-actions-runners/