v1.1.0
ChainerMN 1.1.0 release notes
ChainerMN is a multi-node extension of the deep learning framework Chainer that adds scalability to over 1000 GPUs. The 1.1.0 release is a minor update that adds several enhancements and bug fixes to 1.0 and supports the latest Chainer release.
New experimental features include multi-node checkpointing and resuming. The release also brings several enhancements to dataset distribution and support for dynamically changing networks. It adds support for the latest Chainer 3.2.0 and drops support for older Chainer versions such as the 1.x and 2.x series. Also, the pure_nccl communicator is now generally available and is the recommended communicator.
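As a minimal sketch of switching to the pure_nccl communicator (assuming an MPI launcher such as mpiexec, NCCL2, and CuPy are available; device selection follows the usual ChainerMN convention):

```python
import chainer
import chainermn

# Create the pure_nccl communicator, generally available as of 1.1.0.
# It requires NCCL2 and runs GPU-to-GPU collectives through NCCL.
comm = chainermn.create_communicator('pure_nccl')

# Bind each process to the GPU matching its intra-node rank.
device = comm.intra_rank
chainer.cuda.get_device_from_id(device).use()
```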
bugfix
enhancement
- Support a wider range of dynamically initialized models for MultiNodeOptimizer (#148)
- Remove outdated cudnn variable to make compatible with CuPy v4 (#147, thanks @tkerola!)
- Avoid sending SubDataset and use broadcast for datasets (#140; see the sketch after this list)
- Support tuple data communication (#139)
- Chainer v3 support (#123)
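The dataset-distribution change (#140) affects the usual `chainermn.scatter_dataset` path, and #148 widens what `create_multi_node_optimizer` can wrap. The following is a hedged sketch of a training setup that touches both entry points; the model, optimizer, and dataset choices are placeholders, not part of this release:

```python
import chainer
import chainermn

# CPU-only communicator used here purely for illustration.
comm = chainermn.create_communicator('naive')

# Load the dataset on rank 0 only; scatter_dataset broadcasts it and
# returns each process's shard (the #140 change avoids sending a
# SubDataset to every worker).
if comm.rank == 0:
    train, _ = chainer.datasets.get_mnist()
else:
    train = None
train = chainermn.scatter_dataset(train, comm)

# Wrap any Chainer optimizer; #148 widens support for models whose
# links are initialized lazily (e.g. Linear with input size None).
model = chainer.links.Classifier(chainer.links.Linear(None, 10))
optimizer = chainermn.create_multi_node_optimizer(
    chainer.optimizers.Adam(), comm)
optimizer.setup(model)
```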
feature
- pure_nccl communicator is now generally available (#165)
- Add simple and distributed checkpointing and automatic recovery (#144; see the sketch after this list)
- Support all-to-all (#135)
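The new checkpointing support (#144) is experimental. A plausible way to wire it into an existing training script is sketched below, using the `create_multi_node_checkpointer` entry point; `comm`, `optimizer`, and `trainer` are assumed to be set up as in a standard ChainerMN training loop (for example, as in the sketch after the enhancement list):

```python
import chainermn

# Coordinated checkpointer shared by all processes in comm.
checkpointer = chainermn.create_multi_node_checkpointer(
    name='example_job', comm=comm)

# Resume from the latest distributed snapshot if one exists.
checkpointer.maybe_load(trainer, optimizer)

# Periodically save a snapshot on every process.
trainer.extend(checkpointer, trigger=(1000, 'iteration'))
trainer.run()
```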
document
- Update supported Chainer version in the document (#162)
installation
- Update docs and add cupy as requirement (#171)
example
test
- Fix a bug in point-to-point communication with GPUs (#174)
- Pass unit tests with more than 3 processes (#172)
- Refactor test directory structure to align Chainer's test dir (#169)
- Move from nose to pytest (#167)
- Refactor tests directory (#155)
- Reduce the number of procs of MPI test for robust CI (#136)
- Add Chainer v3 Test to Travis CI (#141)