v0.5.0
Env
Algorithm
- add PromptPG algorithm (#667)
- add Plan Diffuser algorithm (#700) (#749)
- add new pipeline implementation of IMPALA algorithm (#713)
- add dropout layers to DQN-style algorithms (#712)
Enhancement
- add new pipeline agent for sac/ddpg/a2c/ppo and Hugging Face support (#637) (#730) (#737)
- add more unit test cases for model (#728)
- add collector logging in new pipeline (#735)
Fix
- fix logger middleware problems (#715)
- fix ppo parallel bug (#709)
- fix typo in optimizer_helper.py (#726)
- fix MLP dropout if-condition bug
- fix drex data collection unit test bugs
Style
- polish env manager/wrapper comments and API doc (#742)
- polish model comments and API doc (#722) (#729) (#734) (#736) (#741)
- polish policy comments and API doc (#732)
- polish rl_utils comments and API doc (#724)
- polish torch_utils comments and API doc (#738)
- update README.md and Colab demo (#733)
- update metaworld docker image
News
- NeurIPS 2023 Spotlight: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
- OpenDILab + Hugging Face DRL Model Zoo
Full Changelog: v0.4.9...v0.5.0
Contributors: @PaParaZz1 @zjowowen @AltmanD @puyuan1996 @kxzxvbk @Super1ce @nighood @Cloud-Pku @zhangpaipai @ruoyuGao @eltociear