v0.5.0

PaParaZz1 released this 05 Dec 05:04

4f8f82a

Env

add tabmwp env (#667)
polish anytrading env issues (#731)

Algorithm

add PromptPG algorithm (#667)
add Plan Diffuser algorithm (#700) (#749)
add new pipeline implementation of IMPALA algorithm (#713)
add dropout layers to DQN-style algorithms (#712)

Enhancement

add new pipeline agent for SAC/DDPG/A2C/PPO and Hugging Face support (#637) (#730) (#737)
add more unit test cases for model (#728)
add collector logging in new pipeline (#735)

Fix

fix logger middleware problems (#715)
fix PPO parallel bug (#709)
fix typo in optimizer_helper.py (#726)
fix MLP dropout if-condition bug
fix DREX data collection unit test bugs

Style

polish env manager/wrapper comments and API doc (#742)
polish model comments and API doc (#722) (#729) (#734) (#736) (#741)
polish policy comments and API doc (#732)
polish rl_utils comments and API doc (#724)
polish torch_utils comments and API doc (#738)
update README.md and Colab demo (#733)
update metaworld docker image

News

NeurIPS 2023 Spotlight: LightZero: A Unified Benchmark for Monte Carlo Tree Search in General Sequential Decision Scenarios
OpenDILab + Hugging Face DRL Model Zoo link

Full Changelog: v0.4.9...v0.5.0

Contributors: @PaParaZz1 @zjowowen @AltmanD @puyuan1996 @kxzxvbk @Super1ce @nighood @Cloud-Pku @zhangpaipai @ruoyuGao @eltociear
