Bump stable-baselines3 from 2.2.1 to 2.3.2 #418

dependabot · 2024-04-29T03:19:13Z

Bumps stable-baselines3 from 2.2.1 to 2.3.2.

Release notes

Sourced from stable-baselines3's releases.

Stable-Baselines3 v2.3.0: New defaults hyperparameters for DDPG, TD3 and DQN

[!WARNING] Because of weights_only=True, this release breaks loading of policies when using PyTorch 1.13. Please upgrade to PyTorch >= 2.0 or upgrade SB3 version (we reverted the change in SB3 2.3.2)
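The warning above can be turned into a quick compatibility check before merging. The sketch below is illustrative only: `loading_may_break` and `_parse` are hypothetical helpers, not part of SB3, and the affected range (SB3 2.3.0 up to but excluding 2.3.2, combined with PyTorch older than 2.0) is an assumption read off the warning text.

```python
import re


def _parse(version: str) -> tuple:
    """Turn '1.13.1+cu117' into (1, 13, 1); local build tags are ignored."""
    numbers = re.findall(r"\d+", version.split("+")[0])
    return tuple(int(n) for n in numbers[:3])


def loading_may_break(sb3_version: str, torch_version: str) -> bool:
    """Hypothetical check: True if policy loading may hit the weights_only issue.

    Assumption: only SB3 2.3.0 and 2.3.1 are affected (2.3.2 reverted the
    change), and only with PyTorch older than 2.0.
    """
    sb3_affected = (2, 3, 0) <= _parse(sb3_version) < (2, 3, 2)
    torch_too_old = _parse(torch_version) < (2, 0, 0)
    return sb3_affected and torch_too_old
```

In a real environment the two arguments would typically come from `stable_baselines3.__version__` and `torch.__version__`.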

SB3 Contrib (more algorithms): https://github.com/Stable-Baselines-Team/stable-baselines3-contrib
RL Zoo3 (training framework): https://github.com/DLR-RM/rl-baselines3-zoo
Stable-Baselines Jax (SBX): https://github.com/araffin/sbx

To upgrade:
pip install stable_baselines3 sb3_contrib --upgrade
or simply (rl zoo depends on SB3 and SB3 contrib):
pip install rl_zoo3 --upgrade
Breaking Changes:

The default hyperparameters of TD3 and DDPG have been changed to be more consistent with SAC
  # SB3 < 2.3.0 default hyperparameters
  # model = TD3("MlpPolicy", env, train_freq=(1, "episode"), gradient_steps=-1, batch_size=100)
  # SB3 >= 2.3.0:
  model = TD3("MlpPolicy", env, train_freq=1, gradient_steps=1, batch_size=256)
[!NOTE] Two inconsistencies remain: the default network architecture for TD3/DDPG is [400, 300] instead of [256, 256] for SAC (for backward compatibility reasons, see the report on the influence of the network size) and the default learning rate is 1e-3 instead of 3e-4 for SAC (for performance reasons, see the W&B report on the influence of the lr)
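If this repository relies on the old TD3/DDPG behaviour, one option is to pin the pre-2.3.0 values explicitly rather than depend on library defaults. This is a minimal sketch: the values are copied from the release notes above, while `LEGACY_TD3_KWARGS` and `legacy_td3_kwargs` are hypothetical names introduced for illustration.

```python
# Pre-2.3.0 defaults for TD3/DDPG, as quoted in the release notes.
LEGACY_TD3_KWARGS = {
    "train_freq": (1, "episode"),
    "gradient_steps": -1,
    "batch_size": 100,
}


def legacy_td3_kwargs(**overrides):
    """Return the old defaults, with any explicit overrides applied on top."""
    return {**LEGACY_TD3_KWARGS, **overrides}


# Usage (requires stable_baselines3 and an environment `env`):
# from stable_baselines3 import TD3
# model = TD3("MlpPolicy", env, **legacy_td3_kwargs())
```

Passing the values explicitly keeps training runs comparable across the upgrade, at the cost of not picking up the new SAC-consistent defaults.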

The default learning_starts parameter of DQN has been changed to be consistent with the other off-policy algorithms
  # SB3 < 2.3.0 default hyperparameters, 50_000 corresponded to the Atari default hyperparameters
  # model = DQN("MlpPolicy", env, learning_starts=50_000)
  # SB3 >= 2.3.0:
  model = DQN("MlpPolicy", env, learning_starts=100)
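Code that called `DQN(...)` without `learning_starts` will now begin learning after 100 steps instead of 50_000. A small sketch of making the choice explicit follows; the constants come from the release notes, and `dqn_learning_starts` is a hypothetical helper, not an SB3 function.

```python
LEGACY_DQN_LEARNING_STARTS = 50_000  # SB3 < 2.3.0 default (Atari-oriented)
NEW_DQN_LEARNING_STARTS = 100        # SB3 >= 2.3.0 default


def dqn_learning_starts(use_legacy_default: bool) -> int:
    """Pick the warm-up length explicitly instead of relying on the default."""
    if use_legacy_default:
        return LEGACY_DQN_LEARNING_STARTS
    return NEW_DQN_LEARNING_STARTS


# Usage (requires stable_baselines3 and an environment `env`):
# from stable_baselines3 import DQN
# model = DQN("MlpPolicy", env, learning_starts=dqn_learning_starts(True))
```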
For safety, torch.load() is now called with weights_only=True when loading torch tensors. Policy load() still uses weights_only=False, as gymnasium imports are required for it to work

When using huggingface_sb3, you will now need to set TRUST_REMOTE_CODE=True when downloading models from the hub, as pickle.load is not safe.
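For code paths that download models through huggingface_sb3, the opt-in might look like the sketch below. It assumes `TRUST_REMOTE_CODE` is read as an environment variable and is set before the download call; `allow_hub_pickles` is a hypothetical helper, and the repository and file names in the comment are placeholders.

```python
import os


def allow_hub_pickles() -> None:
    """Opt in to loading pickled models from the hub (assumed env variable).

    Only do this for repositories you trust: unpickling can run arbitrary code.
    """
    os.environ["TRUST_REMOTE_CODE"] = "True"


allow_hub_pickles()

# Usage (requires huggingface_sb3; names below are placeholders):
# from huggingface_sb3 import load_from_hub
# checkpoint = load_from_hub(repo_id="<org>/<model>", filename="<model>.zip")
```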

... (truncated)

Commits

285e01f Hotfix: revert loading with weights_only=True (#1913)
35eccaf Fix tensorboad video slow numpy->torch conversion (#1910)
e931750 Adding ER-MRL to community project (#1904)
4af4a32 Update RL Tips and Tricks section
9a74938 Cast learning_rate to float lambda for pickle safety when doing model.load (#...
5623d98 Fixed broken link in ppo.rst (#1884)
40ba504 Fix typo in changelog (#1882)
429be93 Release v2.3.0 (#1879)
071226d Log success rate for on policy algorithms (#1870)
8b3723c Update ruff and documentation for hf sb3 (#1866)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:

@dependabot rebase will rebase this PR
@dependabot recreate will recreate this PR, overwriting any edits that have been made to it
@dependabot merge will merge this PR after your CI passes on it
@dependabot squash and merge will squash and merge this PR after your CI passes on it
@dependabot cancel merge will cancel a previously requested merge and block automerging
@dependabot reopen will reopen this PR if it is closed
@dependabot close will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
@dependabot show <dependency name> ignore conditions will show all of the ignore conditions of the specified dependency
@dependabot ignore this major version will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this minor version will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
@dependabot ignore this dependency will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

Bumps [stable-baselines3](https://github.com/DLR-RM/stable-baselines3) from 2.2.1 to 2.3.2.
- [Release notes](https://github.com/DLR-RM/stable-baselines3/releases)
- [Commits](DLR-RM/stable-baselines3@v2.2.1...v2.3.2)

---
updated-dependencies:
- dependency-name: stable-baselines3
  dependency-type: direct:production
  update-type: version-update:semver-minor
...

Signed-off-by: dependabot[bot] <[email protected]>

dependabot bot added the dependencies (Pull requests that update a dependency file) and python (Pull requests that update Python code) labels Apr 29, 2024

dependabot bot mentioned this pull request Apr 29, 2024

Bump stable-baselines3 from 2.2.1 to 2.3.0 #400

Closed
