[RLlib; docs] Docs do-over (new API stack): Rewrite/enhance "getting started" rst page. #49950
base: master
Conversation
LGTM. Some nits here and there. Great introduction for users into RLlib.
In this tutorial, you learn how to design, customize, and run an end-to-end RLlib learning experiment
from scratch. This includes picking and configuring an :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`,
running a couple of training iterations, saving the state of your
:py:class:`~ray.rllib.algorithms.algorithm.Algorithm` from time to time, running a separate
Awesome! This is what most people are looking for.
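A compact sketch of the flow that paragraph describes, assuming PPO on CartPole as used elsewhere on this page (not the page's exact code):

```python
from ray.rllib.algorithms.ppo import PPOConfig

# Pick and configure an Algorithm ...
config = PPOConfig().environment("CartPole-v1")
ppo = config.build_algo()

# ... run a couple of training iterations ...
for _ in range(2):
    ppo.train()

# ... save the Algorithm's state, then shut down.
checkpoint_dir = ppo.save_to_path()
ppo.stop()
```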
Python API
~~~~~~~~~~

RLlib's Python API provides all the flexibility required for applying the library to any |
Do we have any other API than the Python one?
Nope :D We got rid of the CLI, b/c of the maintenance burden, its stark limitations, and it being more or less a duplicate of a subset of what the python API could do.
Well, we are working on the external access protocol for clients to connect to and communicate with RLlib, but that's heavily wip.
doc/source/rllib/getting-started.rst
Outdated
To scale your setup and define how many EnvRunner actors you want to leverage,
Shall we put all class names into ``?
Also, we might want to add that these `EnvRunner`s are used to roll out the policy and collect samples?
done
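For reference, a minimal sketch of that scaling knob (the value 2 is just an example):

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment("CartPole-v1")
    # Each EnvRunner is a Ray actor that rolls out the policy in its own
    # env copy and collects training samples, as noted above.
    .env_runners(num_env_runners=2)
)
```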
.. testcode::

    # Build the Algorithm (PPO).
    ppo = config.build_algo()
Does `build` still work?
Yup, but you get a warning.
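A quick sketch of the two spellings discussed here:

```python
from ray.rllib.algorithms.ppo import PPOConfig

config = PPOConfig().environment("CartPole-v1")

# New API stack name:
ppo = config.build_algo()

# Old name; still works, but logs a deprecation warning per the reply above:
# ppo = config.build()
```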
from pprint import pprint

for _ in range(5):
    pprint(ppo.train())
Nice!
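If printing the entire result dict is too noisy, here is a sketch of pulling out a single metric; the key layout ("env_runners" / "episode_return_mean") is an assumption about the new-stack results tree:

```python
for _ in range(5):
    result = ppo.train()
    # Assumed new-stack layout: episode stats live under "env_runners".
    print(result["env_runners"]["episode_return_mean"])
```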
# Define your custom env class by subclassing gymnasium.Env:

class ParrotEnv(gym.Env):
    """Environment in which the agent learns to repeat the seen observations.
Haha! Awesome!
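The excerpt cuts off after the docstring; here is a self-contained sketch of what such an env could look like (a hypothetical reconstruction, not the PR's exact code):

```python
import gymnasium as gym
import numpy as np


class ParrotEnv(gym.Env):
    """Env in which the agent must repeat the observation it just saw.

    The closer the action is to the previous observation, the higher
    the reward.
    """

    def __init__(self, config=None):
        config = config or {}
        # Hypothetical `env_config` key to customize the space:
        self.observation_space = config.get(
            "obs_space", gym.spaces.Box(-1.0, 1.0, (1,), np.float32)
        )
        self.action_space = self.observation_space
        self._cur_obs = None
        self._num_steps = 0

    def reset(self, *, seed=None, options=None):
        super().reset(seed=seed)
        self._num_steps = 0
        self._cur_obs = self.observation_space.sample()
        return self._cur_obs, {}

    def step(self, action):
        self._num_steps += 1
        # Reward: negative L1 distance to the observation to repeat.
        reward = -float(np.sum(np.abs(action - self._cur_obs)))
        terminated = False
        truncated = self._num_steps >= 10
        self._cur_obs = self.observation_space.sample()
        return self._cur_obs, reward, terminated, truncated, {}
```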
doc/source/rllib/getting-started.rst
Outdated
# Point your config to your custom env class:
config = (
    PPOConfig()
    .environment(ParrotEnv)  # add `env_config=[some Box space] to customize the env
Maybe a missing " ` "?
done and clarified more. Also fixed the env to accept this suggested setting.
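A sketch of the clarified setting; the "obs_space" key matches the hypothetical ParrotEnv sketch above, not necessarily the final docs code:

```python
import gymnasium as gym
import numpy as np
from ray.rllib.algorithms.ppo import PPOConfig

config = (
    PPOConfig()
    .environment(
        ParrotEnv,  # the custom env class from the sketch above
        # Pass a custom Box space through `env_config` (hypothetical key):
        env_config={"obs_space": gym.spaces.Box(-2.0, 2.0, (3,), np.float32)},
    )
)
```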
class CustomTorchRLModule(TorchRLModule):
    def setup(self):
        # You have access here to the following already set attributes:
        # self.observation_space
Great description!!
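The excerpt stops at the attribute list; a minimal sketch of a complete module, assuming the new API stack's `TorchRLModule` with a single generic `_forward` method and a `Discrete` action space:

```python
import torch
from ray.rllib.core.columns import Columns
from ray.rllib.core.rl_module.torch import TorchRLModule


class CustomTorchRLModule(TorchRLModule):
    def setup(self):
        # Already set when `setup` runs (per the quoted docs):
        # self.observation_space, self.action_space, self.model_config
        in_size = self.observation_space.shape[0]
        hidden = self.model_config.get("hidden_dim", 256)  # hypothetical key
        self._net = torch.nn.Sequential(
            torch.nn.Linear(in_size, hidden),
            torch.nn.ReLU(),
            torch.nn.Linear(hidden, self.action_space.n),
        )

    def _forward(self, batch, **kwargs):
        # Return action-distribution inputs (logits) for the obs batch.
        return {Columns.ACTION_DIST_INPUTS: self._net(batch[Columns.OBS])}
```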
At the end of your script, RLlib evaluates the trained Algorithm:

    algo.stop()
Haha. Yes that is needed.
We might, however, want to show it explicitly, as otherwise users might run into problems.
... in their own code
Great idea. Will add a one-liner for this API.
done
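The resulting one-liner is straightforward; a minimal sketch, assuming a built and trained `algo`:

```python
# Optionally run a final evaluation round, then free all actors and
# resources the Algorithm holds; without `stop()`, worker actors may
# linger until the Ray session shuts down.
eval_results = algo.evaluate()
algo.stop()
```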
The `state` of an instantiated Algorithm can be retrieved by calling its
`get_state` method. It contains all information necessary
to create the Algorithm from scratch. No access to the original code (e.g.
Does this now also work with algorithms that have defined new attributes/methods? If the class is available, it should, IMO.
Yeah, I think so. Users can decide to override the `get_state`/`set_state` APIs to add more stateful stuff to their state dicts, but the basic functionality (restoring EnvRunners, RLModule, Learner optimizer states, connector pipelines, etc.) works across all algos.
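A sketch of that round trip, assuming the new-stack checkpointing APIs (`get_state`/`set_state` plus `save_to_path`/`from_checkpoint`) and an existing `algo`:

```python
from ray.rllib.algorithms.algorithm import Algorithm

# The full state dict covers EnvRunner, RLModule, Learner/optimizer,
# and connector-pipeline states, as listed in the reply above.
state = algo.get_state()
algo.set_state(state)  # restore in place

# Or persist to disk and re-create the Algorithm later, without access
# to the original config code:
checkpoint_dir = algo.save_to_path("/tmp/ppo_checkpoint")
restored = Algorithm.from_checkpoint(checkpoint_dir)
```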
Docs do-over (new API stack): Rewrite/enhance "getting started" rst page.

Why are these changes needed?

Renames `rllib-training.html` to `getting-started.html` and converts the code examples into `testcode` blocks.

Related issue number

Checks

- I've signed off every commit (`git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.