Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[RLlib] Docs do-over (new API stack): New MetricsLogger API rst page. #49538

Open
wants to merge 23 commits into
base: master
Choose a base branch
from

Conversation

sven1977
Copy link
Contributor

@sven1977 sven1977 commented Jan 2, 2025

Docs do-over (new API stack): New MetricsLogger API rst page.

Why are these changes needed?

Related issue number

Checks

  • I've signed off every commit(by using the -s flag, i.e., git commit -s) in this PR.
  • I've run scripts/format.sh to lint the changes in this PR.
  • I've included any doc changes needed for https://docs.ray.io/en/master/.
    • I've added any new APIs to the API Reference. For example, if I added a
      method in Tune, I've added it in doc/source/tune/api/ under the
      corresponding .rst file.
  • I've made sure the tests are passing. Note that there might be a few flaky tests, see the recent failures at https://flakey-tests.ray.io/
  • Testing Strategy
    • Unit tests
    • Release tests
    • This PR is not tested :(

@sven1977 sven1977 requested review from maxpumperla, simonsays1980 and a team as code owners January 2, 2025 15:54
@sven1977 sven1977 changed the title [RLlib] Docs do-over (new API stack): New MetricsLogger API rst page. [WIP; RLlib] Docs do-over (new API stack): New MetricsLogger API rst page. Jan 2, 2025
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
…_redo_metrics_logger

Signed-off-by: sven1977 <[email protected]>

# Conflicts:
#	rllib/BUILD
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
@sven1977 sven1977 changed the title [WIP; RLlib] Docs do-over (new API stack): New MetricsLogger API rst page. [RLlib] Docs do-over (new API stack): New MetricsLogger API rst page. Jan 21, 2025
@sven1977 sven1977 added rllib RLlib related issues rllib-docs-or-examples Issues related to RLlib documentation or rllib/examples rllib-newstack labels Jan 21, 2025
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Copy link
Collaborator

@simonsays1980 simonsays1980 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. Just some nits.

.. note::
So far, RLlib components owning a :py:class:`~ray.rllib.utils.metrics_logger.MetricsLogger`
instance are :py:class:`~ray.rllib.algorithms.algorithm.Algorithm`, :py:class:`~ray.rllib.env.env_runner.EnvRunner`,
:py:class:`~ray.rllib.core.learner.learner.Learner`, and all :py:class:`~ray.rllib.connectors.connector_v2.ConnectorV2` classes.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

EpisodeReplayBuffers

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed


- Users can log scalar values over time, such as losses or rewards.
- Thereby, users can configure different reduction types, in particular ``mean``, ``min``, ``max``, ``sum``, or ``None``.
- Users can also specify sliding windows, over which these reductions take place, for example ``window=100`` to average over the
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe say something about clear_on_reduce and how aggregates are held in the history instead of accumulating large lists.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

which doesn't alter any underlying values.


Instead of providing a flat key, you can also log a value under some nested key through passing in a tuple:
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice!

~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

In case you are logging a nested structure of values, for example
``{"time_s": 0.1, "lives": 5, "rounds_played": {"player1": 10, "player2": 4}}`` and all values have the exact same log settings
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A dict that is for any reason empty, will be reduced away, which is for collecting result dicts for a table is challenging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, let me try this right now and fix, if possible ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, an empty dict is not logged at all. Meaning ...

metrics.log_dict({}, key="abc")

... results in metrics not undergoing any changes, which I think makes a lot of sense. There are simply no keys to be logged and the underlying tree.map_structure doesn't have any nested structure to loop through.




Timing things
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Timing processes"?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fine, renamed into Timing code blocks. :D

# EMA should be ~1.1sec.
assert 1.15 > logger.peek("my_block_to_be_timed") > 1.05

Also note that the default logging behavior is through exponential mean averaging (EMA), with a default coefficient of 0.01.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe describing how to use a simple mean instead?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, we don't support this yet :D
You can do a simple mean over a window, yes, but not over all lifetime, b/c we don't have a proper mechanism yet to keep the list of values small. You could do logger.log_value([key], [value], reduce="mean", window=float("inf")), but that would cause a mem leak at some point. We need to fix this by keeping track of the number of items logged thus far and then keeping the internal list smaller, but still providing the proper lifetime average. To be completely honest, not sure how useful a lifetime average would be (as opposed to an EMA, which makes more sense to report). But we should fix this either way ...

Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
…_redo_metrics_logger

Signed-off-by: sven1977 <[email protected]>

# Conflicts:
#	doc/source/rllib/package_ref/algorithm.rst
Signed-off-by: sven1977 <[email protected]>
Signed-off-by: sven1977 <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
rllib RLlib related issues rllib-docs-or-examples Issues related to RLlib documentation or rllib/examples rllib-newstack
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants