[RLlib; Offline RL] 2. Multiple optimizations for streaming data. #49195
Conversation
…, removed default read arguments in 'OfflineData' to always use Ray Data's optimized reads (specifically on the product). Moved call to schema to debug logging in 'OfflineData' to avoid any further overhead when loading a dataset. Signed-off-by: simonsays1980 <[email protected]>
…urthermore, removed learner and locality hints from multi-learner setups b/c we do not need them for the time being and actor handles cannot get serialized. Signed-off-by: simonsays1980 <[email protected]>
…Furthermore, removed 'num_cpus' from 'map_batches' as it was blocking execution with multi-learner multi-node setups. In addition, modified all 'args.num_learner' usages. Signed-off-by: simonsays1980 <[email protected]>
…educe overhead costs. Signed-off-by: simonsays1980 <[email protected]>
…ws iterating over batches down as long as it is not fixed in Ray Data. Signed-off-by: simonsays1980 <[email protected]>
…d anymore. Signed-off-by: simonsays1980 <[email protected]>
@@ -146,8 +147,7 @@ def sample(
     # Add constructor `kwargs` when using remote learners.
     fn_constructor_kwargs.update(
         {
-            "learner": self.learner_handles,
-            "locality_hints": self.locality_hints,
+            "learner": None,
Wait, what changed here?
Locality hints are gone. The learner will go in a third PR blocked by the Ray Data w/ Ray Tune issue
Looks good. Just a handful of questions ...
…s and modified tests. Signed-off-by: simonsays1980 <[email protected]>
…need it anymore. Signed-off-by: simonsays1980 <[email protected]>
…guments in the 'OfflinePreLearner'. Signed-off-by: simonsays1980 <[email protected]>
… b/c test was failing. Signed-off-by: simonsays1980 <[email protected]>
@@ -193,7 +192,7 @@ def test_offline_prelearner_sample_from_old_sample_batch_data(self):
     algo = self.config.build()
     # Build the `OfflinePreLearner` and add the learner.
     oplr = OfflinePreLearner(
-        self.config,
+        config=self.config,
nice!
@@ -74,7 +74,7 @@ def test_offline_prelearner_buffer_class(self):
     algo = self.config.build()
     # Build the `OfflinePreLearner` and add the learner.
     oplr = OfflinePreLearner(
-        self.config,
+        config=self.config,
nice!
@@ -84,9 +84,9 @@ class OfflinePreLearner:
     @OverrideToImplementCustomLogic_CallToSuperRecommended
     def __init__(
         self,
+        *,
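The bare `*` added above makes every parameter after it keyword-only, which is why the tests now call `OfflinePreLearner(config=self.config, ...)` instead of passing positionally. A quick illustration of the effect, using a hypothetical stand-in function:

```python
# `make_prelearner` is a hypothetical stand-in for the real __init__.
# The bare `*` forces callers to name each argument, so accidental
# positional calls fail fast with a TypeError.
def make_prelearner(*, config, module_spec=None):
    return {"config": config, "module_spec": module_spec}

print(make_prelearner(config={"lr": 1e-3}))  # keyword call: OK
try:
    make_prelearner({"lr": 1e-3})  # positional call is rejected
except TypeError as err:
    print("rejected:", err)
```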
thanks!
LGTM now. Thanks @simonsays1980 !
…y-project#49195) Signed-off-by: ujjawal-khare <[email protected]>
Why are these changes needed?
Streaming data directly from cloud storage still showed low performance. This PR comes with multiple optimizations to train Offline RL agents in a truly streaming way, i.e. starting training after the first chunks of data are read and preprocessed. The following changes were made:
- Moved `schema` calls to debug logging to avoid overhead when loading a dataset.
- Removed `locality_hints` from the `OfflinePreLearner` as this is not used anymore.
- Adapted `tuned_examples` and tests.
- Moved the `RLUnplugged` example to the `SPREAD` scheduling strategy when using multiple learners on a multi-node cluster.
- Added a `ConnectorV2` to read stacked and encoded Atari frames.
Related issue number
Related to #49194
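The `SPREAD` scheduling strategy mentioned above asks Ray to spread actors or tasks across the nodes of the cluster instead of packing them onto one. A hedged configuration sketch using Ray's standard `scheduling_strategy` option (illustrative only, not the exact code in this PR; requires a running Ray cluster):

```python
import ray

# "SPREAD" spreads the learner actors across nodes, which helps when
# multiple learners run on a multi-node cluster. The `Learner` class
# here is a hypothetical stand-in.
@ray.remote(num_cpus=1, scheduling_strategy="SPREAD")
class Learner:
    def train_step(self, batch):
        return len(batch)
```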
Checks
- I've signed off every commit (by using the -s flag, i.e., `git commit -s`) in this PR.
- I've run `scripts/format.sh` to lint the changes in this PR.
- If I've added a method in Tune, I've added it in `doc/source/tune/api/` under the corresponding `.rst` file.