Commit

fix
Signed-off-by: zhilong <[email protected]>
Bye-legumes committed Dec 12, 2024
1 parent 65c567b commit 2d78ee0
Showing 1 changed file with 7 additions and 6 deletions.
13 changes: 7 additions & 6 deletions python/ray/data/datasource/file_based_datasource.py
@@ -89,12 +89,13 @@ class FileShuffleConfig:
 >>> current_dir = Path(os.getcwd())
 >>> # Create temporary Parquet files for testing in the current directory
 >>> paths = [current_dir / f"test_file_{i}.parquet" for i in range(5)]
->>> for i, path in enumerate(paths):
->>> # Write dummy Parquet files
->>> data = {'col1': range(10 * i, 10 * (i + 1)),
->>> 'col2': ['foo', 'bar'] * 5}
->>> table = pa.Table.from_pydict(data)
->>> pq.write_table(table, path)
+>>> #Write dummy Parquet files
+>>> table1 = pa.Table.from_pydict({'col1': range(10)})
+>>> table2 = pa.Table.from_pydict({'col1': range(10,20)})
+>>> table3 = pa.Table.from_pydict({'col1': range(20,30)})
+>>> pq.write_table(table1, paths[0])
+>>> pq.write_table(table2, paths[1])
+>>> pq.write_table(table3, paths[2])
 >>> # Convert paths to strings for read_parquet
 >>> string_paths = [str(path) for path in paths]
 >>> # Read with deterministic shuffling
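
The hunk ends at the "Read with deterministic shuffling" step, and the code that performs the read is not shown. As a rough illustration of what a seeded, repeatable file order means, here is a minimal standard-library sketch. It is not Ray's implementation and does not use FileShuffleConfig; the helper name shuffled_paths and the seed value 42 are assumptions made only for this example.

```python
import random
from pathlib import Path


def shuffled_paths(paths, seed):
    """Return a copy of `paths` in an order determined only by `seed`.

    Hypothetical helper for illustration; not part of Ray.
    """
    rng = random.Random(seed)  # private generator, global random state untouched
    result = list(paths)       # copy so the caller's list keeps its order
    rng.shuffle(result)
    return result


# Same file names as in the docstring example, relative instead of absolute.
paths = [str(Path(f"test_file_{i}.parquet")) for i in range(5)]

first = shuffled_paths(paths, seed=42)
second = shuffled_paths(paths, seed=42)

# Two calls with the same seed yield the same file order.
print(first == second)
```

Using a private random.Random instance rather than the module-level functions keeps the result independent of any other code that touches the global generator.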