Rewrite the pathmap.Tree #637

Merged
merged 1 commit into from
Aug 28, 2024

@Swatinem Swatinem commented Aug 23, 2024

This mostly rewrites the Tree, making the following changes and optimizations:

  • Uses a real Node struct with children and terminals, instead of abusing special keys for it.
  • Avoids constructing needless non-terminal strings for all intermediate nodes.
  • Constructs the tree directly iteratively, instead of creating a parallel tree and merging recursively.
  • Switches from recursion to iteration for _drill. It should be possible to also avoid recursion in lookup, but with a bit more effort.

This should mainly improve construction performance and reduce memory usage, which were the primary pain points of the previous implementation.
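
For readers who have not seen the diff, the changes listed above can be sketched roughly as follows. This is a minimal illustration, not the code from `helpers/pathmap.py`: the names `Node`, `children`, `terminals`, `Tree` and `insert` come from this PR, but the field types, the `root` attribute, and the behaviour of `drill` (following a chain of single-child nodes until a terminal is found) are assumptions made for the example.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    # Child nodes keyed by a single path component (assumed layout).
    children: dict[str, "Node"] = field(default_factory=dict)
    # Full original paths that end exactly at this node.
    terminals: list[str] = field(default_factory=list)


class Tree:
    def __init__(self) -> None:
        self.root = Node()

    def insert(self, path: str) -> None:
        # Build the branch iteratively, walking components in reverse
        # (file name first). Intermediate nodes store no strings at all.
        node = self.root
        for component in reversed(path.split("/")):
            node = node.children.setdefault(component, Node())
        # Only the last node keeps the full path.
        node.terminals.append(path)

    def drill(self, node: Node) -> Optional[str]:
        # ASSUMPTION: iterative replacement for a recursive "drill" that
        # follows single-child chains and gives up when the branch forks.
        while not node.terminals:
            if len(node.children) != 1:
                return None
            node = next(iter(node.children.values()))
        return node.terminals[0]
```

With a single path inserted, `drill` from the root resolves it unambiguously; once a second, unrelated file name is inserted, the root forks and `drill` returns `None` until the caller has descended past the fork.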

@Swatinem Swatinem requested a review from a team August 23, 2024 08:43
@Swatinem Swatinem self-assigned this Aug 23, 2024

codecov-qa bot commented Aug 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.12%. Comparing base (df39f54) to head (2661202).
Report is 6 commits behind head on main.

✅ All tests successful. No failed tests found.

@@            Coverage Diff             @@
##             main     #637      +/-   ##
==========================================
- Coverage   98.12%   98.12%   -0.01%     
==========================================
  Files         437      437              
  Lines       36749    36664      -85     
==========================================
- Hits        36061    35976      -85     
  Misses        688      688              
Flag Coverage Δ
integration 98.12% <100.00%> (-0.01%) ⬇️
latest-uploader-overall 98.12% <100.00%> (-0.01%) ⬇️
unit 98.12% <100.00%> (-0.01%) ⬇️

Components Coverage Δ
NonTestCode 96.06% <100.00%> (-0.01%) ⬇️
OutsideTasks 98.10% <100.00%> (-0.01%) ⬇️
Files Coverage Δ
helpers/pathmap.py 100.00% <100.00%> (ø)
helpers/tests/pathmap/test_pathmap.py 100.00% <100.00%> (ø)
helpers/tests/pathmap/test_tree.py 100.00% <100.00%> (ø)
services/path_fixer/__init__.py 100.00% <100.00%> (ø)

codecov bot commented Aug 23, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.16%. Comparing base (df39f54) to head (2661202).
Report is 6 commits behind head on main.

✅ All tests successful. No failed tests found.

@@            Coverage Diff             @@
##             main     #637      +/-   ##
==========================================
- Coverage   98.16%   98.16%   -0.01%     
==========================================
  Files         476      476              
  Lines       38070    37985      -85     
==========================================
- Hits        37372    37287      -85     
  Misses        698      698              
Flag Coverage Δ
integration 98.12% <100.00%> (-0.01%) ⬇️
latest-uploader-overall 98.12% <100.00%> (-0.01%) ⬇️
unit 98.12% <100.00%> (-0.01%) ⬇️

Components Coverage Δ
NonTestCode 96.15% <100.00%> (-0.01%) ⬇️
OutsideTasks 98.10% <100.00%> (-0.01%) ⬇️
Files Coverage Δ
helpers/pathmap.py 100.00% <100.00%> (ø)
helpers/tests/pathmap/test_pathmap.py 100.00% <100.00%> (ø)
helpers/tests/pathmap/test_tree.py 100.00% <100.00%> (ø)
services/path_fixer/__init__.py 100.00% <100.00%> (ø)

This change has been scanned for critical changes.

Base automatically changed from swatinem/cleanup-pathfixer to main August 27, 2024 12:25
@Swatinem Swatinem force-pushed the swatinem/rewrite-pathfixer branch from 98f6356 to c5b3b51 Compare August 27, 2024 12:45
@Swatinem Swatinem marked this pull request as ready for review August 27, 2024 12:45

def insert(self, path: str):
    # the path components, in reverse order
    components = reversed(path.split("/"))
Contributor

I think it'd be a good idea to add a docstring on the Tree class explaining why we process the path in reverse. It took me a while to remember why, and I think it'd be very useful for devs seeing this for the first time. Something like: "This is processed in reverse because we're trying to match paths whose leading parent dirs differ but which at some point converge to the same parent dir names. For example: dir1/dir2/file1.txt and tmpdir1/tmpdir2/dir1/dir2/file1.txt."

Contributor Author

Thanks for the suggestion. I added a docstring explaining the internal structure with an example.
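
The reviewer's example can be made concrete with a small sketch. It is a simplified illustration of why components are stored in reverse, not the lookup implemented in this PR: `build` and `resolve` are hypothetical helper names, and the rule used here (keep the deepest node reached that holds exactly one terminal) is an assumption chosen to keep the example short.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Node:
    children: dict[str, "Node"] = field(default_factory=dict)
    terminals: list[str] = field(default_factory=list)


def build(paths: list[str]) -> Node:
    # Hypothetical helper: index known repository paths by reversed components.
    root = Node()
    for path in paths:
        node = root
        for component in reversed(path.split("/")):
            node = node.children.setdefault(component, Node())
        node.terminals.append(path)
    return root


def resolve(root: Node, reported: str) -> Optional[str]:
    # Hypothetical helper: walk the reported path from the file name
    # backwards, so differing leading directories are reached last.
    node = root
    best = None
    for component in reversed(reported.split("/")):
        child = node.children.get(component)
        if child is None:
            break  # the remaining leading directories are unknown; stop here
        node = child
        if len(node.terminals) == 1:
            best = node.terminals[0]
    return best


root = build(["dir1/dir2/file1.txt", "other/file2.txt"])
print(resolve(root, "tmpdir1/tmpdir2/dir1/dir2/file1.txt"))  # dir1/dir2/file1.txt
```

Because the walk starts at `file1.txt` and moves outward, the shared suffix `dir1/dir2/file1.txt` is matched before the unrelated `tmpdir1/tmpdir2` prefix is ever examined.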

@Swatinem Swatinem force-pushed the swatinem/rewrite-pathfixer branch from 9287600 to 2661202 Compare August 27, 2024 14:50
@Swatinem Swatinem added this pull request to the merge queue Aug 28, 2024
Merged via the queue into main with commit a1008f1 Aug 28, 2024
21 of 26 checks passed
@Swatinem Swatinem deleted the swatinem/rewrite-pathfixer branch August 28, 2024 08:44