Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[WORK-IN-PROGRESS] Introduce the path walk API into Git for Windows #5146
[WORK-IN-PROGRESS] Introduce the path walk API into Git for Windows #5146
Changes from 1 commit
6354d7a
c8e08c3
b05a276
e02f7b3
4236e4f
31c9b45
356abc9
3a421ff
b9471b6
c4b3490
ca37a49
0a20a17
91c4d57
53632be
c63928e
3e9b671
af7d53f
d192ae7
5f7e131
ab0bc08
bd8b5b5
c6d4832
c2092f0
bbc57f7
32fca07
c145b9e
72191a0
5039f03
e43582c
88fee5b
d17e503
d7e7283
98a5786
9d0690a
556335a
69aa8d8
5001883
3ab1bda
84c8a06
16cd9a3
c8f1239
b5c2265
fee8f88
489ce0c
9b78d40
File filter
Filter by extension
Conversations
Jump to
There are no files selected for viewing
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For performance reasons, it might make sense to pass in
base_path
as astrbuf
, so that only one scratch buffer is used for the entirety of the path walk (at least in single-threaded mode). But we can do that later.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We might want to add this, out of an abundance of caution:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Continuing on errors may be the right thing to do always. On the off-chance that a user may want to return early on error, we may want to add a flag for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder whether this is safe. If the same
path
appears multiple times on the stack, e.g. if two merged branches separately removed the same file, wouldn't we then remove thepaths_to_lists
item prematurely? This might be intentional, though, as we stop once we're hitting a blob that we'd already seen. Therefore in a history like this:where
file.txt
was present in A, deleted in B and D, and missing from C, we would probably hit the file in B first, walk its history all the way until the commit that added it, then remove the item frompaths_to_lists
, then re-add it when encountering it in D, see that we handled it in A already, and re-remove it. That might be what we need here, but I have not been able to wrap my head around this enough yet to be confident.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is important to realize that a path appears at most once. The walk goes like this:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, the crucial part I was missing is that we first walk all commits, build up the huge, giant list for "" (i.e. all the top-level trees). Then we iterate over all of those before going any further. Crucially, no single tree walk happens until every last of the commits has been walked.
For some reason, I got the impression that we're doing a depth-first search where we immediately iterate into the full tree of the first commit, until we exhausted all of the blobs reachable from that root tree, and only then walk to the next commit. But that's not what we do ;-)
We basically walk all the paths in a depth-first manner, but not in alphabetical order because we end up plucking off the paths from the top of the stack (i.e. the "alphabetically last" item seen so far, but that only holds true if the root tree has no child trees, otherwise the order gets messy real quickly). So the order is somewhat jumbled, I guess ;-)
Thank you for explaining this so patiently. I think my confusion might be a good motivation to add a little primer about the order to the commit message of the "path-walk: introduce an object walk by path" commit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was working on a technical doc for the API but don't have it committed or pushed anywhere. This confirms that it is necessary.