Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use removeprefix rather than replace to avoid separator deletion #2778

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

moradology
Copy link
Contributor

@moradology moradology commented Jan 30, 2025

The point of replace here is just to remove the prefix instance but it has the unintended consequence of deleting separators when path is the empty string. This uses removeprefix explicitly and avoids that potential pitfall

Fixes #2777

TODO:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/user-guide/*.rst
  • Changes documented as a new file in changes/
  • GitHub Actions have all passed
  • Test coverage is 100% (Codecov passes)

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jan 30, 2025
@moradology moradology force-pushed the fix/avoid-deleting-separators branch from 22c3a69 to 90812f5 Compare January 30, 2025 02:28
Copy link
Contributor

@dstansby dstansby left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thansk a lot, code change looks 👍

Could you add a test using an example that would have failed before this change to make sure this doesn't fail in the future?

@d-v-b
Copy link
Contributor

d-v-b commented Jan 30, 2025

Thansk a lot, code change looks 👍

Could you add a test using an example that would have failed before this change to make sure this doesn't fail in the future?

seconding this request -- the real bug is the hole in our test suite

@moradology
Copy link
Contributor Author

moradology commented Jan 30, 2025

Test added. Seems like a fairly general behavior but it wasn't obvious that there was a more abstract way to add it, so I opted for covering just the fsspec store path

EDIT: This comment is braindead. Pushing things up to the generic store tests as described below

@d-v-b
Copy link
Contributor

d-v-b commented Jan 30, 2025

those tests look good. if it's not too much work, you could consider adding them to the store test base class (the relevant lines are here:

async def test_list(self, store: S) -> None:
assert await _collect_aiterator(store.list()) == ()
prefix = "foo"
data = self.buffer_cls.from_bytes(b"")
store_dict = {
prefix + "/zarr.json": data,
**{prefix + f"/c/{idx}": data for idx in range(10)},
}
await store._set_many(store_dict.items())
expected_sorted = sorted(store_dict.keys())
observed = await _collect_aiterator(store.list())
observed_sorted = sorted(observed)
assert observed_sorted == expected_sorted
async def test_list_prefix(self, store: S) -> None:
"""
Test that the `list_prefix` method works as intended. Given a prefix, it should return
all the keys in storage that start with this prefix.
"""
prefixes = ("", "a/", "a/b/", "a/b/c/")
data = self.buffer_cls.from_bytes(b"")
fname = "zarr.json"
store_dict = {p + fname: data for p in prefixes}
await store._set_many(store_dict.items())
for prefix in prefixes:
observed = tuple(sorted(await _collect_aiterator(store.list_prefix(prefix))))
expected: tuple[str, ...] = ()
for key in store_dict:
if key.startswith(prefix):
expected += (key,)
expected = tuple(sorted(expected))
assert observed == expected
async def test_list_dir(self, store: S) -> None:
root = "foo"
store_dict = {
root + "/zarr.json": self.buffer_cls.from_bytes(b"bar"),
root + "/c/1": self.buffer_cls.from_bytes(b"\x01"),
}
assert await _collect_aiterator(store.list_dir("")) == ()
assert await _collect_aiterator(store.list_dir(root)) == ()
await store._set_many(store_dict.items())
keys_observed = await _collect_aiterator(store.list_dir(root))
keys_expected = {k.removeprefix(root + "/").split("/")[0] for k in store_dict}
assert sorted(keys_observed) == sorted(keys_expected)
keys_observed = await _collect_aiterator(store.list_dir(root + "/"))
assert sorted(keys_expected) == sorted(keys_observed)
) -- maybe we catch some more bugs this way. But that's extra credit. If you don't change the base class, I can just open an issue indicating that we need to include these fsspec tests in the base class.

@moradology moradology force-pushed the fix/avoid-deleting-separators branch from d0134ce to 45447af Compare January 30, 2025 16:08
@moradology
Copy link
Contributor Author

Good call. This was obviously the right place to test things

@moradology moradology requested a review from dstansby January 30, 2025 16:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs release notes Automatically applied to PRs which haven't added release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potentially unexpected behavior on FsspecStore list method
3 participants