Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[do not merge] sketch of a refactor for core modules #2780

Draft
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

d-v-b
Copy link
Contributor

@d-v-b d-v-b commented Jan 30, 2025

while working on #2665, I had the following experience in the group.py module:

  • I write a collection of async functions that together support some useful public functionality.
  • I want to expose some (but not all) of these async functions as public synchronous functions.
  • Because the async functions and the sync functions are tightly coupled (the sync version is a light wrapper around the async version), I want them to be close together in code space, so that the relationship between the two is obvious.
  • I do not want the async and sync functions to have different names. So they need to be in different namespaces.

We already have something like this in our api module, which is split into asynchronous and synchronous modules, where the latter imports from the former. I am proposing to use the same logical arrangement inside the group and array modules.

This PR is a rough outline of what that would look like. tests will not pass. My request is that you look at how the code in the new group module is arranged, and weigh in on whether you think this direction makes sense.

@github-actions github-actions bot added the needs release notes Automatically applied to PRs which haven't added release notes label Jan 30, 2025
@d-v-b
Copy link
Contributor Author

d-v-b commented Jan 30, 2025

in this PR, i split group into core, sync, and metadata modules.

  • metadata contains the group metadata classes.
  • core contains async code and necessary utilities. The main export here is the AsyncGroup class, but also functions that do useful things to AsyncGroup instances.
  • sync contains blocking versions of a subset of the async stuff in core, including the Group class definition.

I also sketched out what it might look like if we partitioned array.py into sub-modules (core, sync, indexing, etc).

This layout would allow me to define a useful async function called do_something_with_group in group.core, then import that function to group.sync, and wrap it inside group.sync.do_something_with_group, e.g.

# group/sync.py
# looks dumb but it's explicit
import zarr.core.group.core as core
from zarr.core.sync import sync
def do_something_with_group():
  return sync(core.do_something_with_group())

Instead of defining these synchronous wrappers of async functions in api.synchronous, we would define them in group.sync or array.sync; api.synchronous would just re-export these routines.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
needs release notes Automatically applied to PRs which haven't added release notes
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant