Moving away from accessors #15

nbren12 · 2022-03-02T16:40:30Z

Some limitations of accessory came up on #14.

That PR is fine, but I would like to advocate against letting accessors constrain the design of this package too much. I agree accessors can make something more discoverable in an interactive environment and may feel "cleaner", but otoh they tend to break static analysis like type checking or auto-completion in an editor. For the same reason, it requires an unused import import xpartition that gets flagged by flake8. At the same time they encourage writing classes for things that could be functions...which is a bit of a red flag in any other context. Re namespacing...that can be done with python modules, and I don't see much difference between xpartition.write(ds, ...) and ds.partition.write. Overall, I think accessors are a bit of step backwards from plain python import on several fronts, and one of the xarray features that is more optimized for the interactive environment at the expense of long-term maintainability.

The text was updated successfully, but these errors were encountered:

spencerkclark · 2022-03-03T12:24:10Z

Sure, I wasn't necessarily concerned about interoperability with type checking or text editor auto-completion when I wrote this. Mostly it just felt natural for many of these features to be attached as methods or properties to DataArrays and Datasets -- this was particularly true for the blocks accessor, which I used a fair amount in notebooks when prototyping other aspects of xpartition -- and accessors provided a way to do this. I think it's fair to argue that I got carried away when writing the partition DataArray and Dataset accessors.

The most useful aspects of that code outside xpartition* are probably ds.partition.initialize_store and ds.partition.write, which could easily be refactored to be xpartition.initialize_store and xpartition.write_dataset. The DataArray functionality is mostly needed for the internals.

Re namespacing...that can be done with python modules, and I don't see much difference between xpartition.write(ds, ...) and ds.partition.write

Of course. My point here was more in defining separate functions for DataArrays and Datasets. Defining single functions that attempt to work on both has been error prone in my experience (and sometimes some arguments are only relevant for one case or the other). To me there is a minor difference between writing da.partition.write(..) or ds.partition.write(...) versus xpartition.write_dataarray(da, ...) or xpartition.write_dataset(ds, ...), but I take your point that this difference may not be worth other limitations of using accessors.

*There is probably another discussion to be had around what (if any) low-level features of xpartition should be part of the public API.

nbren12 · 2022-03-03T17:19:13Z

My point here was more in defining separate functions for DataArrays and Datasets. Defining single functions that attempt to work on both has been error prone in my experience (and sometimes some arguments are only relevant for one case or the other).

Oh. I absolutely agree about this.

nbren12 mentioned this issue Mar 2, 2022

Add option to collect variable writes in ds.partition.write #14

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Moving away from accessors #15

Moving away from accessors #15

nbren12 commented Mar 2, 2022

spencerkclark commented Mar 3, 2022

nbren12 commented Mar 3, 2022

Moving away from accessors #15

Moving away from accessors #15

Comments

nbren12 commented Mar 2, 2022

spencerkclark commented Mar 3, 2022

nbren12 commented Mar 3, 2022