Repack Nwb Files #1003

Draft: wants to merge 40 commits into base branch main

@pauladkisson (Member) commented Aug 13, 2024

Fixes #892

Depends on hdmf-dev/hdmf#1172

Depends on hdmf-dev/hdmf-zarr#215

@pauladkisson (Member, Author)

What are you thinking for how the API of this repack helper should look?

@CodyCBakerPhD, the code now reflects my vision for the API: repack_nwbfile takes an on-disk nwbfile and export path, configures the backend, and exports the nwbfile. Users can optionally specify the template (existing or default) and any manual changes to the backend config.

The code then progresses along 2 paths:

  • existing: where backend info is read directly from the nwbfile
  • default: where default backend info is obtained for each neurodata object in the nwbfile

lmk what you think!
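A minimal sketch of the two paths described above, using plain dictionaries instead of real NWB objects. The function name `resolve_backend_settings`, the dictionary layout, and the setting values are all hypothetical and only illustrate the control flow; this is not the actual `repack_nwbfile` implementation.

```python
from copy import deepcopy


def resolve_backend_settings(existing, defaults, template="existing", overrides=None):
    """Pick a starting configuration, then apply manual changes on top.

    Hypothetical stand-in for the two paths in this PR:
      - "existing": settings as read from the file on disk
      - "default": freshly computed default settings per dataset
    """
    if template == "existing":
        base = existing
    elif template == "default":
        base = defaults
    else:
        raise ValueError(f"template must be 'existing' or 'default', got {template!r}")

    # Copy so that neither input dictionary is mutated by the overrides.
    resolved = deepcopy(base)
    for location, changes in (overrides or {}).items():
        if location not in resolved:
            raise KeyError(f"No dataset at {location!r} to override")
        resolved[location].update(changes)
    return resolved


# Illustrative inputs only; the location key and values are made up.
existing = {"acquisition/my_video/data": {"chunk_shape": None, "compression_method": None}}
defaults = {"acquisition/my_video/data": {"chunk_shape": (64, 64, 64), "compression_method": "gzip"}}

resolved = resolve_backend_settings(
    existing,
    defaults,
    template="default",
    overrides={"acquisition/my_video/data": {"compression_method": "lzf"}},
)
print(resolved)
```

Applying the manual changes after the template is chosen keeps the two paths identical from that point on, which is the main design point of the sketch.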

print(nwbfile.acquisition["my_video"])

backend_config = get_default_backend_configuration(nwbfile, "hdf5")
print(backend_config) # TODO: Figure out why this doesn't throw an error like Ben said it did
Contributor

Can we add an integration test where we write a new NWB file with a different backend configuration?

@CodyCBakerPhD (Member), Aug 15, 2024

Hmm... You mean something like 'make a Zarr-backend copy of this existing HDF5-backend NWB file'? Or same type of backend (such as both HDF5) but different configuration?

Contributor

Good question. Both, I guess. I just want to make sure this works end-to-end and does not require us to do anything funky in practice.
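One way to enumerate the scenarios discussed in this thread, as a rough sketch. The helper name `build_repack_test_cases` and the case labels are hypothetical, and covering every source/target pair (including Zarr to HDF5) is an assumption that goes beyond what was explicitly asked for here.

```python
from itertools import product

BACKENDS = ("hdf5", "zarr")


def build_repack_test_cases(backends=BACKENDS):
    """Enumerate (source, target, kind) triples for end-to-end repack tests."""
    cases = []
    for source, target in product(backends, repeat=2):
        # Same backend on both sides means only the dataset configuration changes.
        kind = "same_backend_new_configuration" if source == target else "cross_backend_copy"
        cases.append((source, target, kind))
    return cases


cases = build_repack_test_cases()
for source, target, kind in cases:
    print(f"{source} -> {target}: {kind}")
```

Each triple could then drive one parametrized integration test that writes a file with the source backend, repacks it, and reads the result back.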

@@ -262,3 +263,22 @@ def test_complex_zarr(zarr_nwbfile_path):

"""
assert stdout.getvalue() == expected_print


def test_000_ImageSeries():
Member

I would like to see many more tests here, starting with more basic types like TimeSeries, DynamicTable, etc., before working our way up to the edge case that is an ImageSeries (covering both external and internal mode).

Member

Keep in mind the intended point of this function: to start with objects that are either uncompressed, unchunked, or badly chunked, and then create a new copy of the file that has better compression and chunking on all applicable datasets.
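To make those three starting conditions concrete for test fixtures, here is a rough sketch of how a dataset's storage settings might be classified. The function name `find_repack_reasons` and the byte thresholds are assumptions chosen for illustration; nothing in this PR defines what "badly chunked" means numerically.

```python
from math import prod


def find_repack_reasons(
    shape,
    itemsize,
    chunk_shape,
    compression_method,
    min_chunk_bytes=1_000_000,  # assumed lower bound, not from the PR
    max_chunk_bytes=100_000_000,  # assumed upper bound, not from the PR
):
    """Return the reasons a dataset would benefit from repacking."""
    reasons = []
    if compression_method is None:
        reasons.append("uncompressed")
    if chunk_shape is None:
        reasons.append("unchunked")
    else:
        chunk_bytes = prod(chunk_shape) * itemsize
        dataset_bytes = prod(shape) * itemsize
        # A small dataset cannot have large chunks, so compare against
        # whichever is smaller: the lower bound or the whole dataset.
        too_small = chunk_bytes < min(min_chunk_bytes, dataset_bytes)
        too_large = chunk_bytes > max_chunk_bytes
        if too_small or too_large:
            reasons.append("badly chunked")
    return reasons


# Illustrative int16 recording with 384 channels.
raw = find_repack_reasons((1_000_000, 384), 2, None, None)
row_chunks = find_repack_reasons((1_000_000, 384), 2, (1, 384), "gzip")
healthy = find_repack_reasons((1_000_000, 384), 2, (13_021, 64), "gzip")
print(raw, row_chunks, healthy)
```

Tests for the repack function could build one fixture per reason and check that the repacked copy no longer triggers it.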

codecov bot commented Aug 15, 2024

Codecov Report

Attention: Patch coverage is 23.23232% with 76 lines in your changes missing coverage. Please review.

Project coverage is 90.37%. Comparing base (40d786a) to head (934bb3a).
Report is 2 commits behind head on main.

Files                                                    Patch %   Lines
...roconv/tools/nwb_helpers/_dataset_configuration.py    3.12%     31 Missing ⚠️
...nv/tools/nwb_helpers/_metadata_and_file_helpers.py    11.53%    23 Missing ⚠️
..._helpers/_configuration_models/_hdf5_dataset_io.py    38.88%    11 Missing ⚠️
...roconv/tools/nwb_helpers/_backend_configuration.py    27.27%    8 Missing ⚠️
...nwb_helpers/_configuration_models/_base_backend.py    57.14%    3 Missing ⚠️
Additional details and impacted files

@@            Coverage Diff             @@
##             main    #1003      +/-   ##
==========================================
- Coverage   91.25%   90.37%   -0.88%     
==========================================
  Files         127      127              
  Lines        7555     7645      +90     
==========================================
+ Hits         6894     6909      +15     
- Misses        661      736      +75     
Flag         Coverage Δ
unittests    90.37% <23.23%> (-0.88%) ⬇️

Files                                                    Coverage Δ
src/neuroconv/tools/nwb_helpers/__init__.py              100.00% <ø> (ø)
..._helpers/_configuration_models/_base_dataset_io.py    93.38% <100.00%> (+0.11%) ⬆️
...nwb_helpers/_configuration_models/_base_backend.py    95.08% <57.14%> (-4.92%) ⬇️
...roconv/tools/nwb_helpers/_backend_configuration.py    55.55% <27.27%> (-44.45%) ⬇️
..._helpers/_configuration_models/_hdf5_dataset_io.py    57.69% <38.88%> (-10.88%) ⬇️
...nv/tools/nwb_helpers/_metadata_and_file_helpers.py    75.42% <11.53%> (-11.24%) ⬇️
...roconv/tools/nwb_helpers/_dataset_configuration.py    67.88% <3.12%> (-25.62%) ⬇️

Development

Successfully merging this pull request may close these issues.

[Feature]: repack NWB file
3 participants