FIX(psoct): avoid loading all slices when files are "old matlab format" #27

balbasty · 2024-11-22T15:48:07Z

The previous code was memory efficient when files were in h5 format ("new mat files"), but loaded all slices at once when they were "old mat files". I now wrap the files in an H5/MAT wrapper that maps or loads the data on demand, and loop across slices one at a time (rather than reading chunks of slices as before). It's a bit less efficient when h5 files are used, but much more memory efficient when old files are used. We could think of having two different fallbacks based on file version in the future, but I am not sure it's necessary.
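A minimal sketch of the one-slice-at-a-time pattern described above. The names `LazySlice`, `convert_slices`, and the `loader`/`write` callables are hypothetical; they are not the wrappers added in this PR and only illustrate deferring the read until a slice is actually needed.

```python
from typing import Callable, Dict, List, Sequence


class LazySlice:
    """Hypothetical wrapper: remembers where the data lives, reads it on demand."""

    def __init__(self, path: str, loader: Callable[[str], List[float]]) -> None:
        self.path = path
        self._loader = loader  # nothing is read at construction time

    def load(self) -> List[float]:
        # The actual read happens only here.
        return self._loader(self.path)


def convert_slices(
    slices: Sequence[LazySlice],
    write: Callable[[int, List[float]], None],
) -> None:
    """Load, write, and release each slice before touching the next one."""
    for index, wrapped in enumerate(slices):
        data = wrapped.load()
        write(index, data)
        del data  # drop the reference so only one slice is held at a time


# Stand-in loader and writer so the sketch runs without any real files.
loaded: List[str] = []
written: Dict[int, float] = {}


def fake_loader(path: str) -> List[float]:
    loaded.append(path)
    return [float(len(path))] * 4


def fake_write(index: int, data: List[float]) -> None:
    written[index] = sum(data)


wrappers = [LazySlice(p, fake_loader) for p in ("a.mat", "bb.mat")]
reads_before_loop = len(loaded)  # wrapping alone reads nothing
convert_slices(wrappers, fake_write)
```

Peak memory then scales with one slice rather than with the whole stack, at the cost of one read call per slice.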

I've also made the code a bit more general so that it does not break if the slices have a channel dimension (although I don't have such files to test, and I assume that the channel dimension is the first dimension, which is unlikely).

I've also made guessing the key a bit more robust in old mat files by skipping "private" keys (those that start with an underscore).
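As an illustration of skipping private keys, here is a small sketch. It assumes the first non-private key is the one to use, which may not match the exact rule in this PR; `guess_key` and the `"volume"` key are hypothetical names. The metadata keys shown are the ones `scipy.io.loadmat` adds to the dictionary it returns for old-style MAT files.

```python
from typing import Iterable, Optional


def guess_key(keys: Iterable[str]) -> Optional[str]:
    """Return the first key that does not start with an underscore, if any."""
    for key in keys:
        if not key.startswith("_"):
            return key
    return None


# "volume" is a made-up variable name; the other entries are loadmat metadata.
example_keys = ["__header__", "__version__", "__globals__", "volume"]
chosen = guess_key(example_keys)
only_private = guess_key(["__header__", "__version__"])
```

Returning `None` when every key is private leaves it to the caller to raise a clear error instead of silently picking a metadata entry.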


calvinchai · 2024-11-25T17:04:06Z

linc_convert/modalities/psoct/_utils.py

            dat = omz[str(level - 1)][tuple(slicer)]

            # Discard the last voxel along odd dimensions
-            crop = [0 if x == 1 else x % 2 for x in dat.shape[-ndim:]]
+            crop = [


Could you please explain what this is for? Why do we need to use the full shape instead? This change breaks a test from another modality.

balbasty · 2024-11-26

We divide the previous resolution by 2. Thinking in 1D, to do this we reshape a dimension [N] into a dimension [N//2, 2] (essentially you get a stack of the odd and even voxels) and then average across the new small axis.

However, we can only do this if the original dimension is even. So if it's odd I crop the last voxel (by doing something like array[:-1]).
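The 1D reasoning above can be sketched in plain Python. This is an illustration only, not the code in `_utils.py`: `downsample_by_two` is a hypothetical name, and keeping a single-voxel input unchanged is an assumption modeled on the `0 if x == 1` special case visible in the diff.

```python
from typing import List


def downsample_by_two(values: List[float]) -> List[float]:
    """Halve a 1D signal by averaging consecutive pairs of voxels."""
    n = len(values)
    if n == 1:
        # Assumption: a lone voxel is kept as is, since cropping would leave nothing.
        return list(values)
    crop = n % 2  # 1 if the length is odd, else 0
    even_length = values[: n - crop]  # same effect as array[:-1] when n is odd
    # Pair up voxels (0, 1), (2, 3), ... which is the [N//2, 2] view, then average.
    pairs = [even_length[i : i + 2] for i in range(0, n - crop, 2)]
    return [sum(pair) / 2 for pair in pairs]


even_result = downsample_by_two([1.0, 3.0, 5.0, 7.0])
odd_result = downsample_by_two([1.0, 3.0, 5.0, 7.0, 9.0])
single_result = downsample_by_two([4.0])
```

Both the even and the odd input give two output voxels, because the trailing voxel of the odd input is dropped before pairing.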

That said, I think that the current code assumes that the data is exactly 3D (no channel dimension). I might have fixed it in the other PR (to be merged).

What error do you get, and on what kind of data?

calvinchai · 2024-11-26

I see. The error from the other modality is basically a dimension mismatch, which makes sense to me now. Could you please point me to the other PR, if it is not the one we just merged?

balbasty · 2024-11-26

It was the one you just merged.

balbasty and others added 4 commits November 22, 2024 15:42

FIX/ENH(psoct): do not load all slices at once (if old mat files)

fb08a7d

Merge branch 'oct_mat_to_zarr' of github.com:lincbrain/linc-convert into oct_mat_to_zarr

822912e

FIX(psoct): hints to please ruff

4f71ddd

style fixes by ruff

a2735f7

balbasty requested a review from calvinchai November 22, 2024 15:52

calvinchai approved these changes Nov 22, 2024


balbasty and others added 7 commits November 22, 2024 16:06

ENH(psoct.single_volume): more robust default key

2e3d12c

DOC(psoct)

3dc2636

Merge branch 'oct_mat_to_zarr' of github.com:lincbrain/linc-convert into oct_mat_to_zarr

7ecfc08

style fixes by ruff

3611911

FIX(psoct): propagate no_pool to pyramid generator + FIX(generate_pyramid): do not crash if last chunk in a row only has a single voxel

41cbd1f

Merge branch 'oct_mat_to_zarr' of github.com:lincbrain/linc-convert into oct_mat_to_zarr

ca20adb

style fixes by ruff

204bcb2

balbasty merged commit 158531b into main Nov 22, 2024

balbasty mentioned this pull request Nov 22, 2024

psoct.multi_slice: do not require a channel dimension #26

Closed

calvinchai reviewed Nov 25, 2024


balbasty commented Nov 22, 2024

calvinchai Nov 25, 2024

balbasty Nov 26, 2024

calvinchai Nov 26, 2024

balbasty Nov 26, 2024
