Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

FIX(psoct): avoid loading all slices when files are "old matlab format" #27

Merged
merged 11 commits into from
Nov 22, 2024

Conversation

balbasty
Copy link
Collaborator

The previous code was memory efficient when files where in h5 format ("new mat files"), but was loading all slices at once
when they were "old mat files". I now wrap the files in H5/MAT wrapper that maps or load the data on demand, and loop across slices one at a time (rather than reading chunks of slices as before). It's a bit less efficient when h5 files are used but much more memory efficient when old files are used. We could think of having two different fallabacks based on file version in the future, but I am not sure it's necessary.

I've also made the code a bit more general so that it does not break if the slices have a channel dimension (altough I don't have such files to test, and assume that the channel dimension is the first dimension, which is unlikely).

I've also made guesing the key a bit more robust in old mat files by skipping "private" keys (that start with an underscore).

@balbasty balbasty requested a review from calvinchai November 22, 2024 15:52
dat = omz[str(level - 1)][tuple(slicer)]

# Discard the last voxel along odd dimensions
crop = [0 if x == 1 else x % 2 for x in dat.shape[-ndim:]]
crop = [
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could you please explain what is this for? Why do we need to use this full shape instead? Since this changes breaks another test from another modality.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We divide the previous resolution by 2. Thinking in 1D, to do this we reshape a dimension [N] into a dimension [N//2, 2] (essentialy you get a stack of the odd and even voxels) and then average across the new small axis.

However, we can only do this if the original dimension is even. So if it's odd I crop the last voxel (by doing something like array[:-1]).

That said, I think that the current code assumes that the data is exactly 3D (no channel dimension). I might have fixed it in the other PR (to be merged).

What error do you get, and on what kind of data?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. The error from another modality is basically dimension not matching which makes sense to me now. Could you please point me to the other PR if it is not the one we just merged?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was the one you just merged

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants