Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add vector type support #330

Merged
merged 6 commits into from
Aug 6, 2024
Merged

Add vector type support #330

merged 6 commits into from
Aug 6, 2024

Conversation

fotonick
Copy link
Contributor

@fotonick fotonick commented Jul 20, 2024

Add vector type support (ex: '100E') and fix bit handling

  • In the FITS spec, columns can be of vector type, where each row has multiple of the same data type. Bitmasks (32X) and strings (20A) are the most common examples, but others exist in the wild. My example is from Planck Legacy Archives → Maps → CMB Maps → Uncheck "Only legacy products" → SEVEM → Click to download the single row (full mission); this FITS file contains one row, which is TFORM1 of 201326592E, which happens to be a HEALPix map with NSIDE of 4096.
  • In more fully supporting repeats for vectors, revamped the bit handling.
  • Filled out the data type support and added round-trip read-write tests for all vector types. Weirdly, I couldn't get the u32 case to work and suspect a CFITSIO bug. I left the u32 test commented out for now with a more detailed explanation. Would love someone to reproduce and investigate. Then again, all unsigned integral types beyond u8 are unofficial CFITSIO extensions beyond the FITS spec, so maybe it's OK to leave out.
  • Removed conditionals for 32-bit and 64-bit reads_col_impl and writes_col_impl; int is 32 bits and long long is consistently 64 bits on all machines and platforms since ~2000. Only long is weird and special.

Contribution checklist:

  • cargo test passes on my M1 Mac and on an x86_64 Linux machine using the provided Docker environment. I'm afraid I don't have any 32-bit or Windows environments available for testing.
  • rustfmt changes nothing.
  • Added to changelog.
  • I have not updated full_example.rs, as test_vector_datatypes.rs shows how to use vector types, but I am happy to do so upon request.

@fotonick
Copy link
Contributor Author

Here's the relevant CI failure:

  = note: /target/armv7-unknown-linux-gnueabihf/debug/deps/libfitsio-54f105943388d8d9.rlib(fitsio-54f105943388d8d9.r2uf0l329hl0wxb.rcgu.o): In function `fitsio::longnam::fits_read_col_ulnglng::h1cac66cf15f2e9c2':
          /project/fitsio/src/longnam.rs:337: undefined reference to `ffgcvujj'

This occurs in the linking phase. ffgcvujj (a.k.a. fits_read_col_ulnglng) lives in getcoluj.c. While it is suspicious that 64-bit code is failing on the 32-bit platform, I'm really stumped. I don't see any sort of conditional compilation of this function, nor of anything in Makefile.in that would skip the compilation or would avoid linking it. Nothing conditional in bindings_32.rs. The CFITSIO manual indicates that they went out of their way to make 64-bit integers work everywhere.

@simonrw any ideas?

@simonrw
Copy link
Owner

simonrw commented Jul 20, 2024

Hey thanks very much for this addition! I haven't had a chance to look in detail but I will try to investigate as soon as I can.

@coveralls
Copy link

Coverage Status

coverage: 76.86% (+0.06%) from 76.797%
when pulling 350b822 on fotonick:main
into 32aeadd on simonrw:main.

@simonrw simonrw self-requested a review July 26, 2024 15:51
@simonrw
Copy link
Owner

simonrw commented Jul 26, 2024

Is the 32/64 bit conditionals really required for this PR? Perhaps we could split out those into a separate PR since they are causing the build to break. We can then take the two conceptual changes separately?

Copy link
Owner

@simonrw simonrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately I don't have much time to investigate myself at the moment why the CI is failing, but I suspect it is related to the removal of the conditional compilation for integer types. I don't fully see how they are related to this PR so it would be good to remove this change.

I also have a couple of more significant comments that I'd like addressing before I continue to review the rest if that's ok?

Thanks again for submitting this clearly very competent PR to this crate, I hope that you are finding the crate useful, and am keen to hear more from users!

CONTRIBUTING.md Show resolved Hide resolved
fitsio/src/hdu.rs Outdated Show resolved Hide resolved
@codecov-commenter
Copy link

codecov-commenter commented Jul 27, 2024

Codecov Report

Attention: Patch coverage is 87.87879% with 8 lines in your changes missing coverage. Please review.

Project coverage is 89.2%. Comparing base (0d37302) to head (f1880a4).
Report is 14 commits behind head on main.

Files with missing lines Patch % Lines
fitsio/src/tables.rs 74.1% 8 Missing ⚠️
Additional details and impacted files
Files with missing lines Coverage Δ
fitsio/src/fitsfile.rs 94.9% <100.0%> (+<0.1%) ⬆️
fitsio/src/longnam.rs 88.2% <100.0%> (+3.3%) ⬆️
fitsio/src/types.rs 100.0% <ø> (ø)
fitsio/src/tables.rs 85.1% <74.1%> (+0.3%) ⬆️

@fotonick
Copy link
Contributor Author

fotonick commented Jul 27, 2024

Revised the patches:

  1. Addressed direct review comments
  2. Removed bit-field revamp, as it conflicted in a fundamental way with your recent support of logicals. The FITS format has both Logicals (T/F) and Bits. Which one should map to bool is up for debate, so I left the bool impls as you recently added them—Logicals. (Oh yeah, I renamed Bool to Logical, because I think that's what's in the spec; it's at least all over the CFITSIO documentation.)

And I figured out something fascinating about the test failure—the function ffgcvujj (a.k.a. fits_read_col_ulnglng) was always missing on armv7-unknown-linux-gnueabihf! You just had no tests that called the function and I guess dead-code elimination removed the call, so the linker never complained. For now, I've chosen to only generate the test for 64-bit systems. Again, all the unsigned types are CFITSIO-specific extensions beyond the FITS spec. LMK if you want to handle u64 differently than I have.

Now performing my integer simplification (removing conditionals, referring only to int and longlong and avoiding long altogether) gives all green lights across all platforms, as I had expected. This is now in a separate commit.

Thanks for your good review. IMO, you stopped at the correct depth. Please let me know if there's anything else to address.

@fotonick
Copy link
Contributor Author

fotonick commented Aug 3, 2024

At the one week mark, I send the gentlest of pings. I believe I've addressed the previous concerns and this PR is much tighter than before. Hopefully quick to review and merge, whenever you next review PRs.

Copy link
Owner

@simonrw simonrw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay in reviews, and thanks for the "gentlest of pings"! I think this is a very nice way of reminding a maintainer of a wanted review!

I only have a couple of minor points, but in general I trust that this implementation works correctly. You have created a comprehensive read/write round-trip test suite for different data types, and the implementation seems straightforward.

In the future, it might be nice to create our own wrapper type around returning these flat arrays that allows for 2d access to the resulting data, or we commit to using ndarray under the hood. But I think it's ok to ask the user to manage fetching the correct data element on their side for now.

Speaking of ndarray: a nice follow up would be to ensure that this implementation works with the ndarray feature.

Since these suggestions are so minor, I will push some commits myself to clean up the changes if that's ok.

Thanks again for this contribution, and I'll try to get a release out for you ASAP

CHANGELOG.md Outdated Show resolved Hide resolved
fitsio/tests/test_bit_datatype.rs Show resolved Hide resolved
fitsio/tests/test_vector_datatypes.rs Show resolved Hide resolved
fitsio/tests/test_vector_datatypes.rs Show resolved Hide resolved
@fotonick
Copy link
Contributor Author

fotonick commented Aug 5, 2024

Thanks for the collaboration. I'm happy for you to do the final commits with the fixes you outlined above.

I definitely opted consciously to stay 1D for now, figuring that one can easily hand the buffer to ndarray and reshape it. I would strongly advise against any multi-dimensional abstraction, lest you accidentally trip and reimplement ndarray. The complexity is neatly separable. But indeed, I didn't test for compatibility with the ndarray feature.

And if I'm honest, I am not anticipating further contributions. I actually did this patch as part of an art project—I'm mapping CMB anisotropies (don't forget the dipole subtraction) to a triangulated surface mesh for 3D printing. Anyway, thanks for providing most of the FITS support that I needed and being receptive to my small augmentation.

@simonrw simonrw merged commit c9bece4 into simonrw:main Aug 6, 2024
11 checks passed
@simonrw
Copy link
Owner

simonrw commented Aug 6, 2024

Thanks again for adding this feature, and for being patient with my review.

I definitely opted consciously to stay 1D for now, figuring that one can easily hand the buffer to ndarray and reshape it.

👍

I would strongly advise against any multi-dimensional abstraction, lest you accidentally trip and reimplement ndarray. The complexity is neatly separable.

I think it's only 2D thats required for reading a repeat col, so the feature would be quite constrained, but yes I will wait for someone to ask for this feature.

And if I'm honest, I am not anticipating further contributions.

Personally that's a shame. This implementation is clearly competant and I am crying out for users/collaborators! I am no longer an astronomer so I can't really drive this project well enough.

I actually did this patch as part of an art project—I'm mapping CMB anisotropies (don't forget the dipole subtraction) to a triangulated surface mesh for 3D printing.

Wow sounds really cool!

simonrw added a commit that referenced this pull request Aug 6, 2024
## 🤖 New release
* `fitsio`: 0.21.4 -> 0.21.5

<details><summary><i><b>Changelog</b></i></summary><p>

## `fitsio`
<blockquote>

##
[0.21.5](fitsio-v0.21.4...fitsio-v0.21.5)
- 2024-08-06

### Added
- add vector type support
([#330](#330))

### Other
- *(deps)* update ndarray requirement from 0.15.0 to 0.16.0 in /fitsio
([#347](#347))
</blockquote>


</p></details>

---
This PR was generated with
[release-plz](https://github.com/MarcoIeni/release-plz/).
@fotonick
Copy link
Contributor Author

fotonick commented Aug 6, 2024

Thanks for the kind words. I left an astrophysics postdoc in 2012 and stayed rather less involved than you seem to have! You're doing great work here. Thank you.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants