Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH: add cosmology to CBC result #867

Merged
merged 14 commits into from
Jan 23, 2025

Conversation

mj-will
Copy link
Collaborator

@mj-will mj-will commented Nov 27, 2024

Add cosmology to the CompactBinaryCoalescenceResult.

If we run analyses with a different cosmology, I think this should be recorded in the result file. My proposal is to add the cosmology in a new dictionary within meta_data called global_meta_data. This is inline with the changes proposed in #873. I've implemented it following the same pattern used for other properties, such as the waveform arguments.

Whilst making these changes, I realised that for this to work, the encoding/decoding on astropy objects wasn't working correctly for hdf5 files. So a lot of changes are related to this.

Changes

  • Add cosmology to the meta data dictionary when calling run sampler
  • Add cosmology to the meta data dictionary when making an instance of CompactBinaryCoalescenceResult
  • Improve encode/decode_astropy_cosmology/quantity to make use of built-in astropy functions and to more closely match the JsonCustomEncode implemented in astropy. The decode functions should be backwards compatible.
  • Add encode/decode_astropy_unit
  • Add decode_hdf5_dict to ensure astropy types are decoded
  • Add tests to ensure the result objects with cosmology can be saved to all available formats
  • Improved/added doc-strings for various methods

@ColmTalbot
Copy link
Collaborator

I don't think this is enough to actually record the cosmology in the file, the simplest way would be to add it to the metadata and then access it like the other properties.

@mj-will
Copy link
Collaborator Author

mj-will commented Nov 29, 2024

I've updated the PR to added support for saving astropy cosmologies to our HDF5 files.

Whilst doing this, also updated the code to use to_format/from_format for encoding and encoding the cosmologies sine it's the recommended way https://docs.astropy.org/en/stable/cosmology/io/builtin.html#cosmology-io-builtin

@@ -251,6 +251,14 @@ def run_sampler(
meta_data["likelihood"] = likelihood.meta_data
meta_data["loaded_modules"] = loaded_modules_dict()
meta_data["environment_packages"] = env_package_list(as_dataframe=True)
# Add the cosmology if available
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd rather not add this here, I think it sets a bad precedent of adding application-specific logic into core. Is there some reason adding this to the CBCResult is not sufficient? I think "users didn't specify the result type" is not a sufficient reason here.

Copy link
Collaborator Author

@mj-will mj-will Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that 'users didn't specify the result type' shouldn't be reason enough in principle, but I'm slightly concerned the documentation isn't sufficiently clear on this (AFAICT, only two of all our GW examples specify it). Therefore, anyone that's used one of the examples that doesn't have it as a starting point may not have specified the class. It's also different to e.g. the likelihood metadata which is always saved irrespective of the results class, meaning it can be recovered even if the wrong result class is used.

I do see your point, as this is clearly GW code. This may be out-of-scope for this MR, but I could imagine having an object that keeps track of some of the global variables, e.g. rng? This could then be saved in the metadata. If we had something like this, calling set_cosmology could just add the cosmology to that object?

Either way, I think we should update the GW examples to include the correct result class.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with all of this, especially updating examples (and docs). I think some kind of generic metadata object in the background is a good idea, adding to this in set_cosmology also sounds great. Another thing we can think about is setting a default cosmology via environment variable, I think astropy does similar, we could even piggyback off theirs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I've removed these lines and had a go at implementing a global meta data object in: #873. I've also opened an issue to track updates to the examples: #873

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've also moved cosmology to the global_meta_data so it is consistent with #873

@mj-will mj-will force-pushed the add-cosmology-to-result branch from 89e7779 to 96714cd Compare December 10, 2024 15:16
@mj-will mj-will force-pushed the add-cosmology-to-result branch from 96714cd to 62a6d06 Compare January 7, 2025 11:39
@@ -264,6 +352,13 @@ def encode_for_hdf5(key, item):
"""
from ..prior.dict import PriorDict

try:
from astropy import cosmology as cosmo, units
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not wild about having this import here, I think it could be avoided, but since it is only a debug message I'm happy to let it slide for a while.

unrelated: We could consider changing this function to use multiple dispatch at some point down the road, that would avoid ugly import tests and also allow downstream users to add their own encoding/decoding.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I agree. At the time, I couldn't come up with a clean alternative.

On multiple dispatch: that's an interesting option, might be something to bring up on a dev call.

@mj-will
Copy link
Collaborator Author

mj-will commented Jan 7, 2025

Following from the discussion on the last dev call, I checked some existing files.

For GWTC-3 and GWTC-4 files it seems to work:

>>> import bilby
>>> bilby.gw.result.CBCResult.from_json("/home/pe.o3/o3a_final/run_directories/S190706a/ProdF9/result/ProdF9_data0_1246487219-346191_analysis_H1L1V1_dynesty_merge_result.json")
<bilby.gw.result.CompactBinaryCoalescenceResult object at 0x7fb556a93ed0>
>>> bilby.gw.result.CBCResult.from_hdf5("/home/pe.o4/GWTC4/working/S230608as/bilby-IMRPhenomXPHM-SpinTaylor/final_result/bilby-IMRPhenomXPHM-SpinTaylor_data0_1370292665-161092_analysis_H1L1_merge_result.hdf5")
<bilby.gw.result.CompactBinaryCoalescenceResult object at 0x7fb5324a7f10>

However, for a bilby GWTC-1 file, it's current breaking but works with 2.4.0. I'll see if I can figure out why and fix it.

@mj-will
Copy link
Collaborator Author

mj-will commented Jan 9, 2025

Following from the discussion on the last dev call, I checked some existing files.

For GWTC-3 and GWTC-4 files it seems to work:

>>> import bilby
>>> bilby.gw.result.CBCResult.from_json("/home/pe.o3/o3a_final/run_directories/S190706a/ProdF9/result/ProdF9_data0_1246487219-346191_analysis_H1L1V1_dynesty_merge_result.json")
<bilby.gw.result.CompactBinaryCoalescenceResult object at 0x7fb556a93ed0>
>>> bilby.gw.result.CBCResult.from_hdf5("/home/pe.o4/GWTC4/working/S230608as/bilby-IMRPhenomXPHM-SpinTaylor/final_result/bilby-IMRPhenomXPHM-SpinTaylor_data0_1370292665-161092_analysis_H1L1_merge_result.hdf5")
<bilby.gw.result.CompactBinaryCoalescenceResult object at 0x7fb5324a7f10>

However, for a bilby GWTC-1 file, it's current breaking but works with 2.4.0. I'll see if I can figure out why and fix it.

I've fixed this is my last commit. The issue was that the __cosmology__ entry had already been deleted before falling back to the legacy decoding causing the del there to fail.

@mj-will mj-will added this to the 2.5.0 milestone Jan 9, 2025
@mj-will mj-will force-pushed the add-cosmology-to-result branch from 26efc7e to b676477 Compare January 10, 2025 09:20
@mj-will mj-will requested a review from GregoryAshton January 10, 2025 11:20
@GregoryAshton
Copy link
Collaborator

@mj-will overall this looks like a good. But, it does add a lot of overhead and tying into astropy. Is there any risk that we will end up unable to read/write results because of an astropy version change? E.g., if an astropy cosmology gets added/changed is this robust?

@mj-will
Copy link
Collaborator Author

mj-will commented Jan 13, 2025

@mj-will overall this looks like a good. But, it does add a lot of overhead and tying into astropy. Is there any risk that we will end up unable to read/write results because of an astropy version change? E.g., if an astropy cosmology gets added/changed is this robust?

@GregoryAshton I agree this does increase our dependence on astropy but, to me, this makes us more robust to astropy changes since we're using their own API to read/write objects rather than relying on introspection. The to_format/from_format methods have been around since v5 so I don't expect they will have large changes without a fair bit of warning. We could however implement a fallback for if the encoding fails, though I'm not quite sure what the best option would be (I'm not sure what exception one could catch). Do you have any thoughts?

As for overhead, other than the initial import, I'd hope most of the operations are pretty lightweight. Is there a particular call/operation you know to be slow?

@GregoryAshton
Copy link
Collaborator

@GregoryAshton I agree this does increase our dependence on astropy but, to me, this makes us more robust to astropy changes since we're using their own API to read/write objects rather than relying on introspection. The to_format/from_format methods have been around since v5 so I don't expect they will have large changes without a fair bit of warning. We could however implement a fallback for if the encoding fails, though I'm not quite sure what the best option would be (I'm not sure what exception one could catch). Do you have any thoughts?

Okay - in that case it sounds like you have thought through the issues. For the fallback, I guess one could just store the name. However, usually that is a bad idea as we then loose information silently, so I am happy to agree we don't need a fallback.

As for overhead, other than the initial import, I'd hope most of the operations are pretty lightweight. Is there a particular call/operation you know to be slow?

Nothing in particular, just curious.

@mj-will
Copy link
Collaborator Author

mj-will commented Jan 22, 2025

Thanks, @GregoryAshton, given that are you happy to approve this?

@ColmTalbot ColmTalbot force-pushed the add-cosmology-to-result branch from b676477 to 1a274bb Compare January 23, 2025 16:17
@mj-will mj-will merged commit e7d2095 into bilby-dev:main Jan 23, 2025
12 checks passed
@mj-will mj-will deleted the add-cosmology-to-result branch January 23, 2025 17:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants