Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes preserving coordinates in regrid2 #716

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

jasonb5
Copy link
Collaborator

@jasonb5 jasonb5 commented Nov 21, 2024

Description

When regrid2 constructs the output dataset it did not preserve any coords from the input dataset or the output grid. This PR fixes the issue by populating coords for every dimension from the appropriate source (input dataset, output grid).

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • My changes generate no new warnings
  • Any dependent changes have been merged and published in downstream modules

If applicable:

  • I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass with my changes (locally and CI/CD build)
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have noted that this is a breaking change for a major release (fix or feature that would cause existing functionality to not work as expected)

@github-actions github-actions bot added the type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors. label Nov 21, 2024
@jasonb5 jasonb5 requested a review from lee1043 November 21, 2024 04:59
Copy link

codecov bot commented Nov 21, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 100.00%. Comparing base (27396e5) to head (e149b9d).

Additional details and impacted files
@@            Coverage Diff            @@
##              main      #716   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files           15        15           
  Lines         1609      1621   +12     
=========================================
+ Hits          1609      1621   +12     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Copy link
Collaborator

@lee1043 lee1043 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the PR. I confirmed that the minimal code in #709 (comment) returns results as expected, with coordinates properly preserved.

@tomvothecoder
Copy link
Collaborator

I'll carry this forward while @jasonb5 is out.

@tomvothecoder tomvothecoder force-pushed the fixes-perserving-coords branch from 2ab5cc4 to e067a6c Compare December 4, 2024 23:06
- Fix `_get_bounds_ensure_dtype` to determine `bounds` with axis that has `standard_name` attr (in addition to `axis` attr check)
- Remove unused `dst_lat_bnds` and `dst_lon_bnds` args for `_build_dataset()`
- Add unit test to cover `ValueError` in `regrid2.py` `_build_dataset()`
Copy link
Collaborator

@tomvothecoder tomvothecoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @jasonb5, I refactored your logic to handle all axes that are map-able via xCDAT (using CF "axis" attr, "standard_name" attr, and valid dim name). Before it could only map to axes that have the "axis" attr set (via ds.cf.axes).

The GH Actions build still passes.

@lee1043 let's wait for Jason to review my changes when he's back.

Comment on lines 237 to 252
for dim in input_data_var.dims:
dim = str(dim)
Returns
-------
xr.Dataset
A new dataset containing the regridded data variable with updated
coordinates and attributes.
"""
dv_input = ds[data_var]

try:
axis_name = [
cf_axis for cf_axis, dims in ds.cf.axes.items() if dim in dims
][0]
except IndexError as e:
raise ValueError(
f"Could not determine axis name for dimension {dim}"
) from e

if axis_name in ["X", "Y"]:
output_coords[dim] = output_grid.cf[axis_name]
else:
output_coords[dim] = input_data_var.cf[axis_name]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your logic looped over input data variable dims and checks if the dim name is in ds.cf.axes (must have "axis" attr set).

Comment on lines +298 to +315
output_coords: Dict[str, xr.DataArray] = {}

# First get the X and Y axes from the output grid.
for key in ["X", "Y"]:
input_coord = xc.get_dim_coords(dv_input, key) # type: ignore
output_coord = xc.get_dim_coords(output_grid, key) # type: ignore

output_coords[str(input_coord.name)] = output_coord # type: ignore

# Get the remaining axes the input data variable (e.g., "time").
for dim in dv_input.dims:
if dim not in output_coords:
output_coords[str(dim)] = dv_input[dim]

# Sort the coords to align with the input data variable dims.
output_coords = {str(dim): output_coords[str(dim)] for dim in dv_input.dims}

return output_coords
Copy link
Collaborator

@tomvothecoder tomvothecoder Dec 6, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My logic gets the X and Y axes from the output_grid via xc.get_dim_coords(). This function can map to axes via "axis" attr, "standard_name" attr, and accepted dim names (e.g., lat, lon).

For remaining axes, it just gets them directly from the input data variable (dv_input) like in your logic.

Comment on lines 571 to 617
for key in cf_keys:
try:
name = ds.cf.bounds[key][0]
except (KeyError, IndexError):
pass
else:
bounds = ds[name]
try:
bounds = ds.bounds.get_bounds(axis)
except KeyError:
pass

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I simplified this logic by using ds.bounds.get_bounds() instead of ds.cf.bounds.

@tomvothecoder
Copy link
Collaborator

Hey @jasonb5, just pinging you again for review when you have time this week. I am hoping to get this fix in for v0.8.0, which I'm aiming for within the next few weeks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
type: bug Inconsistencies or issues which will cause an issue or problem for users or implementors.
Projects
Status: In Progress
3 participants