Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement STAC and OGC API compatibility improvements #905

Merged
merged 51 commits into from
Dec 7, 2023

Conversation

pont-us
Copy link
Member

@pont-us pont-us commented Oct 23, 2023

Closes #904 .

  • Expand the STAC catalogue coverage links to provide links for multiple formats.
  • For TIFF and PNG OGC coverages, apply the squeeze method to remove any singleton dimensions.
  • Add support for the scale-factor parameter to the main coverage endpoint.
  • Improve handling of bbox parameters: they are now applied to the axes in the order that they appear in the dataset, as required in the specification. Inverted y-axes are also handled.
  • Handle half-open datetime intervals.
  • More parameter checking: various invalid parameters will now result in a 400 response rather than a server-side error.
  • Produce a 501 response for parameters which are not yet implemented.
  • Add media types to coverage links.
  • Return a 400 Bad Request response for attempts at subsetting on non-existent axes.
  • Handle quotation marks in coverages subset parameter, if present.
  • Use crs_wkt when determining CRS, if present and needed.
  • Give a 400 Bad Request response on requests for non-existent properties (i.e. variables).
  • Change default subsetting and bbox CRS from EPSG:4326 to OGC:CRS84.
  • Implement reprojection for bbox.
  • Run datetime parameter values through ensure_time_index_compatible function to fix timezone-aware / -naive incompatibilities
  • If writing an image from a Dataset with a single variable, write the relevant DataArray rather than the whole Dataset (avoids errors due to additional "bnds" dimensions).
  • Return the appropriate HTTP responses for requests which result in no data or too much data.
  • Add crs and crs_storage properties to STAC data.
  • Reimplement subsetting to conform more closely to the standard and to separate request parsing more cleanly from data operations.
  • Set the Content-Bbox and Content-Crs headers when returning a coverage.
  • Support safe CURIE syntax for CRS specification.

Checklist:

  • Add unit tests and/or doctests in docstrings
  • Add docstrings and API docs for any new/modified user-facing classes and functions
  • New/modified features documented in docs/source/*
  • Changes documented in CHANGES.md
  • GitHub CI passes
  • AppVeyor CI passes
  • Test coverage remains or increases (target 100%)

- Expand the STAC catalogue coverage links to provide links for
  multiple formats.

- For TIFF and PNG OGC coverages, apply the squeeze method to remove
  any singleton dimensions.
@pont-us pont-us self-assigned this Oct 23, 2023
- Add support for the scale-factor parameter to the main coverage
  endpoint.

- Improve handling of bbox parameters: they are now applied to
  the axes in the order that they appear in the dataset, as required
  in the specification. Inverted y-axes are also handled.

- Handle half-open datetime intervals.

- More parameter checking: various invalid parameters will now
  result in a 400 response rather than a server-side error.
For the coverages endpoint, scale-axes, scale-size, subset-crs, and
bbox-crs are not yet implemented. The server now makes this explicit
by returning an HTTP 501 Not Implemented response if any of these
parameters are supplied.
@pont-us pont-us changed the title Coverages: improve STAC links; squeeze dimensions Implement STAC and OGC API compatibility improvements Oct 24, 2023
The improved bbox parameter handling for coverages had mistakenly
also retained the previous implementation, causing incorrect
behaviour. This commit removes the old implementation.
This makes it easier for clients to find a coverage in a particular
format.
If a client attempts supplies a coverage subset parameter specifying
an axis that doesn't exist in the dataset, the server now returns
a 400 bad request response rather than a server error.
- Strip quotation marks from time specifiers in coverages
  subset parameter, if present.

- Check more robustly for invalid axis names (some cases that
  previously produced server errors now produce a 400 Bad Request
  response).
Previously, a dataset's CRS was reported on the basis of the
spatial_ref attribute of any crs and spatial_ref variables in
the dataset. Now the crs_wkt attribute is also examined if
the spatial_ref attribute is absent.
- Give a 400 Bad Request response on requests for non-existent
  properties (i.e. variables).

- Change default subsetting and bbox CRS from EPSG:4326 to
  OGC:CRS84.

- Implement (as yet untested) reprojection for bbox.
Initial implementation was extremely inefficient; bbox now works better
for large datasets.
- Run datetime parameter values through ensure_time_index_compatible
  function to fix timezone-aware / -naive incompatibilities

- Handle quotations around single time values when subsetting
  (previously only done for ranges)

- If writing an image from a Dataset with a single variable, write
  the relevant DataArray rather than the whole Dataset (avoids
  errors due to additional "bnds" dimensions).
- Handle subset indices with quotation marks.

- Ensure that time-based subset specifiers have the same timezone
  awareness as the datasets they are indexing.

- Produce a more informative error for an unparseable subset specifier.
bboxes are now re-applied after transformation into the final CRS to
ensure that the crop is as small as possible.
The OGC API - Coverages implementation now allows more flexibility in
axis specification; in particular, axis identifiers from the CRS can
be used. Handling of latitude ranges has also been improved.
As a final fallback, subsetting in OGC API - Coverages support now
tries interpreting the subsetting axis name directly as the name
of a dataset dimension.
The OGC API Coverages implementation now produces a "Bad Request"
response if the client requests a coverage with zero size, or one
larger than a predefined limit.
Subsetting behaviour now conforms more closely to the specification,
performing geographical subsetting in the subsetting CRS rather than
the native CRS.
Subsetting now handles geographic subset specifiers more cleanly,
and reapplies geographic subsetting in the final CRS to ensure that
the returned area is no larger than necessary. Some coverages code
has also been refactored.
Previously, requests for too large coverages produced a general
"400 Bad Request" response. They now produce the more appropriate
"413 Content Too Large".
The main coverages endpoint now includes the required Content-Bbox
and Content-Crs headers in its response.
The get_coverage_data method and the implementing methods that
it calls have become increasingly complex and is becoming difficult
to maintain as more of the Coverages draft standard is implemented.
In part, this is because it mixes syntactic parsing of the request
parameters with the actual processing operations.

This commit adds a class CoveragesRequest which parses a coverage
HTTP request into a clean Python data structure. So far it is not
being used: a subsequent commit will update get_coverage_data et
al. to work with a pre-parsed CoveragesRequest instance rather than
values extracted directly from the HTTP request.
- Fix a bug which caused ordering of bbox elements to be interpreted
  incorrectly in some cases.

- Add/expand unit tests to improve test coverage.
The OGC API - Coverages implementation no longer tries to map
geographic subsetting axis specifiers onto corresponding dataset
dimensions; instead, it just parses their names directly.
Determination of the relevant dataset dimensions is now done after
transformation of the geographic subsetting bounding box into the
dataset's native CRS. The _crs_axis_name_to_dim_name has been
removed.
@pont-us pont-us marked this pull request as ready for review December 1, 2023 10:57
Copy link
Contributor

@TonioF TonioF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not verify that your pr solves every single bullet listed in the issue - it looks good in general, though. Still, I could dig up some things you might want to look at.

CHANGES.md Outdated Show resolved Hide resolved
CHANGES.md Outdated Show resolved Hide resolved
xcube/webapi/ows/coverages/util.py Outdated Show resolved Hide resolved
xcube/webapi/ows/coverages/util.py Outdated Show resolved Hide resolved
xcube/webapi/ows/coverages/controllers.py Outdated Show resolved Hide resolved
xcube/webapi/ows/coverages/controllers.py Show resolved Hide resolved
xcube/webapi/ows/stac/context.py Outdated Show resolved Hide resolved
test/webapi/ows/stac/test_controllers.py Outdated Show resolved Hide resolved
xcube/webapi/ows/stac/controllers.py Show resolved Hide resolved
xcube/webapi/ows/stac/controllers.py Outdated Show resolved Hide resolved
pont-us and others added 9 commits December 4, 2023 12:10
Co-authored-by: Tonio Fincke <[email protected]>
- Rename the util module to request.

- Prefix various non-public implementation methods of CoveragesRequest
  with _.
If a client makes a coverage request including the datetime parameter
when the dataset has no time dimension, the parameter will be ignored.
Previously, an error was raised.
Several minor changes requested in PR review:

- Update docstring for get_coverage_data

- Remove a commented-out line of code

- rename _correct_inverted_y_range to
  _correct_inverted_y_range_if_necessary

- Minor refactoring of_transform_bbox
@pont-us pont-us requested a review from TonioF December 4, 2023 16:33
Copy link
Contributor

@TonioF TonioF left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm good now :) .

Copy link
Contributor

@thomasstorm thomasstorm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only a question/suggestion, otherwise approved. Nice work!

xcube/webapi/ows/coverages/controllers.py Show resolved Hide resolved
Copy link
Member

@forman forman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good! Please review my comments.

test/webapi/ows/coverages/test_request.py Outdated Show resolved Hide resolved
xcube/webapi/ows/coverages/controllers.py Outdated Show resolved Hide resolved
Comment on lines 36 to 38
# TODO: decide what else to include in this list.
# Listing all 11396 CRSs currently known to pyproj seems
# impractical. Make it a configuration option, perhaps?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this really a TODO? If so, say why this is needed.

Actually I don't understand why we limit it to geographic. xcube can reproject to any CRS, see tiles generation.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've downgraded it to a "check".

@pont-us pont-us merged commit c929727 into master Dec 7, 2023
1 check passed
@pont-us pont-us deleted the pont-904-stac-ogc-fixes branch December 7, 2023 14:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

STAC and OGC API compatibility improvements
4 participants