make queryables and summaries automatically updatable #18
base: main
Should probably start a CHANGELOG to keep track of updates and features applied. Could also start versioning properly.
discover_summaries.sql
Outdated
JOIN LATERAL jsonb_each(properties) ON TRUE
JOIN LATERAL jsonb_array_elements(
    CASE jsonb_typeof(value)
        WHEN 'array' THEN
            value
        ELSE
            jsonb_build_array(value)
    END
) AS a ON TRUE
-- see https://github.com/stac-extensions/timestamps
WHERE key NOT IN ('created', 'updated', 'published', 'expires', 'unpublished', 'datetime', 'start_datetime', 'end_datetime')
I'm curious about the result of more complex structures. How are they handled?
I'm thinking of, for example, bands/eo:bands/raster:bands (https://github.com/radiantearth/stac-spec/blob/master/commons/common-metadata.md#bands, https://github.com/stac-extensions/eo, https://github.com/stac-extensions/raster) or mlm:inputs/mlm:outputs/mlm:hyperparameters (https://github.com/stac-extensions/mlm) that have JSON arrays of nested objects.
Same question applies for queryables.
Yeah that's a good question. Right now the summaries will show them as an array of objects (in queryables it's also an array of objects under an "enum" key).
It might be nice in the future to add special handling for bands and other common object structures. The tricky part is identifying them, since there are many different property names from different STAC extensions.
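To make the current behaviour concrete, here is a rough Python sketch that mirrors the SQL fragment above (expand each property, wrap non-array values in a one-element array, skip the timestamp keys). The function name `collect_summaries`, the sample items, and the de-duplication step are assumptions made for illustration; they are not taken from the PR.

```python
# Keys excluded by the WHERE clause (see https://github.com/stac-extensions/timestamps).
EXCLUDED_KEYS = {
    "created", "updated", "published", "expires", "unpublished",
    "datetime", "start_datetime", "end_datetime",
}


def collect_summaries(items):
    """Collect the distinct values seen for each property key (hypothetical helper)."""
    summaries = {}
    for properties in items:
        for key, value in properties.items():  # like jsonb_each(properties)
            if key in EXCLUDED_KEYS:
                continue
            # Like the CASE expression: arrays are kept, scalars/objects are wrapped.
            elements = value if isinstance(value, list) else [value]
            seen = summaries.setdefault(key, [])
            for element in elements:  # like jsonb_array_elements(...)
                if element not in seen:  # assumption: values are de-duplicated
                    seen.append(element)
    return summaries


# Hypothetical item properties, including a nested "eo:bands" array of objects.
items = [
    {"platform": "sentinel-2a", "datetime": "2024-01-01T00:00:00Z",
     "eo:bands": [{"name": "B04", "common_name": "red"}]},
    {"platform": "sentinel-2b",
     "eo:bands": [{"name": "B04", "common_name": "red"},
                  {"name": "B08", "common_name": "nir"}]},
]
result = collect_summaries(items)
print(result["platform"])
print(result["eo:bands"])
```

With this input the nested band objects come through unchanged as an array of objects, which is the behaviour described in this comment.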
I think the expectation of /queryables would be that unusual or complicated structures that cannot be directly queried would not be listed. For the time being, I don't think STAC queries offer an official way to search nested objects, besides maybe some convoluted CQL2 filter?
In most cases, I believe those complicated structures are informative metadata rather than something expected to be queryable. So maybe just omitting them from /queryables would be best?
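One possible shape for that omission, as a minimal Python sketch: drop any property whose observed values include a JSON object or array. The names `is_directly_queryable` and `filter_queryables` and the sample data are hypothetical and only illustrate the idea; the PR itself does this work in Postgres functions.

```python
def is_directly_queryable(values):
    """Return True when none of the observed values is a JSON object or array."""
    return all(not isinstance(value, (dict, list)) for value in values)


def filter_queryables(discovered):
    """Keep only properties whose values are all scalars (hypothetical helper)."""
    return {
        name: values
        for name, values in discovered.items()
        if is_directly_queryable(values)
    }


# Hypothetical discovered values, keyed by property name.
discovered = {
    "platform": ["sentinel-2a", "sentinel-2b"],
    "eo:cloud_cover": [3.5, 12.0],
    "eo:bands": [{"name": "B04", "common_name": "red"}],
}
advertised = filter_queryables(discovered)
print(sorted(advertised))
```

Here `eo:bands` is left out of the advertised queryables while the scalar properties remain.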
This code tries to replicate the behaviour we had before, but if I have time I can try to handle this in a nicer way.
Regarding "beside maybe some convoluted CQL2 filter": this app implements the Filter extension, so it supports CQL2 syntax.
I had to double-check to be sure: https://github.com/stac-api-extensions/filter#queryables
I had a memory that this was still in a "not supported" state.
See emphasized text parts:
Queryables can also be used to advertise "synthesized" property values. The only requirement in CQL2 is that the property have a type and evaluate to literal value of that type or NULL. For example, a filter like "Items must have an Asset with an eo:band with the common_name of 'nir'" can be expressed. A Queryable assets_bands could be defined to have a type of array of string and have the semantics that it contains all of common_name values across all assets and bands for an Item. This could then be filtered with the CQL2 expression 'nir' in assets_bands. Implementations would then expand this expression into the appropriate query against its datastore. (TBD if this will actually work or not. This is also related to the upcoming restriction on property/literal comparisons)
An implementation may also choose not to advertise any queryables, and provide the user with out-of-band information or simply let them try querying against fields. While this is not allowed according to the OGC CQL2 Queryable spec, it is allowed in STAC API by the Filter Extension.
Somewhat still free for all.
To be compliant, I guess /queryables entries that are not explicitly handled (objects/arrays) would have to be omitted, especially if they are not actually processed by the code in any special way. Advertising them is counter-productive for servers that rely on it.
The filter can still allow users to try some complex JSON-path syntax within a CQL2 expression, but /queryables doesn't advertise any "guarantee". The best option really would be for those edge cases to have custom properties to facilitate filtering.
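As a sketch of what such a custom property could look like, the following Python mirrors the "synthesized" assets_bands queryable from the Filter extension text quoted above: it gathers every common_name across all assets and bands of an item, so that a check like 'nir' in assets_bands becomes a simple membership test. The function, the sample item, and the choice to read bands from an `eo:bands` key on each asset are assumptions for illustration only.

```python
def assets_bands(item):
    """Collect the distinct band common_name values across all assets of an item."""
    names = []
    for asset in item.get("assets", {}).values():
        # Assumption: bands are stored under "eo:bands" on each asset.
        for band in asset.get("eo:bands", []):
            common_name = band.get("common_name")
            if common_name is not None and common_name not in names:
                names.append(common_name)
    return names


# Hypothetical item with two band assets and one asset without bands.
item = {
    "assets": {
        "B04": {"eo:bands": [{"name": "B04", "common_name": "red"}]},
        "B08": {"eo:bands": [{"name": "B08", "common_name": "nir"}]},
        "thumbnail": {"href": "thumbnail.png"},
    }
}
bands = assets_bands(item)
matches = "nir" in bands  # analogous to the CQL2 expression 'nir' in assets_bands
print(bands, matches)
```

A flat array of strings like this could be advertised in /queryables with a simple type, unlike the nested band objects.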
I defer to Francis's review.
Previously this app implemented a custom /queryables endpoint that crawled the database to display information about the items stored in the database. This method has some limitations:

This PR improves on this method by introducing postgres functions to collect the same queryables information from the database and store it in the queryables table. This caches the queryables information and allows the default /queryables endpoint function to get the same information quickly for a single collection or for all collections.

A similar strategy is also implemented here to ensure that the collection summaries and extents are kept up to date.
See the updated README for more detailed information.
To test this in birdhouse:
- Build the image: docker build -t tmp-stac-all:local-test .
- In birdhouse/components/stac/default.env, update: export STAC_IMAGE='tmp-stac-all:local-test'
- Check the /queryables endpoints to see what is there by default
- Check the extent and summaries sections from the /collections endpoint to see what is there by default
- Send PATCH /queryables and PATCH /summaries requests (you may need admin permissions since magpie permissions are set to deny non-GET requests)

Hint: to send PATCH requests to stac with the admin cookies: