SQL downloads for multi-taxonomy are working in the dev2 environment, but the required SQL is not very user friendly.
Given the importance of taxonomic queries for GBIF, it may be worth adding syntax support that is easier to understand. The SQL downloads code does not currently do this kind of rewriting, so it would be a special case.
Example SQL
SELECT datasetKey, scientificName, decimalLatitude, decimalLongitude
FROM occurrence
WHERE
array_contains(classifications["2d59e5db-57ad-41ff-97d6-11f5fb264527"], "urn:lsid:marinespecies.org:taxname:158970")
Some proposals:
Checklist aliases (worms in this example)
SELECT datasetKey, scientificName, decimalLatitude, decimalLongitude
FROM occurrence
WHERE
array_contains(classifications["worms"], "urn:lsid:marinespecies.org:taxname:158970")
Simple to implement, but with a cost of maintaining a mapping.
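To make the mapping cost concrete, here is a minimal sketch of how an alias could be resolved before the query is executed. It is an illustration only: the `CHECKLIST_ALIASES` table and the `resolve_aliases` function are hypothetical names, the only entry is the `worms` key from the example above, and a real implementation would more likely work on the parsed query than on the raw string.

```python
import re

# Hypothetical alias table. Only the "worms" entry comes from the example in
# this issue; every new alias would have to be added and maintained here.
CHECKLIST_ALIASES = {
    "worms": "2d59e5db-57ad-41ff-97d6-11f5fb264527",
}

# Matches classifications["name"] or classifications['name'].
_SUBSCRIPT = re.compile(r"""classifications\[\s*(["'])(?P<name>[^"']+)\1\s*\]""")


def resolve_aliases(sql: str) -> str:
    """Replace known aliases in classifications[...] with the checklist key.

    Names that are not in the alias table (for example a checklist key that
    was written out in full) are left untouched.
    """

    def _swap(match: re.Match) -> str:
        quote = match.group(1)
        name = match.group("name")
        key = CHECKLIST_ALIASES.get(name.lower(), name)
        return f"classifications[{quote}{key}{quote}]"

    return _SUBSCRIPT.sub(_swap, sql)
```

Because unknown names pass through unchanged, queries that already use the full checklist key would keep working alongside aliased ones.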
Allow checklistKey to be specified separately, and then provide backend logic to construct the array_contains statement.
SELECT datasetKey, scientificName, decimalLatitude, decimalLongitude
FROM occurrence
WHERE
checklistKey = "2d59e5db-57ad-41ff-97d6-11f5fb264527" AND
taxonKey = "urn:lsid:marinespecies.org:taxname:158970"
Similar to the current behaviour with HTTP GET services for the Occurrence API. See #342
The addition of the checklistKey assumes that any taxon key queries will use that checklist for matching the taxon key.
The absence of a checklistKey would result in using the default taxonomy, which would still entail an array_contains statement being constructed in the backend. This would require some manipulation of operands.
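The backend step described here could look roughly like the following sketch, which builds the array_contains predicate from a taxon key and an optional checklist key. The function names are hypothetical, and `DEFAULT_CHECKLIST_KEY` is a placeholder value, since the key of the default taxonomy is not stated in this issue.

```python
from typing import Optional

# Placeholder only: the real key of the default checklist is not given here.
DEFAULT_CHECKLIST_KEY = "00000000-0000-0000-0000-000000000000"


def _literal(value: str) -> str:
    """Render a double-quoted string literal.

    Values containing a double quote or a backslash are rejected rather than
    escaped, which keeps this sketch simple and avoids injection.
    """
    if '"' in value or "\\" in value:
        raise ValueError(f"unsupported character in {value!r}")
    return f'"{value}"'


def taxon_predicate(taxon_key: str, checklist_key: Optional[str] = None) -> str:
    """Build the array_contains predicate for a taxon key.

    Falls back to the (assumed) default checklist when no checklistKey
    was supplied in the query.
    """
    checklist = checklist_key or DEFAULT_CHECKLIST_KEY
    return (
        f"array_contains(classifications[{_literal(checklist)}], "
        f"{_literal(taxon_key)})"
    )
```

The harder part, not shown here, is the operand manipulation itself: finding the checklistKey and taxonKey comparisons in the parsed WHERE clause and replacing them with the generated predicate.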
Custom SQL function
We could support a custom SQL function:
SELECT datasetKey, scientificName, decimalLatitude, decimalLongitude
FROM occurrence
WHERE taxonLookup('2d59e5db-57ad-41ff-97d6-11f5fb264527', 'urn:lsid:marinespecies.org:taxname:158970')
or with default taxonomy (xcol in this example)
SELECT datasetKey, scientificName, decimalLatitude, decimalLongitude
FROM occurrence
WHERE taxonLookup('58VJH')
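If the function were implemented as a rewrite rather than as a registered UDF, the expansion might look like the sketch below. This is an assumption about one possible approach, not the planned implementation: `expand_taxon_lookup` is a hypothetical name, `DEFAULT_CHECKLIST_KEY` is a placeholder for the default taxonomy's key, and the regular expression only handles single-quoted arguments without embedded quotes.

```python
import re

# Placeholder only: the real key of the default checklist is not given here.
DEFAULT_CHECKLIST_KEY = "00000000-0000-0000-0000-000000000000"

# taxonLookup('<checklistKey>', '<taxonKey>') or taxonLookup('<taxonKey>').
# Arguments must not contain single quotes.
_CALL = re.compile(r"taxonLookup\(\s*'([^']*)'\s*(?:,\s*'([^']*)'\s*)?\)")


def expand_taxon_lookup(sql: str) -> str:
    """Rewrite each taxonLookup(...) call into an array_contains predicate."""

    def _expand(match: re.Match) -> str:
        first, second = match.group(1), match.group(2)
        if second is None:
            # One argument: it is the taxon key, matched against the default checklist.
            checklist, taxon = DEFAULT_CHECKLIST_KEY, first
        else:
            # Two arguments: checklist key first, then taxon key.
            checklist, taxon = first, second
        if '"' in checklist or '"' in taxon:
            raise ValueError("double quotes are not supported in keys")
        return f'array_contains(classifications["{checklist}"], "{taxon}")'

    return _CALL.sub(_expand, sql)
```

Applied to the first query above, this would produce the same array_contains predicate as the original example SQL.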
Support with JOINS
We could populate a taxonomy table in Hive and provide support for taxonomy via joins like so:
SELECT o.datasetKey, o.scientificName, o.decimalLatitude, o.decimalLongitude...
FROM occurrence o
JOIN occurrence_taxonomy ot ON ot.occurrenceKey = o.gbifKey
JOIN taxonomy t ON ot.taxonKey = t.key
WHERE t.datasetKey = '2d59e5db-57ad-41ff-97d6-11f5fb264527'
AND t.scientificName LIKE 'Gadus%'
Similarly
WITH taxa_keys AS (
SELECT key
FROM taxonomy
WHERE datasetKey = '2d59e5db-57ad-41ff-97d6-11f5fb264527' AND scientificName LIKE 'Gadus%'
)
SELECT * FROM occurrence WHERE taxa CONTAINS (taxa_keys.key);