add facets to collection search #166
Comments
If we go for cardinality, we might want to discuss with the rest of the team what the API should look like. Ideas:
"facets": [
{
"field": "TYPE",
"cardinality": 4, <==== new field that lists the total number of distinct facet values, not just those returned in the response
"counts": [
{
"name": "CHECKLIST",
"count": 53833
},
{
"name": "OCCURRENCE",
"count": 49485
}
]
}
]
Another approach:
{
"count": 1000,
"limit": 0,
"offset": 0,
"results": [],
"facets": ...,
"cardinality": {
"PUBLISHER_KEY": 1234 <==== distinct publisherKeys within the given search filter
}
}
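To make the first idea concrete, here is a minimal sketch of how a facet with a `cardinality` field could be computed. It runs over an in-memory list of records rather than the real search index, and the function name `facet_with_cardinality`, the `limit` parameter and the sample records are all hypothetical, not part of the existing API.

```python
from collections import Counter


def facet_with_cardinality(records, field, limit=10):
    """Return the top `limit` facet counts plus the total number of distinct values.

    Hypothetical helper: the real API would compute this in the search backend.
    """
    counts = Counter(r[field] for r in records if r.get(field) is not None)
    return {
        "field": field,
        # Number of distinct values in total, not just those listed in "counts".
        "cardinality": len(counts),
        "counts": [
            {"name": name, "count": n} for name, n in counts.most_common(limit)
        ],
    }


# Made-up sample data with four distinct TYPE values.
records = [
    {"TYPE": "CHECKLIST"},
    {"TYPE": "CHECKLIST"},
    {"TYPE": "CHECKLIST"},
    {"TYPE": "OCCURRENCE"},
    {"TYPE": "OCCURRENCE"},
    {"TYPE": "METADATA"},
    {"TYPE": "SAMPLING_EVENT"},
]

facet = facet_with_cardinality(records, "TYPE", limit=2)
print(facet)
```

With `limit=2` only two counts are returned, while `cardinality` still reports all four distinct values, which is what a UI would need to show something like "2 of 4".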
Specimen facets seem more difficult. E.g. counting the number of specimens per kingdom:
facets: [
{kingdomKey: 1, count: 123456} // (from 2 csv rows. one with 123000 individuals and another with 456 individuals)
]
The count is the sum of individualCount across all descriptors that have kingdomKey=1.
Presentation-wise, the UI would probably have to show caveats like this for e.g. a kingdom breakdown:
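The individualCount summation described in this comment can be sketched as follows. This is only an illustration under stated assumptions: the descriptor rows are plain dictionaries, the helper name `specimen_facet` is hypothetical, and rows without an `individualCount` are treated as contributing 0, which is itself one of the caveats a UI would need to surface.

```python
from collections import defaultdict


def specimen_facet(descriptors, field):
    """Sum individualCount per distinct value of `field` (hypothetical helper)."""
    totals = defaultdict(int)
    for row in descriptors:
        key = row.get(field)
        if key is None:
            continue  # rows without the facet field are skipped
        # Assumption: a missing individualCount counts as 0 specimens.
        totals[key] += row.get("individualCount") or 0
    # Largest totals first, as a facet response would usually be ordered.
    ordered = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [{field: key, "count": total} for key, total in ordered]


# Two rows for kingdomKey=1, as in the example above, plus one made-up row.
descriptors = [
    {"kingdomKey": 1, "individualCount": 123000},
    {"kingdomKey": 1, "individualCount": 456},
    {"kingdomKey": 6, "individualCount": 789},
]

facets = specimen_facet(descriptors, "kingdomKey")
print(facets)
```

Note that this sums whatever uploaders reported, so double counting or non-exhaustive descriptors would flow straight into the totals.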
Thanks Morten! The collection facets and proposed implementation make a lot of sense. The specimen facets are much more complicated and yes we would have to display a lot of caveats. We would also have to add some other fields to know if the people uploading records have double counted, are exhaustive, etc.
We could add facets to collection search
Two types of metrics would be possible: collection facets and specimen facets.
Specimen facets are the only thing that makes sense within a single collection.
Both make sense for institutions and GRSciColl generally, but currently there isn't any data for it.
Examples of collection facet questions:
How many collections have data in Spain
How many collections have data about taxon x
How many collections have type specimens of taxon x
Which are the most prevalent preservation types for this collection
Breakdowns across collections: how many collections per kingdom, preservation type, country, type specimens, types/country, types/kingdom
Examples of specimen facet questions:
Which orders does this collection mainly deal with
Breakdown of phyla per country for a collection/institution/total
Breakdowns for all: specimens per kingdom, preservation type, country, type specimens, types/country, types/kingdom
We could start with collection facets?
e.g.
?country=ES&country=FR&facet=kingdomKey
Same behaviour as normally.
These collection facets are what I'm guessing would be useful: descriptorCountry, country, kingdomKey, phylumKey, ...other taxonGroupKeys..., typeStatus, preservationType, contentType, personalCollection, institutionKey, active
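As a small illustration of the request shape above, the repeated `country` filter and the `facet` parameter can be built and parsed with the Python standard library. Only the query string is shown; the endpoint path is deliberately left out because the issue does not specify it.

```python
from urllib.parse import parse_qs, urlencode

# Repeated keys express multiple values for the same filter.
params = [("country", "ES"), ("country", "FR"), ("facet", "kingdomKey")]

query = urlencode(params)
print(query)

# Parsing it back groups repeated keys into lists.
parsed = parse_qs(query)
print(parsed)
```

Using a list of pairs rather than a dictionary keeps the repeated `country` parameter and the parameter order intact.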
Ideally we would add something new to the API, namely the cardinality of those facets: an option to get not only the top 10 orders, but also the number of unique orders. This makes it easier to build the UI.
Examples where cardinality is used:
https://grscicoll.hp.gbif-staging.org/specimen/search?layout=W1t7ImlkIjoiYm1tNW8iLCJwIjp7fSwidHJhbnNsYXRpb24iOiJkYXNoYm9hcmQuc3RhdGlzdGljcyIsInQiOiJvY2N1cnJlbmNlU3VtbWFyeSJ9XSxbeyJpZCI6IjE4NGhxIiwicCI6eyJ2aWV3IjoiVEFCTEUifSwidHJhbnNsYXRpb24iOiJmaWx0ZXJzLmNvbGxlY3Rpb25LZXkubmFtZSIsInQiOiJjb2xsZWN0aW9uS2V5In1dXQ%3D%3D&view=DASHBOARD
distinct species, distinct taxa in the statistics chart + number of results in the collection chart