datastreams: affiliations: add EDMO organizations #435
Reviewed with ❤️ by Nico and Yash
d077b16 to 506aa82
sparql.setQuery(self._query)
sparql.setReturnFormat(JSON)
what in the Java(Script) 👀
setup.cfg
Outdated
# Upper pinning pycountry due to commonmeta-py
pycountry>=22.3.5,<23.0.0
PyYAML>=5.4.1
regex>=2024.7.24
rdflib>=7.0.0
SPARQLWrapper>=2.0.0
minor: we should make these dependencies extras, since they might complicate the dependency graph and cause trouble downstream.
LGTM, we only need to add the new dependencies to extras (I can have a go at it, since it'll require some imports mangling)
506aa82 to fe2883c
* Moves `rdflib` and `SPARQLWrapper` to extras.
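
A minimal sketch of how an optional dependency can be guarded once it lives in extras. This is not the code from the PR: the helper name `require_extra`, the extra name `edmo`, and the `<package>` placeholder in the message are all assumptions made for illustration.

```python
import importlib


def require_extra(module_name, extra="edmo"):
    """Import an optional dependency, or explain how to install it.

    Both this helper and the default extra name are hypothetical.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        # Fail at use time with an actionable message instead of
        # failing at import time for every user of the package.
        raise RuntimeError(
            f"'{module_name}' is needed for this datastream; "
            f"install the optional extra, e.g. pip install '<package>[{extra}]'."
        ) from exc
```

A reader or transformer would call `require_extra("SPARQLWrapper")` inside the method that needs it, so installations without the extra can still import the rest of the package.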
f761656 to 595c4f8
❤️ Thank you for your contribution!
Fixes #433
Description
This pull request provides a new datastream to import affiliations from the European Directory of Marine Organisations (EDMO).
There is no official mapping between EDMO and ROR, so for now we are importing them separately with a different main ID scheme.
There is no downloadable RDF file containing all the data, so we need to call a SPARQL endpoint in order to import the data.
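
As a rough illustration of reading from a SPARQL endpoint, the sketch below uses the same `SPARQLWrapper` calls quoted in the review (`setQuery`, `setReturnFormat(JSON)`). The function names `fetch_bindings` and `flatten_binding` are hypothetical, and the query is passed in as a parameter because the actual EDMO query is not shown here.

```python
def fetch_bindings(endpoint, query):
    """Run a SPARQL SELECT query and return the raw JSON bindings."""
    # Imported lazily so this module still loads when the optional
    # SPARQLWrapper dependency is not installed.
    from SPARQLWrapper import JSON, SPARQLWrapper

    sparql = SPARQLWrapper(endpoint)
    sparql.setQuery(query)
    sparql.setReturnFormat(JSON)
    # SPARQL JSON results keep one dict per row under results.bindings.
    return sparql.query().convert()["results"]["bindings"]


def flatten_binding(binding):
    """Reduce {'var': {'type': ..., 'value': ...}} to {'var': value}."""
    return {var: cell["value"] for var, cell in binding.items()}
```

Keeping `flatten_binding` separate from the network call means the row handling can be tested without contacting the endpoint.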
The data can be imported with the following command:
$ invenio vocabularies import --vocabulary affiliations:edmo --origin https://edmo.seadatanet.org/sparql/sparql
EDMOOrganizationTransformer: ["No alpha_2 country found for: {'org': {'type': 'uri', 'value': 'https://edmo.seadatanet.org/report/1051'}, 'name': {'type': 'literal', 'value': 'UNKNOWN'}, 'countryName': {'type': 'literal', 'value': 'Unknown'}, 'locality': {'type': 'literal', 'value': 'UNKNOWN'}, 'deprecated': {'type': 'literal', 'datatype': 'http://www.w3.org/2001/XMLSchema#boolean', 'value': 'false'}}"]
Vocabulary affiliations:edmo imported. Total items 5656.
5655 items succeeded
1 contained errors
0 were filtered.
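
The single failing item is the placeholder organization whose country name is "Unknown". Here is a sketch of how a transformer could produce that error, assuming a row already flattened to plain strings. The function name, the output record layout, and the `edmo:` ID prefix are hypothetical and not taken from the PR; only the error message text comes from the output above.

```python
def transform_org(row, alpha_2_for):
    """Turn a flattened EDMO row into an affiliation-like dict.

    `alpha_2_for` is any callable that maps a country name to an
    ISO 3166-1 alpha-2 code and returns None for unknown names.
    """
    code = alpha_2_for(row["countryName"])
    if code is None:
        # Same message shape as in the import output.
        raise ValueError(f"No alpha_2 country found for: {row}")
    # Use the last path segment of the report URI as the EDMO number.
    edmo_id = row["org"].rstrip("/").rsplit("/", 1)[-1]
    return {
        "id": f"edmo:{edmo_id}",  # hypothetical ID layout
        "name": row["name"],
        "country": code,
        "identifiers": [{"scheme": "edmo", "identifier": edmo_id}],
    }
```

With `pycountry` installed, a lookup such as `lambda n: getattr(pycountry.countries.get(name=n), "alpha_2", None)` could be passed as `alpha_2_for`. Injecting the lookup keeps the sketch independent of any particular country database.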
Affiliations organizations autocomplete showing ROR and EDMO organizations side by side:
Checklist
Ticks in all boxes and 🟢 on all GitHub actions status checks are required to merge:
Frontend
Reminder
By using GitHub, you have already agreed to the GitHub’s Terms of Service including that: