Adding missing ABCD elements to ABCD > DwC mapping file #1114

Open
jholetschek opened this issue Jan 28, 2025 · 0 comments

jholetschek commented Jan 28, 2025

Currently, a few ABCD elements are not indexed by GBIF. Some of them map directly to Darwin Core terms; others are repeatable in ABCD and need to be concatenated for their DwC equivalents.

In the mapping file, these six lines should be added:

# ABCD elements that have a direct DwC equivalent
occurrenceRemarks=Unit/Notes
modified=Unit/DateLastEdited
preparations=Unit/KindOfUnit

# ABCD elements that would need concatenation
recordedByID=Unit/Gathering/Agents/GatheringAgent/ResourceURIs/ResourceURI
identifiedByID=Unit/Identifications/Identification/Identifiers/Identifier/ResourceURIs/ResourceURI
scientificNameID=Unit/Identifications/Identification/Result/TaxonIdentified/ScientificName/ResourceURIs/ResourceURI
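
To make the concatenation case concrete, here is a minimal sketch of how repeated ResourceURI values could be collapsed into one DwC value. The class name `RepeatableValueJoiner`, the " | " separator, the skipping of blanks and duplicates, and the example.org URIs are all assumptions made for illustration; none of this is taken from the pipelines code.

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper, not part of the GBIF pipelines code base.
final class RepeatableValueJoiner {

  // Assumed separator: a pipe padded with spaces.
  static final String DELIMITER = " | ";

  private RepeatableValueJoiner() {}

  // Joins repeated values in document order, skipping nulls, blanks and duplicates.
  static String join(List<String> values) {
    Set<String> unique = new LinkedHashSet<>();
    for (String value : values) {
      if (value != null && !value.trim().isEmpty()) {
        unique.add(value.trim());
      }
    }
    return String.join(DELIMITER, unique);
  }

  public static void main(String[] args) {
    // Made-up URIs standing in for several ResourceURI elements of one unit.
    List<String> uris = Arrays.asList(
        "https://example.org/agent/1",
        " https://example.org/agent/2 ",
        "",
        "https://example.org/agent/1");
    // Prints: https://example.org/agent/1 | https://example.org/agent/2
    System.out.println(join(uris));
  }
}
```

Whether duplicates should really be dropped is a judgment call; keeping them would preserve a one-to-one alignment with the agent names.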

@MattBlissett Should I create a PR for these additions, or would you prefer to add them yourself, together with the code they require?
We'll have test data in our herbariumd dataset soon.

PS: Here's what @timrobertson100 already found out:

It looks like the parser puts a RawOccurrenceRecordBuilder [1] onto the stack [2], which the rules [3] can then call methods on.
So, for example, you can see that with the collector names [4].

Off the top of my head (and using ChatGPT), I think the code will need to be something along the lines of the snippet below:

// Create a List on the stack to collect the IDs
digester.addObjectCreate("unit/people", ArrayList.class);

// When the parser leaves the people element, call populateRecordedByID on the Builder (a method you'll add, which converts the List to a |-delimited String)
digester.addSetNext("unit/people", "populateRecordedByID", List.class.getName());

// Add GUIDs to the list
digester.addCallMethod("unit/people/person/guid", "add", 1);
digester.addCallParam("unit/people/person/guid", 0);
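
For the builder side, a rough sketch of what the new method might look like follows. `RecordedByIdBuilderSketch` is a hypothetical stand-in for the real builder, the " | " separator is an assumption, and the `main` method only imitates by hand the order in which the three rules above would act; it does not use Digester itself.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the real builder; only the proposed method is sketched.
final class RecordedByIdBuilderSketch {

  private String recordedByID;

  // The set-next rule would pass in the list that was collected on the stack.
  // Declared public because reflective callers generally require that.
  public void populateRecordedByID(List<String> ids) {
    if (ids == null || ids.isEmpty()) {
      recordedByID = null; // nothing collected, leave the field unset
    } else {
      recordedByID = String.join(" | ", ids); // assumed separator
    }
  }

  public String getRecordedByID() {
    return recordedByID;
  }

  public static void main(String[] args) {
    RecordedByIdBuilderSketch builder = new RecordedByIdBuilderSketch();

    // Step 1 (object-create rule): a fresh list is created for the element.
    List<String> ids = new ArrayList<>();

    // Step 2 (call-method rule): each nested ID is added; the URIs are made up.
    ids.add("https://example.org/agent/1");
    ids.add("https://example.org/agent/2");

    // Step 3 (set-next rule): the list is handed to the builder.
    builder.populateRecordedByID(ids);

    // Prints: https://example.org/agent/1 | https://example.org/agent/2
    System.out.println(builder.getRecordedByID());
  }
}
```

The same shape would presumably work for identifiedByID and scientificNameID, each with its own list and its own populate method.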

[1] https://github.com/gbif/pipelines/blob/dev/sdks/tools/archives-converters/src/main/java/org/gbif/converters/parser/xml/parsing/xml/RawOccurrenceRecordBuilder.java
[2] https://github.com/gbif/pipelines/blob/dev/sdks/tools/archives-converters/src/main/java/org/gbif/converters/parser/xml/parsing/xml/XmlFragmentParser.java#L112
[3] https://github.com/gbif/pipelines/blob/dev/sdks/tools/archives-converters/src/main/java/org/gbif/converters/parser/xml/parsing/xml/rules/Abcd206RuleSet.java
[4] https://github.com/gbif/pipelines/blob/dev/sdks/tools/archives-converters/src/main/java/org/gbif/converters/parser/xml/parsing/xml/rules/Abcd206RuleSet.java#L105
