Synonym Sync: MONDO:GENERATED edge cases #745
Comments
@joeflack4 the pipeline should not be generating a synonym "gvhd -".
@twhetzel I'll add this to our next Thursday agenda. But if we have time, maybe we can talk about it at the tech call. IDK if Nico will be here, but he designed
We have several options:
The original synonym is: GVHD - [graft-versus-host disease]
We need to remove patterns of
Note that currently the regex for this matches on strings ending with this pattern.
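To make the failure concrete, here is a minimal Python sketch of how stripping a trailing bracketed segment and then lowercasing the remainder would produce "gvhd -". It only approximates the behaviour described in this thread; the real logic lives in a SPARQL update, and the regex and function name below are assumptions for illustration.

```python
import re

# Hypothetical stand-in for the pattern under discussion: a bracketed
# segment at the very end of the string, plus any whitespace before it.
TRAILING_BRACKETS = re.compile(r"\s*\[[^\]]*\]$")


def strip_trailing_brackets(label: str) -> str:
    """Remove a trailing "[...]" segment and leave the rest untouched."""
    return TRAILING_BRACKETS.sub("", label)


original = "GVHD - [graft-versus-host disease]"
stripped = strip_trailing_brackets(original)
print(repr(stripped))          # 'GVHD -'
print(repr(stripped.lower()))  # 'gvhd -'
```

The dangling " -" survives because the pattern only covers the bracketed part, not the separator in front of it.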
The query needs to stay because it does fix a value for one of the external ontologies that is processed. As is, it's causing problems for icd11.foundation.
I would probably try and improve the query a bit.
@matentzn given the synonyms for Graft-versus-host disease which are:
are you suggesting that any synonym that contains " - " should be split and treated as two synonyms? As compared to modifying or using a different SPARQL query than
I have not really thought this through; I am thinking of it from an NLP perspective more than anything else. I would want to see the following synonyms
I don't so much see the purpose for
Where both the acronym and the spelt out name are combined. That said, however, there is value in keeping the synonyms exactly as the source has them, so there are trade-offs. The most comprehensive would be to add these as synonyms:
Just suggestions. If you feel this is too much right now, just drop the query from ICD11 processing?
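One way to picture the "most comprehensive" option is a small Python sketch that splits a combined label into its short form and its spelled-out form while also keeping the source string. This is an illustration only: it assumes the source synonym has the shape "SHORT - [long form]", and the regex and function name are hypothetical, not part of the pipeline.

```python
import re

# Assumed shape: "<short form> - [<long form>]", e.g.
# "GVHD - [graft-versus-host disease]".
SHORT_WITH_EXPANSION = re.compile(
    r"^(?P<short>[^\[\]]+?)\s+-\s+\[(?P<long>[^\[\]]+)\]$"
)


def candidate_synonyms(label: str) -> list[str]:
    """Return the short form, the long form, and the original label.

    Labels that do not match the assumed shape are returned unchanged
    as a single-item list.
    """
    match = SHORT_WITH_EXPANSION.match(label.strip())
    if match is None:
        return [label]
    return [match.group("short"), match.group("long"), label]


print(candidate_synonyms("GVHD - [graft-versus-host disease]"))
```

Keeping the original label in the output reflects the trade-off mentioned above: the source form stays available even when the split forms are added.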
Yes, as a first pass to make sure the "-added" synonyms from the Synonym Sync pipeline can be included in the February Mondo release I would like to keep this simple and if removing the
For a future Mondo release (March?), if we want to explore whether another query would provide a more meaningful and comprehensive list of synonyms for ICD11 and/or any other ontology, that also sounds fine. That will take a little more coding work and more curation review, so I would rather not push for this to happen this week. I do want to make sure that the "-added" synonym content is ready overall to be added into Mondo for the February release.
I like the solution of for now just removing
There are probably other patterns to consider too, not just
After reviewing the "ADDED" synonyms file, there were some ICD11 formatting issues where text like
The full build from PR #756 is #757. If we agree the build looks good, then #756 can be merged into #749.
Examples of changed synonym labels after processing ICD11 with the query:
Overview & background
Trish discovered an -added synonym "gvhd -" that did not appear (icd11foundation:437372167) in the source.
That's because this is a MONDO:GENERATED synonym. But the robot templates don't include that information in the synonym_type column because we had decided we did not want to import MONDO:GENERATED synonymType annotations.
There's also the issue that the generated "gvhd -" does not appear quite as we would hope. There are two issues with it:
- fix-labels-with-brackets.ru doesn't realize this, so it lowercases it.

Sub-tasks
From Trish - The first sub-task is to understand why this is happening, what the synonym should look like, and the various options that can be used to fix the issue. There are many steps along the way of processing ICD11 where this could be resolved. I don't see either of these as the way to go at this point.
- [ ] 1. Consider adding an additional, curator-only column is_mondo_generated
  - Alternatively we could just import MONDO:GENERATED as a synonym type, but we decided not to before.
- [ ] 2. Decide whether curation is needed for MONDO:GENERATED synonyms before we import them
  - If not, decide whether or not to do anything programmatically to prevent these kinds of garbled synonyms from appearing.
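If sub-task 2 ends up going the programmatic route, one possible shape is a small check that flags suspicious generated synonyms for curator review before import. The Python sketch below is only an illustration under assumed heuristics (empty text, a dangling trailing separator, unbalanced brackets); the function name and the rules are hypothetical and would need tuning against real Synonym Sync output.

```python
import re

# A separator left hanging at the end of the string, e.g. "gvhd -".
DANGLING_SEPARATOR = re.compile(r"\s[-:;,/]$")


def looks_garbled(synonym: str) -> bool:
    """Heuristically flag synonyms that look like leftovers of a bad rewrite."""
    text = synonym.strip()
    if not text:
        return True
    if DANGLING_SEPARATOR.search(text):
        return True
    # Unbalanced brackets suggest that part of the label was cut off.
    for opener, closer in ("[]", "()"):
        if text.count(opener) != text.count(closer):
            return True
    return False


for candidate in ["gvhd -", "graft-versus-host disease", "GVHD - [graft"]:
    print(candidate, looks_garbled(candidate))
```

A check like this would not fix anything by itself; it would only narrow down which MONDO:GENERATED synonyms a curator needs to look at.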