Create Branch review table #6648
…iative/mondo into branch-review-table
I commented on some, but not all, of the code, as it is hard to read now that the affected entity variable has been introduced.
Can you rename all the variables to say affected_entity instead of obsoletion candidate where it makes sense? Also, instead of "filtered", say how the list is filtered, like "non_obsolete_parents". I will check again tomorrow.
src/scripts/branch_review.py
Outdated
if len(affected_mondo_ids) > 0:
    for obsoletion_candidate in affected_mondo_ids:
        parent_of_obsolete_candidate = get_parent_from_relations(
- parent_of_obsolete_candidate = get_parent_from_relations(
+ parents_of_affected = get_parent_from_relations(
src/scripts/branch_review.py
Outdated
    for x in parent_of_obsolete_candidate
    if x not in affected_mondo_ids
]
filtered_parent_of_obsolete_candidate = [
Same variable name as above?
src/scripts/branch_review.py
Outdated
filtered_parent_of_obsolete_candidate = [
    x
    for x in parent_of_obsolete_candidate
    if x not in affected_mondo_ids
I don't think this filter is needed
This is needed for lines below. I think my understanding is faulty here and may need clarification.
src/scripts/branch_review.py
Outdated
filtered_ancestor_of_obsolete_candidate = [
    x
    for x in ancestors_of_obsolete_candidate
    if x not in affected_mondo_ids and x.startswith("MONDO:")
Also not in obsoletion candidates
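To make the requested filter concrete, here is a minimal sketch of what the comprehension could look like once the obsoletion candidates are excluded as well. The function name, the `obsoletion_candidates` argument, and the example IDs are hypothetical and are not taken from `branch_review.py`.

```python
def non_obsolete_mondo_ancestors(ancestors, affected_mondo_ids, obsoletion_candidates):
    """Keep MONDO ancestors that are neither affected nor obsoletion candidates.

    All three arguments are assumed to be iterables of CURIE strings.
    """
    # Build the exclusion set once so each membership check is O(1).
    excluded = set(affected_mondo_ids) | set(obsoletion_candidates)
    return [
        x
        for x in ancestors
        if x.startswith("MONDO:") and x not in excluded
    ]


# Illustrative IDs only.
ancestors = ["MONDO:0000001", "HP:0000118", "MONDO:0000002", "MONDO:0000003"]
kept = non_obsolete_mondo_ancestors(
    ancestors,
    affected_mondo_ids=["MONDO:0000002"],
    obsoletion_candidates=["MONDO:0000003"],
)
print(kept)
```

Naming the result after what survives the filter, rather than calling it "filtered", follows the renaming request earlier in this review.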
src/scripts/branch_review.py
Outdated
parents_inside_branch = [
    i
    for i in parents_inside_branch
    if i not in filtered_parent_of_obsolete_candidate
Filter not needed
src/scripts/branch_review.py
Outdated
parents_outside_branch = [
    i
    for i in parents_outside_branch
    if i not in filtered_parent_of_obsolete_candidate
Not needed
I committed just refactors to variable names and some
What is needed:
Reviewing [this spreadsheet]:
Note that in the first example I looked at, the obsoletion candidate MONDO:0008347 was already obsoleted. I am assuming this is not going to pose any problem, since we will run this after each obsoletion round, but I still want to mention it to make sure.
@sabrinatoro in the second bullet point above:
the term
src/scripts/branch_review.py
Outdated
# Column J: Affected Status = status
status = ""
if len(other_parents_in_branch) > 0 and all(
- if len(other_parents_in_branch) > 0 and all(
+ if len(other_parents_in_branch) > 0 and any(
This seems to be the last issue in the table!
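For anyone reading along, here is a small sketch of why `any` rather than `all` matters for this check. The `" - TO_BE_OBSOLETED"` suffix is taken from the diff; the helper names and the example parent labels are hypothetical.

```python
TO_BE_OBSOLETED = " - TO_BE_OBSOLETED"


def has_surviving_parent(other_parents_in_branch):
    """True if at least one other in-branch parent is not marked for obsoletion."""
    return len(other_parents_in_branch) > 0 and any(
        TO_BE_OBSOLETED not in parent for parent in other_parents_in_branch
    )


def all_parents_survive(other_parents_in_branch):
    """The stricter variant: true only if no in-branch parent is marked."""
    return len(other_parents_in_branch) > 0 and all(
        TO_BE_OBSOLETED not in parent for parent in other_parents_in_branch
    )


# One parent is going away, the other stays (illustrative labels).
mixed = ["MONDO:0000001 - TO_BE_OBSOLETED", "MONDO:0000002"]
print(has_surviving_parent(mixed))  # a single surviving parent is enough
print(all_parents_survive(mixed))   # the stricter check rejects this case
```

With `all`, a term that keeps one valid parent in the branch but loses another would not be reported as staying in the branch, which appears to be the case the suggestion addresses.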
Looks very nice!
src/scripts/branch_review.py
Outdated
if len(other_parents_in_branch) > 0 and any(
    " - TO_BE_OBSOLETED" not in parent for parent in other_parents_in_branch
):
    status = STAYS_IN_THE_BRANCH
@sabrinatoro in my opinion, this status should be renamed to: "Retains parent in branch" for clarity
We probably also want a code that expresses whether the class retains an ancestor in the branch, and to check that right after this one, so maybe this is a better naming scheme:
- Retains parent in branch (currently STAYS_IN_THE_BRANCH)
- Retains ancestor in branch (currently missing)
- Leaves branch but retains parent in other branch (currently LEAVES_THE_BRANCH)
- Orphaned (leave as is)
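A rough sketch of how the proposed scheme could be checked in order, assuming each input list already contains only terms that survive the obsoletion round. The enum, the function name, and its parameters are hypothetical and do not exist in `branch_review.py`.

```python
from enum import Enum


class AffectedStatus(Enum):
    RETAINS_PARENT_IN_BRANCH = "Retains parent in branch"
    RETAINS_ANCESTOR_IN_BRANCH = "Retains ancestor in branch"
    LEAVES_BRANCH = "Leaves branch but retains parent in other branch"
    ORPHANED = "Orphaned"


def classify_affected(parents_in_branch, ancestors_in_branch, parents_outside_branch):
    """Return the first status that applies, from most to least specific.

    Each argument is assumed to list only non-obsolete terms.
    """
    if parents_in_branch:
        return AffectedStatus.RETAINS_PARENT_IN_BRANCH
    if ancestors_in_branch:
        return AffectedStatus.RETAINS_ANCESTOR_IN_BRANCH
    if parents_outside_branch:
        return AffectedStatus.LEAVES_BRANCH
    return AffectedStatus.ORPHANED


# Illustrative IDs only: no direct parent left, but an ancestor remains.
status = classify_affected([], ["MONDO:0000001"], ["MONDO:0000004"])
print(status.value)
```

Checking the ancestor case right after the parent case matches the order suggested above, so a term is only reported as leaving the branch when nothing above it in the branch survives.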
@hrshdhgd and I reviewed spreadsheets created from different files (including relaxed, inferred, and relaxed/inferred). Here is what we found out:
Solution (that we are testing) is to create 3 spreadsheets:
Note: the "affected status" (orphan, leaves branch, stays in branch) should be calculated based on the information in the "combined" spreadsheet.
@cmungall Here is where SPARQL really suxx. If you can provide me a SQL query to deal with this specific issue, I will start switching to OAK for table reports.
UPDATE:
Using oaklib:
OR
Pseudo-code