Include details of rejected/aborted/timed out annotations in the export #399

Merged
merged 3 commits into from
Feb 26, 2024


ianroberts
Member

Add a section to the export formats detailing which users (if any) have rejected, aborted or timed out annotations on each document.

For JSON this is a dict in which each property value is a JSON list of ID numbers (for anonymous annotators) or usernames (for non-anonymous annotators) identifying the relevant annotators:

{
    "rejected_by": ["ian", "twin"],
    "timed_out": ["david"],
    "aborted": []
}

In "raw" mode this dict is added as "teamware_status" to the top-level doc_dict; in "gate" mode the "teamware_status" property is added under "features" instead.
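
As a rough illustration of the two placements, here is a minimal Python sketch. The function name `attach_teamware_status` and the sample documents are hypothetical and are not taken from the Teamware code; only the "teamware_status" and "features" keys come from the description above.

```python
def attach_teamware_status(doc_dict, status, json_format="raw"):
    """Return a new dict with the status attached in the place each format expects.

    Hypothetical helper for illustration only, not the actual Teamware code.
    """
    result = dict(doc_dict)  # shallow copy so the caller's dict is not changed
    if json_format == "gate":
        # "gate" mode: the status lives under the "features" section
        features = dict(result.get("features", {}))
        features["teamware_status"] = status
        result["features"] = features
    else:
        # "raw" mode: the status is a top-level property
        result["teamware_status"] = status
    return result


status = {"rejected_by": ["ian", "twin"], "timed_out": ["david"], "aborted": []}
raw_doc = attach_teamware_status({"text": "Example"}, status, "raw")
gate_doc = attach_teamware_status({"text": "Example", "features": {}}, status, "gate")
```

In this sketch `raw_doc` carries the status at the top level, while `gate_doc` carries it only under "features".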

For CSV export it appears as columns teamware_status.rejected_by, etc. with the lists flattened to comma-separated strings exactly as with multi-valued fields from the annotations.
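
The flattening could look something like the following sketch. The helper name `flatten_status_for_csv` is hypothetical, and the exact separator (a bare comma with no space) is an assumption; the PR only says the lists become comma-separated strings.

```python
def flatten_status_for_csv(status):
    """Turn the status dict into CSV columns named teamware_status.<key>.

    Hypothetical helper; the separator is assumed to be a bare comma.
    """
    return {
        f"teamware_status.{key}": ",".join(str(value) for value in values)
        for key, values in status.items()
    }


named = flatten_status_for_csv(
    {"rejected_by": ["ian", "twin"], "timed_out": ["david"], "aborted": []}
)
# Anonymous projects would export numeric IDs instead of names
anonymous = flatten_status_for_csv({"rejected_by": [3, 7], "timed_out": [], "aborted": []})
```

An empty list becomes an empty string, so every row has the same set of columns.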

This is particularly useful for project managers, who can use it to identify "difficult" documents that may indicate a need for additional annotator training.

Add a "teamware_status" section to the export formats detailing which users (if any) have rejected, aborted or timed out annotations on each document.

- for "raw" JSON this is a dict added to the top-level JSON object, with properties whose values are a JSON list of ID numbers (for anonymous) or names (for non-anonymous) of the relevant annotators
- for "gate" JSON it's the same dict added under the "features" section
- for "csv" the lists of IDs/names are flattened into a string the same way as multi-valued annotation elements

The export process adds entries to the doc_dict, so we should clone it first to ensure such changes are not accidentally persisted to the database if the model object is saved after a call to get_doc_annotation_dict.
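
To show why the clone matters, here is a small self-contained sketch. `FakeDocument` and its methods are hypothetical stand-ins for the real model object and for get_doc_annotation_dict; the sketch only assumes that the stored document data is a plain Python dict.

```python
from copy import deepcopy


class FakeDocument:
    """Hypothetical stand-in for the model; `data` plays the role of the stored JSON."""

    def __init__(self, data):
        self.data = data

    def export_without_clone(self):
        doc_dict = self.data  # same object as the stored data
        doc_dict["teamware_status"] = {"rejected_by": [], "timed_out": [], "aborted": []}
        return doc_dict

    def export_with_clone(self):
        doc_dict = deepcopy(self.data)  # independent copy, stored data is untouched
        doc_dict["teamware_status"] = {"rejected_by": [], "timed_out": [], "aborted": []}
        return doc_dict


leaky = FakeDocument({"text": "Example"})
leaky.export_without_clone()

safe = FakeDocument({"text": "Example"})
exported = safe.export_with_clone()
```

After the calls above, `leaky.data` has gained a "teamware_status" key that a later save would persist, whereas `safe.data` is unchanged and only `exported` carries the extra key.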

Jest Coverage

| File | % Stmts | % Branch | % Funcs | % Lines | Uncovered Line #s |
|---|---|---|---|---|---|
| All files | 83.8 | 83.96 | 64 | 83.8 | |
| jrpc | 94.11 | 91.66 | 83.33 | 94.11 | |
| jrpc/index.js | 94.11 | 91.66 | 83.33 | 94.11 | 29-30,38-40 |
| utils | 81.97 | 82.97 | 57.89 | 81.97 | |
| utils/annotations.js | 97.72 | 73.91 | 100 | 97.72 | 35-36 |
| utils/dict.js | 88.88 | 83.33 | 100 | 88.88 | 3-4 |
| utils/expressions.js | 80.08 | 82.35 | 80 | 80.08 | ...,188-190,201-218 |
| utils/index.js | 73.6 | 100 | 14.28 | 73.6 | ...4-65,76-82,93-94 |

@ianroberts ianroberts requested a review from twinkarma February 23, 2024 13:51
@ianroberts
Member Author

Question: as this PR stands it will always add the teamware_status section even though it will be degenerate {"rejected_by":[],"aborted":[],"timed_out":[]} in the majority of cases - would it make more sense to omit the whole teamware_status property from the JSON if all the lists are empty (making the JSON smaller) or leave it as it is now (meaning code processing the JSON doesn't have to worry about null checks)?

Collaborator

@twinkarma twinkarma left a comment

This looks good to me!

@ianroberts
Member Author

Thanks, I'll merge this and then make another PR for a new release.

@twinkarma
Collaborator

> Question: as this PR stands it will always add the teamware_status section even though it will be degenerate {"rejected_by":[],"aborted":[],"timed_out":[]} in the majority of cases - would it make more sense to omit the whole teamware_status property from the JSON if all the lists are empty (making the JSON smaller) or leave it as it is now (meaning code processing the JSON doesn't have to worry about null checks)?

I think it's OK to always export the "teamware_status" field even if all the lists are empty. As you said, it will probably save downstream code from having to do null checks.
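
For example, a consumer of the "raw" JSON export could flag "difficult" documents with no defensive checks, along the lines of this sketch. The function name `difficult_documents`, the `threshold` parameter, and the "id" key on each exported document are assumptions made for illustration; only the three status keys come from the PR.

```python
STATUS_KEYS = ("rejected_by", "timed_out", "aborted")


def difficult_documents(exported_docs, threshold=1):
    """Return (id, problem_count) pairs for documents at or above the threshold.

    Relies on "teamware_status" and its three lists always being present.
    The "id" key is a hypothetical field used only for this example.
    """
    flagged = []
    for doc in exported_docs:
        status = doc["teamware_status"]  # no null check needed
        problems = sum(len(status[key]) for key in STATUS_KEYS)
        if problems >= threshold:
            flagged.append((doc.get("id"), problems))
    return flagged


docs = [
    {"id": 1, "teamware_status": {"rejected_by": [], "timed_out": [], "aborted": []}},
    {"id": 2, "teamware_status": {"rejected_by": ["ian", "twin"], "timed_out": ["david"], "aborted": []}},
]
flagged = difficult_documents(docs)
```

If the property were omitted when empty, the lookup on the first document would raise a `KeyError` unless the consumer added its own default.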

@ianroberts ianroberts merged commit 834af01 into dev Feb 26, 2024
6 checks passed