Include details of rejected/aborted/timed out annotations in the export #399
Add a "teamware_status" section to the export formats detailing which users (if any) have rejected, aborted or timed out annotations on each document.
- for "raw" JSON this is a dict added to the top-level JSON object, with properties whose values are a JSON list of ID numbers (for anonymous) or names (for non-anonymous) of the relevant annotators
- for "gate" JSON it's the same dict added under the "features" section
- for "csv" the lists of IDs/names are flattened into a string the same way as multi-valued annotation elements
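As a rough illustration of where the dict lands in the two JSON formats, here is a minimal sketch. The function name `place_status` and the keys `aborted_by` and `timed_out_by` are assumptions made for the example (only `rejected_by` is named in this PR), and the sample document is invented.

```python
def place_status(doc_dict, status, json_format="raw"):
    """Return a new dict with `status` attached; hypothetical helper, not the PR's code."""
    if json_format == "gate":
        # "gate" JSON: the dict goes under the "features" section
        features = {**doc_dict.get("features", {}), "teamware_status": status}
        return {**doc_dict, "features": features}
    # "raw" JSON: the dict goes on the top-level object
    return {**doc_dict, "teamware_status": status}


# Key names other than "rejected_by" are assumed for illustration.
status = {"rejected_by": [3, 7], "aborted_by": [], "timed_out_by": [12]}
doc = {"text": "Example document", "features": {"lang": "en"}}

raw = place_status(doc, status, "raw")
gate = place_status(doc, status, "gate")
```

For an anonymous export the lists would hold annotator ID numbers as above; for a non-anonymous export they would hold names instead.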
The export process adds entries to the doc_dict, so we should clone it first to ensure such changes are not accidentally persisted to the database if the model object is saved after a call to get_doc_annotation_dict.
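The cloning concern can be sketched as follows. This is not the actual Teamware code: `export_doc` and `stored_data` are hypothetical stand-ins for `get_doc_annotation_dict` and the dict held on the model object, and a deep copy is assumed so that nested sections such as "features" are not shared either.

```python
import copy


def export_doc(stored_data, status):
    """Hypothetical stand-in for the export step; works on a clone of the stored dict."""
    # Deep copy, so nested dicts/lists are independent of the stored data
    doc_dict = copy.deepcopy(stored_data)
    doc_dict["teamware_status"] = status
    doc_dict.setdefault("features", {})["exported"] = True  # illustrative extra entry
    return doc_dict


stored_data = {"text": "Example document", "features": {"lang": "en"}}
exported = export_doc(stored_data, {"rejected_by": [3]})
```

A shallow copy would protect the top-level keys but still share the nested "features" dict, which is why the sketch uses `copy.deepcopy`.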
Question: as this PR stands it will always add the |
This looks good to me!
Thanks, I'll merge this and then make another PR for a new release.
I think it's OK to always export the "teamware_status" field even if all the results are empty; as you said, that will probably help with null checking.
Add a section to the export formats detailing which users (if any) have rejected, aborted or timed out annotations on each document.
For JSON this is a dict with properties whose values are a JSON list of ID numbers (for anonymous) or names (for non-anonymous) of the relevant annotators:
In "raw" mode this is added as "teamware_status" to the top-level doc_dict; for "gate" mode the "teamware_status" property is added under "features".

For CSV export it appears as columns teamware_status.rejected_by, etc., with the lists flattened to comma-separated strings exactly as with multi-valued fields from the annotations.

This functionality is particularly useful for project managers to identify "difficult" documents that may suggest additional training for the annotators.
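The CSV flattening might look roughly like this. It is a sketch under assumptions: `flatten_status` is a hypothetical helper, the column names other than teamware_status.rejected_by are guessed from the "etc." above, and a plain comma join is assumed for the flattened lists.

```python
import csv
import io


def flatten_status(status):
    """Map each status list to one CSV column holding a comma-separated string."""
    return {
        f"teamware_status.{key}": ",".join(str(value) for value in values)
        for key, values in status.items()
    }


# Key names other than "rejected_by" are assumed for illustration.
status = {"rejected_by": [3, 7], "timed_out_by": [], "aborted_by": [12]}
row = flatten_status(status)

buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=list(row))
writer.writeheader()
writer.writerow(row)
csv_text = buffer.getvalue()
```

An empty list becomes an empty cell, so the columns are always present; a project manager could then filter for rows whose teamware_status.rejected_by cell is non-empty to find candidate "difficult" documents.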