Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Flip outlier indicator and use longer value fields in scripts/export.csv #115

Conversation

jeancochrane
Copy link
Collaborator

@jeancochrane jeancochrane commented Apr 1, 2024

This PR fixes two problems with scripts/export.csv introduced in #114 that we noticed after a test upload:

  1. Indicator values were flipped incorrectly such that Y corresponded to outliers and N corresponded to non-outliers, which is the opposite of the export spec
    • This PR flips the use of these fields to match the spec
  2. Using the short value for the Model Determination field (Y or N) resulted in the field being writeable after export
    • This PR switches to using the long value to make the fields read-only

Testing

Test instructions are the same as #114:

python3 -m venv venv
source venv/bin/activate
pip install -r manual_flagging/requirements.txt
python3 scripts/export.py > sales_val_flags.csv

You may have to run aws-mfa if you haven't already today.

@@ -20,6 +20,9 @@
OUTLIER_TYPE_FIELD = "USER27"
RUN_ID_FIELD = "USER28"

IS_OUTLIER_INDICATOR = "N: Non arms-length"
NON_OUTLIER_INDICATOR = "Y: Is arms-length"

OUTLIER_TYPE_CODES = {
"Not outlier": "0",
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to use the long value for the outlier type, in order to make that field read-only? I notice that the iasWorld data model does not have an "analyst determination reason" field in addition to "model determination reason", as "model determination"/"analyst determination" does, so perhaps we want this field to be writeable, but I also wonder if that was an oversight in our iasWorld schema design.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do actually want the determination reason to be writable, as it's basically shared between the model / analyst determination fields.

& (flag_df[OUTLIER_TYPE_FIELD] == OUTLIER_TYPE_CODES["Not outlier"])
].empty, (
f"{OUTLIER_TYPE_FIELD} cannot be {OUTLIER_TYPE_CODES['Not outlier']} "
f"when {IS_OUTLIER_FIELD} is Y"
f"{OUTLIER_TYPE_FIELD} cannot be '{OUTLIER_TYPE_CODES['Not outlier']}' "
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added enclosing quotation marks for a slightly more readable error message.

@jeancochrane jeancochrane marked this pull request as ready for review April 1, 2024 16:58
@jeancochrane jeancochrane requested a review from dfsnow April 1, 2024 16:58
@@ -20,6 +20,9 @@
OUTLIER_TYPE_FIELD = "USER27"
RUN_ID_FIELD = "USER28"

IS_OUTLIER_INDICATOR = "N: Non arms-length"
NON_OUTLIER_INDICATOR = "Y: Is arms-length"

OUTLIER_TYPE_CODES = {
"Not outlier": "0",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do actually want the determination reason to be writable, as it's basically shared between the model / analyst determination fields.

Comment on lines +23 to +25
IS_OUTLIER_INDICATOR = "N: Non arms-length"
NON_OUTLIER_INDICATOR = "Y: Is arms-length"

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I feel like this has the potential to become incredibly confusing. Let's all align on the correct indicator meaning/direction this afternoon and keep this PR open until then.

@jeancochrane
Copy link
Collaborator Author

This PR has been open for a long time and we're planning to do a major refactor to this data model soon anyway, so I'm going to close it for now.

@jeancochrane jeancochrane deleted the jeancochrane/make-model-reason-readonly-in-export-script branch May 16, 2024 19:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants