ingest: Update default config params for strain #57

Open
joverlee521 opened this issue Jul 24, 2024 · 3 comments

Comments

@joverlee521
Contributor

joverlee521 commented Jul 24, 2024

The current default configs for strain require it to be a non-empty string, with accession as the backup field.

# Standardized strain name regex
# Currently accepts any characters because we do not have a clear standard for strain names across pathogens
strain_regex: "^.+$"
# Backup strain name fields to use if "strain" doesn't match the regex above
strain_backup_fields: ["accession"]
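
The fallback behavior these two params describe can be sketched roughly as follows. This is a minimal illustration of the logic as described in this issue, not the actual `augur curate transform-strain-name` implementation; the function name `resolve_strain` and the accession value `AB123456` are hypothetical.

```python
import re


def resolve_strain(record, strain_regex="^.+$", backup_fields=("accession",)):
    """Hypothetical sketch: keep `strain` if it matches the regex,
    otherwise use the first non-empty backup field."""
    strain = record.get("strain") or ""
    if re.match(strain_regex, strain):
        return strain
    # Strain is missing or does not match: try each backup field in order.
    for field in backup_fields:
        value = record.get(field)
        if value:
            return str(value)
    # No usable backup field: leave the strain as it was.
    return strain


record = {"accession": "AB123456", "strain": ""}  # made-up example record

# Current defaults: the empty strain is silently replaced by the accession.
print(resolve_strain(record))

# Without a default backup field, the empty strain stays empty.
print(resolve_strain(record, backup_fields=()))
```

With the current defaults, a record with an empty strain ends up with its accession in the strain field. With no backup fields, the empty value is left alone.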

This is a historical artifact from when augur required the strain/name column as the metadata id column. Augur has supported an arbitrary --metadata-id-column across subcommands since 22.1.0, so these default config params seem to do more harm than good (e.g. nextstrain/augur#1556 (comment)).

Possible solutions

  1. Remove the default backup field.
  2. Remove augur curate transform-strain-name and related config params completely since they aren't doing much here if we remove the default backup field.
@joverlee521
Contributor Author

I slightly lean towards [1]. If NCBI Datasets does start to output the strain field, we can then use strain and isolate-lineage as the backup field.

@genehack
Contributor

The default params already allow any value other than the empty string, yeah?

@joverlee521
Contributor Author

> The default params already allow any value other than the empty string, yeah?

Ah yeah, we just need to remove the default backup field then. Updated issue.
