ingest: Update default config params for strain #57

joverlee521 · 2024-07-24T19:56:47Z

The current default configs for strain require it to be a non-empty string, with accession as the backup field.

pathogen-repo-guide/ingest/defaults/config.yaml

Lines 65 to 69 in 89b3c5d

    
    # Standardized strain name regex
    # Currently accepts any characters because we do not have a clear standard for strain names across pathogens
    strain_regex: "^.+$"
    # Back up strain name field to use if "strain" doesn't match regex above
    strain_backup_fields: ["accession"]

This is a historical artifact from when Augur required the strain/name column as the metadata id column. We have supported an arbitrary --metadata-id-column across subcommands since Augur 22.1.0, so these default config params seem to do more harm than good (e.g. nextstrain/augur#1556 (comment)).
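
To make the concern concrete, here is a minimal Python sketch of the fallback behavior the defaults above describe. It is an illustration under stated assumptions, not Augur's implementation: `resolve_strain` and the sample records are hypothetical, and the only values taken from the config are the regex `"^.+$"` and the backup field `accession`.

```python
import re


def resolve_strain(record, strain_regex="^.+$", backup_fields=("accession",)):
    """Hypothetical helper (not Augur's code): return a copy of the record
    whose "strain" falls back to the first non-empty backup field when the
    original value does not match the regex."""
    record = dict(record)
    pattern = re.compile(strain_regex)
    # Treat a missing or None strain as the empty string.
    if pattern.match(record.get("strain") or "") is None:
        record["strain"] = ""
        for field in backup_fields:
            if record.get(field):
                record["strain"] = str(record[field])
                break
    return record


# With the default backup field, an empty strain is filled with the accession.
with_default = resolve_strain({"accession": "AB000001", "strain": ""})

# Without a backup field, the empty strain stays empty.
without_backup = resolve_strain({"accession": "AB000001", "strain": ""}, backup_fields=())

print(with_default["strain"])
print(repr(without_backup["strain"]))
```

Under these assumptions, dropping the backup field leaves a missing strain visibly empty instead of replacing it with the accession.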

Possible solutions

1. Remove the default backup field.
2. Remove augur curate transform-strain-name and related config params completely since they aren't doing much here if we remove the default backup field.


joverlee521 · 2024-07-24T20:11:57Z

I slightly lean towards [1]. If NCBI Datasets does start to output the strain field, we can then use strain and isolate-lineage as the backup field.

genehack · 2024-07-25T16:13:37Z

The default params already allow any value other than the empty string, yeah?

joverlee521 · 2024-07-25T17:48:03Z

> The default params already allow any value other than the empty string, yeah?

Ah yeah, we just need to remove the default backup field then. Updated issue.
