Merge pull request #122 from nextstrain/update-location-threshold
Update location threshold
trvrb authored Jan 17, 2025
2 parents f3a3085 + be4ec05 commit 613c6b5
Showing 3 changed files with 12 additions and 24 deletions.
8 changes: 0 additions & 8 deletions README.md
@@ -96,14 +96,6 @@ The current available options for `geo_resolutions` are
 The `prepare_data` params in `config/config.yaml` are used to subset the full
 case counts and clades counts data to specific date range, locations, and clades.
 
-As of 2023-04-04, the config for the automated pipeline is set to only include data from:
-
-- the past 150 days
-- excluding sequences from the last 12 days since they may be overly enriched for variants
-- locations that have at least 500 sequences in the last 30 days
-- excluding locations specifically listed in `defaults/global_excluded_locations.txt`
-- clades that have at least 5000 sequences in the last 150 days
-
 ### Model configurations
 
 The specific model configurations are housed in separate config YAML files or each model.
24 changes: 10 additions & 14 deletions config/config.yaml
@@ -16,39 +16,35 @@ prepare_data:
     nextstrain_clades:
       global:
         included_days: 150
-        location_min_seq: 50
-        location_min_seq_days: 30
+        location_min_seq: 1000
+        location_min_seq_days: 150
         excluded_locations: "defaults/global_excluded_locations.txt"
         prune_seq_days: 12
-        clade_min_seq: 2000
+        clade_min_seq: 500
         clade_min_seq_days: 150
     pango_lineages:
       global:
         included_days: 150
-        location_min_seq: 150
-        location_min_seq_days: 30
+        location_min_seq: 1000
+        location_min_seq_days: 150
         excluded_locations: "defaults/global_excluded_locations.txt"
         prune_seq_days: 12
-        clade_min_seq: 1
-        clade_min_seq_days: 150
         collapse_threshold: 350
   open:
     nextstrain_clades:
       global:
         included_days: 150
-        location_min_seq: 50
-        location_min_seq_days: 30
+        location_min_seq: 1000
+        location_min_seq_days: 150
         excluded_locations: "defaults/global_excluded_locations.txt"
         prune_seq_days: 12
-        clade_min_seq: 2000
+        clade_min_seq: 500
         clade_min_seq_days: 150
     pango_lineages:
       global:
         included_days: 150
-        location_min_seq: 150
-        location_min_seq_days: 30
+        location_min_seq: 1000
+        location_min_seq_days: 150
         excluded_locations: "defaults/global_excluded_locations.txt"
         prune_seq_days: 12
-        clade_min_seq: 1
-        clade_min_seq_days: 150
         collapse_threshold: 350
4 changes: 2 additions & 2 deletions viz/src/App.jsx
@@ -31,7 +31,7 @@ function App() {
 <p>
   Each line represents the estimated frequency of a particular clade through time.
   Equivalent Pango lineage is given in parenthesis, eg clade 23A (lineage XBB.1.5). Only
-  locations with more than 50 sequences from samples collected in the previous 30 days are
+  locations with more than 1000 sequences from samples collected in the previous 150 days are
   included. Results last updated {mlrCladesData?.modelData?.get('updated') || 'loading'}.
 </p>
 <div id="cladeFrequenciesPanel" class="panelDisplay"> {/* surrounding div(s) used for static-images.js script */}
@@ -54,7 +54,7 @@ function App() {
 <p>
   Each line represents the estimated frequency of a particular Pango lineage through time.
   Lineages with fewer than 350 observations are collapsed into parental lineage. Only
-  locations with more than 150 sequences from samples collected in the previous 30 days are
+  locations with more than 1000 sequences from samples collected in the previous 150 days are
   included. Results last updated {mlrLineagesData?.modelData?.get('updated') || 'loading'}.
 </p>
 <div id="lineageFrequenciesPanel" class="panelDisplay">
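
Note for readers of this diff: the change to `location_min_seq` and `location_min_seq_days` both raises the count and widens the window, so it can add locations as well as drop them. The sketch below is a hypothetical illustration of that effect, not the pipeline's actual code. The function name, the `(location, collection_date, n_sequences)` record layout, the inclusive window bounds, and the sample counts are all assumptions made for this example. It uses an "at least" comparison, as the removed README text did; the `viz/src/App.jsx` text says "more than", so the real comparison may differ.

```python
from collections import Counter
from datetime import date, timedelta


def locations_meeting_threshold(records, as_of, min_seq, min_seq_days):
    """Return locations with at least `min_seq` sequences collected in the
    `min_seq_days` days up to and including `as_of`.

    `records` is an iterable of (location, collection_date, n_sequences)
    tuples. This layout is an assumption for illustration only.
    """
    cutoff = as_of - timedelta(days=min_seq_days)
    totals = Counter()
    for location, collected, n_sequences in records:
        # Count only sequences collected inside the window.
        if cutoff <= collected <= as_of:
            totals[location] += n_sequences
    return {loc for loc, total in totals.items() if total >= min_seq}


# Made-up counts, chosen only to show how the two settings differ.
records = [
    ("USA", date(2025, 1, 1), 1200),
    ("Japan", date(2024, 12, 20), 400),
    ("Japan", date(2024, 6, 1), 900),    # outside both windows
    ("Kenya", date(2024, 10, 1), 1080),  # inside the 150-day window only
    ("Kenya", date(2025, 1, 5), 20),
]
as_of = date(2025, 1, 17)

# Previous nextstrain_clades setting: 50 sequences in 30 days.
old = locations_meeting_threshold(records, as_of, min_seq=50, min_seq_days=30)
# New setting from this commit: 1000 sequences in 150 days.
new = locations_meeting_threshold(records, as_of, min_seq=1000, min_seq_days=150)

print(sorted(old))
print(sorted(new))
```

With these sample counts, a location with high recent volume passes both settings, one with moderate recent volume passes only the old setting, and one with steady volume over the longer window but little in the last 30 days passes only the new setting.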