Auspice testing data (get-data) #1558
Comments
Note that the PR review apps allow us to test on live nextstrain.org data.
That's true (and a big positive) but it doesn't help with local development and, some time in the future, actual tests within auspice. I'll take a crack at this issue today - it seems simple on the surface!
jameshadfield added a commit that referenced this issue on Aug 9, 2023
The test dataset covers a large range of genome complexity, which is much better than relying on datasets (I don't have any to hand with negative CDSs wrapping around the origin, or -ve segmented CDSs, for example). The addition of this starts to address #1558.
3 automated tests are passing, with the rest failing as expected. Subsequent commits will add support for segmented CDSs and -ve strand CDSs.
Full test results:
✓ Chromosome coordinates
✓ +ve strand CDS with a single segment
✕ -ve strand CDS with a single segment
✓ +ve strand CDS which wraps the origin
✕ -ve strand CDS which wraps the origin
✕ +ve strand CDS with multiple (non-wrapping) segments
✕ -ve strand CDS with multiple (non-wrapping) segments
jameshadfield added four more commits that referenced this issue (two on Aug 10, 2023, one on Aug 11, 2023, and one on Aug 17, 2023), each carrying the same commit message and test results as the Aug 9 commit.
Background
This repo doesn't contain any datasets beyond some very minimal examples to explain the dataset format. Instead we rely on the get-data script, which downloads a slew of (nextstrain core) datasets. In times gone by, this was an accurate listing of all of our core datasets (other sources, such as groups and community, didn't exist). This script is often run manually (e.g. npm run get-data) so you can have some data to play with, and heroku runs it during setup (npm run heroku-postbuild), which results in usable datasets in review apps.
Shortcomings
A lot of auspice's functionality cannot be tested with the data here, and two PRs in the last couple of weeks have highlighted this: #1557 and #1552. This means the heroku review apps are not useful for such PRs, and people have to manually check out the auspice branch and obtain an appropriate dataset for testing.
Proposal
We should make the get-data script obtain a useful set of testing datasets, preferably using timestamped datasets so we can ensure reproducibility. For PRs which need additional datasets to test, these should be added to the get-data script as part of the PR.
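As a rough illustration of what "timestamped datasets" could look like, here is a minimal sketch in which each testing dataset is pinned to a snapshot date and the download URL is derived from that pin. Everything named here is an assumption for illustration: the base URL, the URL layout, the `PinnedDataset` type, and the dataset names and dates are hypothetical placeholders, not how the get-data script or any nextstrain server actually works.

```python
from dataclasses import dataclass
from datetime import date
from typing import List, Optional

# Hypothetical base URL, used only for illustration.
BASE_URL = "https://example.org/datasets"


@dataclass(frozen=True)
class PinnedDataset:
    name: str                        # placeholder dataset name
    snapshot: Optional[date] = None  # None means "latest", which is not reproducible


def dataset_url(ds: PinnedDataset) -> str:
    """Build a download URL; the date-in-path layout is an assumed convention."""
    if ds.snapshot is None:
        return f"{BASE_URL}/{ds.name}.json"
    return f"{BASE_URL}/{ds.snapshot.isoformat()}/{ds.name}.json"


def unpinned(datasets: List[PinnedDataset]) -> List[str]:
    """Return names of datasets without a timestamp so a check can flag them."""
    return [ds.name for ds in datasets if ds.snapshot is None]


# Placeholder list; a PR needing extra test data would append an entry here.
TEST_DATASETS = [
    PinnedDataset("zika", date(2023, 8, 1)),
    PinnedDataset("genome-annotation-test"),
]

if __name__ == "__main__":
    for ds in TEST_DATASETS:
        print(dataset_url(ds))
    print("not pinned:", unpinned(TEST_DATASETS))
```

Keeping the list as data rather than as a series of download commands would make it easy for a PR to add the dataset it needs, and for a check to warn about entries that are not pinned to a date.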