Add GH Action for NCBI phylogenetic workflow #61

Merged: 6 commits, Jun 21, 2024

@joverlee521 commented Jun 21, 2024

Description of proposed changes

Add GH Action workflows for the phylogenetic builds

  1. The Phylogenetic NCBI workflow uses the public NCBI data for the full genome build and runs entirely within the Docker runtime on GH Actions.

  2. The Phylogenetic Fauna workflow uses the private fauna data for the default builds. I didn't check whether it would complete within GH Actions; I just followed the README instructions to run it via aws-batch.

It's unclear how much manual curation the NCBI ingest data needs before builds can run, so both workflows must be triggered manually for now.
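
For readers unfamiliar with manual triggers, the sketch below shows the general shape of a GH Action workflow that only runs when started by hand. It is not the workflow added in this PR: the workflow name, the `trial_name` input, the job name, and the placeholder build step are all assumptions made for illustration. Only `workflow_dispatch` and its `inputs` syntax are standard GitHub Actions features.

```yaml
# Hypothetical sketch of a manually triggered workflow; not the file from this PR.
name: Phylogenetic NCBI (sketch)

on:
  # No `push` or `schedule` triggers: the workflow runs only when started by hand.
  workflow_dispatch:
    inputs:
      # Assumed input name, for illustration only.
      trial_name:
        description: "Short name for a trial run (assumed to route uploads to staging)"
        required: false
        type: string

jobs:
  phylogenetic:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      # Placeholder step; the real workflow would run the phylogenetic build here.
      - name: Run build (placeholder)
        env:
          # Pass the input through an environment variable rather than
          # interpolating it directly into the shell script.
          TRIAL_NAME: ${{ inputs.trial_name }}
        run: echo "Would run the build; trial name: ${TRIAL_NAME:-none}"
```

A workflow like this appears under the repository's Actions tab with a "Run workflow" button, and can also be started with `gh workflow run`.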

Related issue(s)

Resolves #60

Checklist

  • Checks pass
  • Trial NCBI run -> build uploaded to staging
  • Trial fauna run -> builds uploaded to staging (follows pattern of https://nextstrain.org/staging/avian-flu/trials/gh-action-phylo/avian-flu/<subtype>/<segment>/<time>)

This is a shared rule between the default builds and the full genome
build and it works because both Snakemake workflows independently
define their own `all` rule.

As usual, this requires AWS credentials to upload the Auspice JSONs
to the nextstrain-data S3 bucket.

Currently runs the full genome build and automatically uploads the
resulting build to `s3://nextstrain-data`.

Allows us to test builds and deploy them to staging instead of
production.

Will make edits in a subsequent commit to reflect the default fauna build.

I think the diff in this commit warrants maintaining separate
GH Action workflows for NCBI vs fauna data builds instead of trying to
shoehorn both into a single complicated GH Action workflow.

Uses AWS Batch runtime with cpus/memory according to the repo's README
instructions.¹

¹ <https://github.com/nextstrain/avian-flu/blob/4ae5e32af0f00b3120d040bfd0b527a4b2c51c09/README.md?plain=1#L14>

@joverlee521 requested a review from a team June 21, 2024 21:33
@joverlee521 merged commit f864158 into master Jun 21, 2024
14 checks passed
@joverlee521 deleted the gh-action-phylo branch June 21, 2024 22:51