Add GH Action for NCBI phylogenetic workflow #61

Merged (6 commits) on Jun 21, 2024
56 changes: 56 additions & 0 deletions .github/workflows/phylogenetic-fauna.yaml
@@ -0,0 +1,56 @@
name: Phylogenetic Fauna

defaults:
  run:
    # This is the same as GitHub Action's `bash` keyword as of 20 June 2023:
    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsshell
    #
    # Completely spelling it out here so that GitHub can't change it out from under us
    # and we don't have to refer to the docs to know the expected behavior.
    shell: bash --noprofile --norc -eo pipefail {0}

on:
  workflow_dispatch:
    inputs:
      image:
        description: 'Specific container image to use for ingest workflow (will override the default of "nextstrain build")'
        required: false
        type: string
      trial-name:
        description: |
          Trial name for deploying builds.
          If not set, builds will overwrite existing builds at s3://nextstrain-data/avian-flu*
          If set, builds will be deployed to s3://nextstrain-staging/avian-flu_trials_<trial_name>_*
        required: false
        type: string

jobs:
  phylogenetic:
    permissions:
      id-token: write
    uses: nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@master
    secrets: inherit
    with:
      runtime: aws-batch
      run: |
        declare -a config;

        if [[ "$TRIAL_NAME" ]]; then
          config+=(
            deploy_url="s3://nextstrain-staging/avian-flu_trials_${TRIAL_NAME}_"
          )
        fi;

        nextstrain build \
          --detach \
          --no-download \
          --cpus 16 \
          --memory 28800mib \
          . \
            deploy_all \
            --config "${config[@]}"

      env: |
        NEXTSTRAIN_DOCKER_IMAGE: ${{ inputs.image }}
        TRIAL_NAME: ${{ inputs.trial-name }}
      artifact-name: phylogenetic-fauna-build-output
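
Review note: the `run` block above builds a bash array and expands it after `--config`. The sketch below reproduces only that pattern so the two cases (trial name unset, trial name set) are easy to see. The function name `deploy_args` and the trial name `my-test` are illustrative assumptions and do not appear in the PR.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical helper (not part of the PR): prints, one per line, the
# arguments that the workflow would place after `deploy_all`.
deploy_args() {
  local trial_name="${1:-}"
  local -a config=()

  # Same condition as the workflow: only add deploy_url for a trial run.
  if [[ "$trial_name" ]]; then
    config+=(
      deploy_url="s3://nextstrain-staging/avian-flu_trials_${trial_name}_"
    )
  fi

  # An empty array expands to zero words, so only `--config` is printed.
  printf '%s\n' --config "${config[@]}"
}

deploy_args ""         # no trial name: `--config` with no key=value pairs
deploy_args "my-test"  # trial name set: `--config` followed by deploy_url=...
```

With no trial name the array stays empty and `--config` receives no key=value pairs, so the Snakemake default for `deploy_url` applies. Note that the workflow shell runs with `-eo pipefail` but without `-u`; on bash versions before 4.4, expanding an empty array under `set -u` would raise an "unbound variable" error.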
57 changes: 57 additions & 0 deletions .github/workflows/phylogenetic-ncbi.yaml
@@ -0,0 +1,57 @@
name: Phylogenetic NCBI

defaults:
  run:
    # This is the same as GitHub Action's `bash` keyword as of 20 June 2023:
    # https://docs.github.com/en/actions/using-workflows/workflow-syntax-for-github-actions#jobsjob_idstepsshell
    #
    # Completely spelling it out here so that GitHub can't change it out from under us
    # and we don't have to refer to the docs to know the expected behavior.
    shell: bash --noprofile --norc -eo pipefail {0}

on:
  workflow_dispatch:
    inputs:
      image:
        description: 'Specific container image to use for ingest workflow (will override the default of "nextstrain build")'
        required: false
        type: string
      trial-name:
        description: |
          Trial name for deploying builds.
          If not set, builds will overwrite existing builds at s3://nextstrain-data/avian-flu*
          If set, builds will be deployed to s3://nextstrain-staging/avian-flu_trials_<trial_name>_*
        required: false
        type: string

jobs:
  phylogenetic:
    permissions:
      id-token: write
    uses: nextstrain/.github/.github/workflows/pathogen-repo-build.yaml@master
    secrets: inherit
    with:
      runtime: docker
      run: |
        declare -a config;

        config+=(
          s3_src="s3://nextstrain-data/files/workflows/avian-flu/h5n1"
        );

        if [[ "$TRIAL_NAME" ]]; then
          config+=(
            deploy_url="s3://nextstrain-staging/avian-flu_trials_${TRIAL_NAME}_"
          )
        fi;

        nextstrain build \
          . \
            deploy_all \
            --snakefile Snakefile.genome \
            --config "${config[@]}"

      env: |
        NEXTSTRAIN_DOCKER_IMAGE: ${{ inputs.image }}
        TRIAL_NAME: ${{ inputs.trial-name }}
      artifact-name: phylogenetic-full-genome-build-output
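
Review note: both workflows are `workflow_dispatch` only, so they have to be started by hand. One way to do that, assuming the GitHub CLI (`gh`) is installed and authenticated, is `gh workflow run` with `-f` for each input. The sketch below only assembles and prints such a command rather than running it. The wrapper name `dispatch_command` and the trial name `my-test` are illustrative assumptions.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical wrapper (not part of the PR): assembles, but does not run,
# a `gh workflow run` invocation for one of the workflows added here.
dispatch_command() {
  local workflow="$1" trial_name="${2:-}" image="${3:-}"
  local -a cmd=(gh workflow run "$workflow")

  # Inputs are optional in the workflow, so pass them only when given.
  if [[ "$trial_name" ]]; then
    cmd+=(-f "trial-name=${trial_name}")
  fi
  if [[ "$image" ]]; then
    cmd+=(-f "image=${image}")
  fi

  echo "${cmd[*]}"
}

dispatch_command phylogenetic-ncbi.yaml my-test
```

Leaving `trial-name` out makes `TRIAL_NAME` empty inside the job, which is the case where builds are deployed to `s3://nextstrain-data` per the input description.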
9 changes: 9 additions & 0 deletions README.md
@@ -16,6 +16,15 @@ nextstrain build --aws-batch --aws-batch-cpus 16 --aws-batch-memory 28800 . --jo

Please see [nextstrain.org/docs](https://nextstrain.org/docs) for details about augur and pathogen builds.

### Deploying builds

The pipeline can automatically deploy resulting builds within the auspice folder
to nextstrain.org by running:

```
nextstrain build . deploy_all
```

## Creating a custom build
The easiest way to generate your own custom avian-flu build is to use the quickstart-build as a starting template. Clone the quickstart-build, run it with the example data, and edit the Snakefile to customize it. This build includes example data and a simplified, heavily annotated Snakefile that walks through the structure of Snakefiles and annotates the rules, inputs, and outputs that can be modified. This build, with its own README, is available [here](https://github.com/nextstrain/avian-flu/tree/master/quickstart-build).

3 changes: 3 additions & 0 deletions Snakefile
@@ -22,6 +22,9 @@ rule all:
    input:
        auspice_json = all_targets()

# This must be after the `all` rule above since it depends on its inputs
include: "rules/deploy.smk"

rule test_target:
    """
    For testing purposes such as CI workflows.
3 changes: 3 additions & 0 deletions Snakefile.genome
@@ -33,6 +33,9 @@ def subtype(build_name):
rule all:
    input: expand("auspice/avian-flu_{build_name}_genome.json", build_name=BUILD_NAME)

# This must be after the `all` rule above since it depends on its inputs
include: "rules/deploy.smk"

rule files:
    params:
        reference = lambda w: f"config/reference_{subtype(w.build_name)}_{{segment}}.gb",
15 changes: 15 additions & 0 deletions rules/deploy.smk
@@ -0,0 +1,15 @@
DEPLOY_URL = config.get('deploy_url', "s3://nextstrain-data")


rule deploy_all:
    """
    Upload all builds to AWS S3.
    Depends on the `all` rule defined by each independent Snakemake workflow.
    """
    input: rules.all.input
    params:
        s3_dst = DEPLOY_URL
    shell:
        """
        nextstrain remote upload {params.s3_dst:q} {input}
        """
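
Review note: `deploy_all` uploads every input of the `all` rule to `DEPLOY_URL`, which defaults to `s3://nextstrain-data` unless `deploy_url` is passed via `--config`. The dry-run sketch below prints the command the rule would roughly issue in each case, without contacting S3. The helper name `upload_command` and the file name `auspice/avian-flu_example_genome.json` are illustrative assumptions, not taken from the PR.

```shell
#!/usr/bin/env bash
set -eo pipefail

# Hypothetical dry-run helper (not part of the PR): prints the upload
# command instead of running it.
upload_command() {
  # Fall back to the production bucket when no destination is given.
  # This loosely mirrors config.get('deploy_url', "s3://nextstrain-data");
  # unlike Python's dict.get, `:-` also replaces an empty string.
  local s3_dst="${1:-s3://nextstrain-data}"
  shift
  echo "nextstrain remote upload ${s3_dst} $*"
}

# Illustrative file name only.
upload_command "" auspice/avian-flu_example_genome.json
upload_command "s3://nextstrain-staging/avian-flu_trials_my-test_" auspice/avian-flu_example_genome.json
```

Because the default destination is the production bucket, a local `nextstrain build . deploy_all` without `--config deploy_url=...` targets `s3://nextstrain-data`, so passing a staging `deploy_url` is the safer choice when testing.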