
Managing multiple projects

markrobinsonuzh edited this page Mar 11, 2019 · 9 revisions

Here, we outline different ways to manage data and software if you want to run the ARMOR workflow on more than one project or data set:

  1. Keep a copy of the ARMOR repository together with the data of each project in a single directory. The software (the Snakefile, the scripts, the Rmd files and the config.yaml) and the data of a project then live in one place, and each project has its own physically separate workflow configuration, which makes results easy to reproduce. The drawback is that ARMOR exists in multiple physical locations, so installed software may be duplicated: with the --use-conda option, Snakemake creates the conda environments inside each copy (e.g., .snakemake/conda/7a4f9e69).

  2. Clone the ARMOR repository only once and keep a separate directory for each project. The ARMOR directory can then be reused across many projects. This is useful if you do not want to recreate conda environments for each project (e.g., you create one environment and activate it) and will use the same Snakefile and scripts for every project. You will need a different config.yaml file for each project (stored either in the ARMOR directory or in the respective project directory), and you must specify the path to that file every time you run the workflow (e.g., snakemake --configfile projectX/config.yaml).
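
As a rough sketch of option 2, the snippet below prints the snakemake call for several projects that share one ARMOR clone. The helper name `armor_cmd` and the directory names `projectX` and `projectY` are placeholders chosen for illustration, not part of ARMOR; only the `--configfile` option is taken from the text above. It assumes each project directory holds its own config.yaml and that the printed commands are run from the ARMOR directory.

```shell
# Hypothetical layout: one ARMOR clone, with each project's config.yaml
# stored in its own directory (projectX and projectY are placeholder names).

# Print the snakemake call for one project directory.
armor_cmd() {
    printf 'snakemake --configfile %s/config.yaml\n' "$1"
}

# Show the call for every project; run the printed commands
# from the ARMOR directory so that the shared Snakefile is found.
for project in projectX projectY; do
    armor_cmd "$project"
done
```

Printing the commands rather than executing them makes it easy to check the config paths first; once they look right, each line can be pasted into the shell.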

Further details can be found on the Running the analysis page.