v1.0.0 - stable refactorized and enhanced version with complete docs and examples

@sreichl sreichl released this 09 Dec 11:39
· 27 commits to main since this release
02c128a

Enhancements

  • Refactor the pipeline, reduce its scope by moving redundant functionality to dedicated downstream analysis modules, and integrate it into MR.PARETO for v1.0.0.
  • Abstract everything CeMM/BSF specific to the configuration level to make the pipeline more accessible.
  • Revamp the MultiQC report to improve clarity and utility.
  • Adapt result folder structure and reporting to conform with MR.PARETO standards.
  • Simplify pipeline configuration to two files (instead of four) for ease of use.
  • Ensure compatibility with Snakemake version 7.
  • Implement quick aggregation of all quantifications, reducing processing time from multiple hours to a few minutes.
  • Make the slop_extension parameter configurable to accommodate different user needs.
  • Aggregate sample-wise HOMER known motif enrichment results for easier downstream analysis.
  • Add promoter and TSS region quantification features to the pipeline.
  • Extract and present MultiQC statistics in a more accessible format.
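
As an illustration of what a configurable slop_extension controls, here is a minimal Python sketch. It assumes the parameter extends each region by a fixed number of base pairs on both sides and clamps the result to the chromosome bounds, similar to `bedtools slop -b`. The function name `slop`, the tuple layout, and the example values are hypothetical and not taken from the pipeline's code.

```python
from typing import Dict, List, Tuple

# (chrom, start, end) in 0-based, half-open BED-style coordinates
Region = Tuple[str, int, int]


def slop(regions: List[Region], chrom_sizes: Dict[str, int], extension: int) -> List[Region]:
    """Extend every region by `extension` bp on each side, clamped to chromosome bounds."""
    extended = []
    for chrom, start, end in regions:
        new_start = max(0, start - extension)                # never below position 0
        new_end = min(chrom_sizes[chrom], end + extension)   # never past the chromosome end
        extended.append((chrom, new_start, new_end))
    return extended


# Hypothetical example: two 1-bp peak summits on a 1,000-bp chromosome
summits = [("chr1", 100, 101), ("chr1", 990, 991)]
sizes = {"chr1": 1000}
print(slop(summits, sizes, extension=250))
```

A larger extension yields wider regions around each summit and therefore more merging of neighbouring peaks, which is one reason such a value is better exposed in the configuration than hard-coded.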

Documentation

  • Provide a tutorial and tips for quality control (QC).
  • Add example microdata for human (hg38) and mouse (mm10) genomes to assist new users.
  • Include a guide for resource downloading for required hg38 and mm10 data.
  • Enable sharing of the genome browser track hub for collaborative work.
  • Document the configuration process for UROPA within the pipeline.
  • Point to downstream analysis modules.
  • Adapt and extend the Methods section to reflect the refactoring, reduced scope, and new features.

Small improvements

  • Fix the mitochondrial fraction metric in the pipeline report.
  • Create and add versioning to all environment YAML files.
  • Make the installation of HOMER a dedicated rule/job within the pipeline.
  • Fix the multiqc.yaml installation error to ensure smooth setup.
  • Remove temporary BAM files before Bowtie execution to save disk space and improve performance.
  • Provide a consensus region annotation file, expanded by nucleotide content information.
  • Clean up bash commands in all rules to improve code quality and maintainability.
  • Split the UROPA region annotation rule to allow for parallel processing and increased efficiency.
  • Test the pipeline on 100 ATAC-seq samples as a loaded module within another Snakemake workflow project to ensure compatibility and robustness.
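
For readers unfamiliar with the mitochondrial fraction metric mentioned above, the following Python sketch shows one common definition: the share of aligned reads that map to the mitochondrial chromosome. This is an assumption for illustration only; the pipeline's actual calculation may differ, and the function name, the contig names `chrM`/`MT`, and the example counts are hypothetical.

```python
from typing import Dict, Iterable


def mitochondrial_fraction(
    reads_per_contig: Dict[str, int],
    mito_names: Iterable[str] = ("chrM", "MT"),  # assumed contig names; adjust per genome
) -> float:
    """Return mitochondrial reads divided by all aligned reads (0.0 if no reads)."""
    total = sum(reads_per_contig.values())
    if total == 0:
        return 0.0  # avoid division by zero for empty samples
    mito_set = set(mito_names)
    mito = sum(n for contig, n in reads_per_contig.items() if contig in mito_set)
    return mito / total


# Hypothetical per-contig read counts for one sample
counts = {"chr1": 700, "chr2": 200, "chrM": 100}
print(mitochondrial_fraction(counts))
```

In ATAC-seq QC, a high value of this metric usually indicates that a large share of sequencing depth was spent on mitochondrial DNA rather than nuclear chromatin.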

Full Changelog: v0.1.2...v1.0.0