nf-core/rnafusion uses RNA-seq data to detect fusion genes.
The workflow processes RNA-sequencing data from FastQ files. It runs quality control on the raw data (FastQC), detects fusion genes (STAR-Fusion, FusionCatcher, EricScript, Pizzly, Squid), gathers information about them from databases (FusionGDB, Mitelman, COSMIC), visualizes the fusions (FusionInspector), performs quality control on the results (MultiQC) and finally generates a custom summary report with scored fusions (fusion-report).
Live demo output here.
The pipeline works with both single-end and paired-end data. However, EricScript, Pizzly, Squid and FusionInspector do not support single-end data.
The pipeline is built using Nextflow, a workflow tool to run tasks across multiple compute infrastructures in a very portable manner. It comes with Docker / Singularity containers, making installation trivial and results highly reproducible.
Tool | Single-end reads | CPU (recommended) | RAM (recommended) |
---|---|---|---|
STAR-Fusion | Yes | >=16 cores | ~30GB |
FusionCatcher | Yes | >=16 cores | ~64GB |
EricScript | No | >=16 cores | ~30GB |
Pizzly | No | >=16 cores | ~30GB |
Squid | No | >=16 cores | ~30GB |
FusionInspector | No | >=16 cores | ~30GB |
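The table can be read as a simple compatibility check. The following is a minimal Python sketch, not part of the pipeline, that transcribes the table and lists the tools whose recommendations a given setup meets. The names `ToolRequirement`, `TOOLS` and `runnable_tools` are hypothetical, and treating the recommended values as hard thresholds is an assumption made for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ToolRequirement:
    name: str
    single_end: bool  # True if the tool accepts single-end reads
    min_cores: int    # recommended minimum number of CPU cores
    ram_gb: int       # approximate recommended RAM in GB


# Values transcribed from the table above.
TOOLS = [
    ToolRequirement("STAR-Fusion", True, 16, 30),
    ToolRequirement("FusionCatcher", True, 16, 64),
    ToolRequirement("EricScript", False, 16, 30),
    ToolRequirement("Pizzly", False, 16, 30),
    ToolRequirement("Squid", False, 16, 30),
    ToolRequirement("FusionInspector", False, 16, 30),
]


def runnable_tools(single_end: bool, cores: int, ram_gb: int) -> List[str]:
    """Return the tools whose recommendations the given setup meets.

    Assumption: the recommended CPU and RAM values are used as thresholds,
    although the table only states them as recommendations.
    """
    return [
        tool.name
        for tool in TOOLS
        if (tool.single_end or not single_end)
        and cores >= tool.min_cores
        and ram_gb >= tool.ram_gb
    ]


if __name__ == "__main__":
    # Single-end reads on a 16-core machine with 32 GB of RAM.
    print(runnable_tools(single_end=True, cores=16, ram_gb=32))
    # Paired-end reads on the same machine.
    print(runnable_tools(single_end=False, cores=16, ram_gb=32))
```

With single-end reads and 32 GB of RAM, only STAR-Fusion passes this check, because FusionCatcher is listed at roughly 64 GB and the remaining tools require paired-end data.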
For available parameters or help, run:
nextflow run nf-core/rnafusion --help
The nf-core/rnafusion pipeline comes with documentation, found in the docs/ directory:
- Installation
- Pipeline configuration
- Running the pipeline
- Output and how to interpret the results
- Troubleshooting
Use the predefined configuration for your institution's cluster, provided in the nf-core/configs repository.
This pipeline was written by Martin Proks (@matq007) as a master's thesis, in collaboration with Karolinska Institutet, SciLifeLab and the University of Southern Denmark. It continues development started by Rickard Hammarén (@Hammarn). Special thanks go to all supervisors: Teresita Díaz de Ståhl, PhD, Assoc. Prof.; Monica Nistér, MD, PhD; Maxime U Garcia, PhD (@MaxUlysse); Szilveszter Juhos (@szilvajuhos); Phil Ewels, PhD (@ewels); and Lars Grøntved, PhD, Assoc. Prof.
- STAR-Fusion: Fast and Accurate Fusion Transcript Detection from RNA-Seq. Brian Haas, Alexander Dobin, Nicolas Stransky, Bo Li, Xiao Yang, Timothy Tickle, Asma Bankapur, Carrie Ganote, Thomas Doak, Natalie Pochet, Jing Sun, Catherine Wu, Thomas Gingeras, Aviv Regev. bioRxiv 120295; doi: https://doi.org/10.1101/120295
- D. Nicorici, M. Satalan, H. Edgren, S. Kangaspeska, A. Murumagi, O. Kallioniemi, S. Virtanen, O. Kilkku, FusionCatcher – a tool for finding somatic fusion genes in paired-end RNA-sequencing data, bioRxiv, Nov. 2014, DOI:10.1101/011650
- Benelli M, Pescucci C, Marseglia G, Severgnini M, Torricelli F, Magi A. Discovering chimeric transcripts in paired-end RNA-seq data by using EricScript. Bioinformatics. 2012; 28(24): 3232-3239.
- Fusion detection and quantification by pseudoalignment. Páll Melsted, Shannon Hateley, Isaac Charles Joseph, Harold Pimentel, Nicolas L Bray, Lior Pachter. bioRxiv 166322; doi: https://doi.org/10.1101/166322
- SQUID: transcriptomic structural variation detection from RNA-seq. Cong Ma, Mingfu Shao and Carl Kingsford. Genome Biology, 2018; doi: https://doi.org/10.1186/s13059-018-1421-5
- Fusion-Inspector download: https://github.com/FusionInspector
- Martin Proks. (2019, March 26). matq007/fusion-report: fusion-report:1.0 (Version 1.0). Zenodo. http://doi.org/10.5281/zenodo.2609227
- FastQC download: https://www.bioinformatics.babraham.ac.uk/projects/fastqc/
- MultiQC: Ewels, P., Magnusson, M., Lundin, S., & Käller, M. (2016). MultiQC: summarize analysis results for multiple tools and samples in a single report. Bioinformatics, 32(19), 3047–3048. https://doi.org/10.1093/bioinformatics/btw354 (download: https://multiqc.info/)