diff --git a/README.md b/README.md
index 57cbdd16..5189e7f9 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # ![nf-core/pgdb](docs/images/nf-core-pgdb_logo.png)
 
-**The ProteoGenomics database generation workflow (pgdb) use the pypgatk and nextflow to create different protein databases for ProteoGenomics data analysis.**.
+The ProteoGenomics database generation workflow (**pgdb**) use the [pypgatk](https://github.com/bigbio/py-pgatk) and [nextflow](https://www.nextflow.io/) to create different protein databases for ProteoGenomics data analysis.
 
 [![GitHub Actions CI Status](https://github.com/nf-core/pgdb/workflows/nf-core%20CI/badge.svg)](https://github.com/nf-core/pgdb/actions)
 [![GitHub Actions Linting Status](https://github.com/nf-core/pgdb/workflows/nf-core%20linting/badge.svg)](https://github.com/nf-core/pgdb/actions)
@@ -36,7 +36,7 @@ The pipeline is built using [Nextflow](https://www.nextflow.io), a workflow tool
 
 
     ```bash
-    nextflow run nf-core/pgdb -profile --input '*_R{1,2}.fastq.gz' --genome GRCh37
+    nextflow run nf-core/pgdb -profile --ensembl_name homo_sapines --ensembl false
     ```
 
 See [usage docs](https://nf-co.re/pgdb/usage) for all of the available options when running the pipeline.
@@ -45,10 +45,12 @@ See [usage docs](https://nf-co.re/pgdb/usage) for all of the available options w
 
 By default, the pipeline currently performs the following:
 
-
+![ProteoGenomics Database](/docs/images/pgdb-databases.png)
 
-* Sequencing quality control (`FastQC`)
-* Overall pipeline run summaries (`MultiQC`)
+* Download protein databases from ENSEMBL
+* Translate from Genomics Variant databases into ProteoGenomics Databases (`COSMIC`, `GNOMAD`)
+* Add to a Reference proteomics database, non-coding RNAs + pseudogenes.
+* Compute Decoy for a proteogenomics databases
 
 ## Documentation
 
diff --git a/docs/images/pgdb-databases.png b/docs/images/pgdb-databases.png
new file mode 100644
index 00000000..14910b43
Binary files /dev/null and b/docs/images/pgdb-databases.png differ