From 5ca03bdfe59aeabe3a40bf03ec8d59fbb07f681a Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 11:38:58 -0400
Subject: [PATCH 1/6] histology-specific readme

---
 analyses/histology-specific-splicing/README.md | 8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

diff --git a/analyses/histology-specific-splicing/README.md b/analyses/histology-specific-splicing/README.md
index 4a077a1b3..d62d6bdac 100644
--- a/analyses/histology-specific-splicing/README.md
+++ b/analyses/histology-specific-splicing/README.md
@@ -21,18 +21,22 @@ independent-specimens.rnaseqpanel.primary-plus.tsv
 * `run_module.sh` shell script to pre-process histology file and run analysis
 * `01-generate_hist_spec_events_tab.pl` processes rMATs output and identifies unique splicing events if it is in 2% of the histology-specific cohort
 * `02-plot_histology-specific_splicing_events.R` takes result table frmo above and generates UpSetR plots for each splicing case
+* `03-plot-histology-specific-norm-events.R` computes and plots the average number of unique events per histology
 
 ## Directory structure
 ```
 .
 ├── 01-generate_hist_spec_events_tab.pl
 ├── 02-plot_histology-specific_splicing_events.R
+├── 03-plot-histology-specific-norm-events.R
+├── README.md
 ├── plots
+│   ├── avg-uniq-hits.pdf
 │   ├── upsetR_histology-specific.ei.pdf
 │   └── upsetR_histology-specific.es.pdf
 ├── results
+│   ├── recurrent_splice_events_by_histology.tsv
 │   ├── unique_events-ei.tsv
-│   ├── unique_events-es.tsv
-│   └── splicing_events.hist-labeled_list.thr2freq.txt
+│   └── unique_events-es.tsv
 └── run_module.sh
 ```

From a30996e1324f62a3b8728b200fbfd3f34221ed2c Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 11:52:42 -0400
Subject: [PATCH 2/6] cohort summary readme

---
 analyses/cohort_summary/README.md | 32 +++++++++++++++++++++++++++++++
 1 file changed, 32 insertions(+)
 create mode 100644 analyses/cohort_summary/README.md

diff --git a/analyses/cohort_summary/README.md b/analyses/cohort_summary/README.md
new file mode 100644
index 000000000..6358b8167
--- /dev/null
+++ b/analyses/cohort_summary/README.md
@@ -0,0 +1,32 @@
+# Cohort summary
+
+Module authors: Ammar Naqvi (@naqvia)
+
+The purpose of this module is to summarize cohort in our downstream analyses.
+
+## Usage
+**Run shell script to make final tables to be used for plotting below**
+```
+bash run_module.sh
+```
+Input files (`data` folder):
+```
+histologies.tsv
+independent-specimens.rnaseqpanel.primary.tsv
+```
+
+## Folder content
+* `01-generate-cohort-summary-circos-plot.R` creates circos plot of cohort highlighting histology, CNS region and reported gender
+
+## Directory structure
+```
+.
+├── 01-generate-cohort-summary-circos-plot.R
+├── input
+│   └── plot-mapping.tsv
+├── plots
+│   └── cohort_circos.pdf
+└── results
+    ├── histologies-plot-group.tsv
+    └── plot_mapping.tsv
+```

From db1a27d7c518820ce591f064b91bcb99215b9cf5 Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 13:39:02 -0400
Subject: [PATCH 3/6] clk1 morpholino readme

---
 analyses/CLK1-splicing-impact-morpholino/README.md | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git a/analyses/CLK1-splicing-impact-morpholino/README.md b/analyses/CLK1-splicing-impact-morpholino/README.md
index 658c2bb58..dccbef20a 100644
--- a/analyses/CLK1-splicing-impact-morpholino/README.md
+++ b/analyses/CLK1-splicing-impact-morpholino/README.md
@@ -43,8 +43,10 @@ bash run_module.sh
 │   ├── CCMA_crispr_genedependency_042024.csv
 │   ├── RBP_known.txt
 │   ├── base_excision_repair.txt
+│   ├── cancerGeneList.tsv
 │   ├── dna_repair_all.txt
 │   ├── epi_known.txt
+│   ├── genelistreference.txt
 │   ├── homologous_recombination.txt
 │   ├── mismatch_repair.txt
 │   ├── morpholno.merged.rmats.tsv
@@ -72,6 +74,7 @@ bash run_module.sh
 │   ├── dPSI_distr.pdf
 │   ├── des-dex-venn-func.pdf
 │   ├── des-dex-venn.pdf
+│   ├── ds-de-crispr-venn.pdf
 │   ├── gene-fam-DE-plot.pdf
 │   ├── gsva_heatmap_dna_repair.pdf
 │   ├── gsva_heatmap_dna_repair_de.pdf
@@ -94,9 +97,9 @@ bash run_module.sh
 │   ├── ctrl_vs_treated.de.formatted.tsv
 │   ├── ctrl_vs_treated.de.tsv
 │   ├── de_genes.tsv
+│   ├── dex-sign-goi.tsv
 │   ├── differential_splice_by_goi_category.tsv
 │   ├── ds-de-crispr-events.tsv
-│   ├── ds-de-crispr-venn.pdf
 │   ├── expr_collapsed_clk1_ctrl_morpho_dna_repair_gsva_scores.tsv
 │   ├── expr_collapsed_clk1_ctrl_morpho_hallmark_gsva_scores.tsv
 │   ├── expr_collapsed_clk1_ctrl_morpho_kegg_gsva_scores.tsv
@@ -129,4 +132,4 @@ bash run_module.sh
 │   ├── splicing_events.morpho.RI.intersectUnip.ggplot.txt
 │   └── splicing_events.morpho.SE.intersectUnip.ggplot.txt
 └── run_module.sh
-```
\ No newline at end of file
+```

From 936005e324dddc7103040c37e4c4734c84d2586f Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 13:55:05 -0400
Subject: [PATCH 4/6] functional sites readme

---
 .../README.md | 30 +++++++++++++++++++
 1 file changed, 30 insertions(+)

diff --git a/analyses/splicing_events_functional_sites/README.md b/analyses/splicing_events_functional_sites/README.md
index 2255ef74a..34e9fead6 100644
--- a/analyses/splicing_events_functional_sites/README.md
+++ b/analyses/splicing_events_functional_sites/README.md
@@ -44,3 +44,33 @@ results/splicing_events.total.HGG.neg.intersectUnip.ggplot.txt
 * `03-format_for_ggplot.sh` formats and appends file into table for plotting
 * `04-plot_splicing_across_functional_sites.R` generates ggplot violin plots of average dPSI per event identidied overlapping a functional site, outputting to `plots/*png`
 * `05-plot-splice-patterns` generates plots for visualizing splicing event types into `plots` folder
+
+## Directory structure
+.
+├── 01-extract_recurrent_splicing_events_hgg.pl
+├── 02-run_bedtools_intersect.sh
+├── 02-run_bedtools_intersect.tmp.sh
+├── 03-format_for_ggplot.pl
+├── 04-plot_splicing_across_functional_sites.R
+├── 05-plot-splice-patterns.R
+├── README.md
+├── input
+│   ├── CLK1-rmats.tsv
+│   ├── gene_lists.tsv
+│   ├── unipDisulfBond.hg38.col.bed
+│   ├── unipDomain.hg38.col.bed
+│   ├── unipLocSignal.hg38.col.bed
+│   ├── unipMod.hg38.col.bed
+│   └── unipOther.hg38.col.bed
+├── plots
+│   ├── dPSI_across_functional_sites.HGG.pdf
+│   ├── dPSI_across_functional_sites_kinase.HGG.pdf
+│   ├── kinases-ora-plot.pdf
+│   └── splicing_pattern_plot.pdf
+├── results
+│   ├── kinases-functional_sites.tsv
+│   ├── splice_events.diff.SE.txt
+│   ├── splicing_events.SE.total.neg.intersectunip.ggplot.txt
+│   └── splicing_events.SE.total.pos.intersectunip.ggplot.txt
+├── run_module.sh
+└── scr

From 49c54b890c69c071a1dfda23e62c4ca6af3784e0 Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 14:03:52 -0400
Subject: [PATCH 5/6] polyA vs stranded readme

---
 analyses/stranded-polyA-assessment/README.md | 14 ++++++++++++--
 1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/analyses/stranded-polyA-assessment/README.md b/analyses/stranded-polyA-assessment/README.md
index 6e9d4c0b6..c32d9db56 100644
--- a/analyses/stranded-polyA-assessment/README.md
+++ b/analyses/stranded-polyA-assessment/README.md
@@ -5,7 +5,7 @@ This module correlates polyA and stranded PSIs from patient samples known to hav
 ## Usage
 `bash run_module`
 
-## Folder content
+## Folder content
 `01-plot-str-vs-polyA.Rmd` runs correlation analysis on samples with both stranded and polyA RNA-seq and generates scatter plots.
 
 ```
@@ -17,4 +17,14 @@ This module correlates polyA and stranded PSIs from patient samples known to hav
 │   ├── PT_RYMG3M91_polyA_v_stranded_psi.pdf
 │   └── PT_W5GP3F6B_polyA_v_stranded_psi.pdf
 └── run_module.sh
-```
\ No newline at end of file
+```
+
+## Directory structure
+.
+├── 01-plot-str-vs-polyA.Rmd
+├── 01-plot-str-vs-polyA.html
+├── README.md
+├── plots
+│   ├── PT_RYMG3M91_polyA_v_stranded_psi.pdf
+│   └── PT_W5GP3F6B_polyA_v_stranded_psi.pdf
+└── run_module.sh

From 483dc4c4699639d528e8ccda7e8889769724eaf2 Mon Sep 17 00:00:00 2001
From: "Ammar S. Naqvi"
Date: Tue, 6 Aug 2024 14:05:56 -0400
Subject: [PATCH 6/6] rm file

---
 analyses/README.md | 20 --------------------
 1 file changed, 20 deletions(-)
 delete mode 100644 analyses/README.md

diff --git a/analyses/README.md b/analyses/README.md
deleted file mode 100644
index a78529d2f..000000000
--- a/analyses/README.md
+++ /dev/null
@@ -1,20 +0,0 @@
-### Get rMATS data files
-**Run shell script to get merged rMATS result tables**
-```
-./download_data.sh
-```
-## Analysis Modules
-This directory contains various analysis modules in the pbta-splicing-hgat project.
-See the README of an individual analysis modules for more information about that module.
-
-### Modules at a glance
-The table below is intended to help project organizers quickly get an idea of what files (and therefore types of data) are consumed by each analysis module, what the module does, and what output files it produces that can be consumed by other analysis modules.
-This is in service of documenting interdependent analyses.
-
-Note that _nearly all_ modules use the harmonized clinical data file (`pbta-histologies.tsv`) even when it is not explicitly included in the table below.
-
-| Module |Brief Description |
-|--------|------------------|
-| [`psi_clustering`](https://github.com/d3b-center/pbta-splicing/tree/main/analyses/psi_clustering) | Consensus clustering of tumor splicing quantifications
-| [`splicing burden index`](https://github.com/d3b-center/pbta-splicing/tree/main/analyses/splicing_index) | Compute splicing burden indices for each tumor sample across pediatric brain histologies
-| [`splicing_events_functional_sites`](https://github.com/d3b-center/pbta-splicing/tree/main/analyses/splicing_events_functional_sites) | Identify signficiant aberrant splicing events that result in loss/gain of functional sites