Commit

Update README.md
aadamk committed Sep 18, 2024
1 parent f2e0c5c commit 780d9b2
Showing 1 changed file with 2 additions and 117 deletions.
119 changes: 2 additions & 117 deletions analyses/methylation_analysis/README.md
@@ -29,7 +29,7 @@ c. gene body + promoter
../intNMF/results/intnmf_clusters.tsv
# methylation annotation file
../../data/v12/infinium.gencode.v39.probe.annotations.tsv.gz
../../data/v15/infinium.gencode.v39.probe.annotations.tsv.gz
# methylation m-values subsetted to cohort of interest
../data_preparation/data
@@ -100,7 +100,7 @@ c. gene body + promoter
../intNMF/results/intnmf_clusters.tsv
# methylation annotation file
../../data/v12/infinium.gencode.v39.probe.annotations.tsv.gz
../../data/v15/infinium.gencode.v39.probe.annotations.tsv.gz
# methylation m-values subsetted to cohort of interest
../data_preparation/data
@@ -137,118 +137,3 @@ plots
└── promoter_gsaregion_pathways.pdf
```

### Differentially methylated region analysis (DMRcate) + Pathway Enrichment (fgsea)

`05-dmr_fgsea_analysis.R`: This script loads the full medulloblastoma methylation M-values dataset and filters it to probes in promoter regions and gene bodies (introns and exons). Next, region-level differential methylation analysis is performed with `DMRcate::dmrcate` using a cluster-of-interest vs. 'rest' approach, specifying the 'EPIC' array. Finally, `fgsea::fgsea` is run on the genes overlapping the resulting differentially methylated regions to determine cluster-specific pathway differences.
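
The cluster-of-interest vs. 'rest' comparison described above can be sketched in base R. This is only an illustration under stated assumptions: the sample IDs, the cluster labels, and the helper `make_contrast` are hypothetical and are not taken from `05-dmr_fgsea_analysis.R`; the DMRcate and fgsea calls themselves are not shown.

```r
# Hypothetical cluster assignments; the real ones would come from
# ../intNMF/results/intnmf_clusters.tsv (column names here are assumptions).
clusters <- data.frame(
  sample_id = paste0("S", 1:6),
  cluster   = c(1, 1, 2, 2, 3, 3),
  stringsAsFactors = FALSE
)

# Hypothetical helper: label every sample as "cluster" (the cluster of
# interest) or "rest" (all other clusters), with "rest" as the reference level.
make_contrast <- function(cluster_vec, cluster_of_interest) {
  factor(
    ifelse(cluster_vec == cluster_of_interest, "cluster", "rest"),
    levels = c("rest", "cluster")
  )
}

# One two-level grouping per cluster, so each cluster is tested against the rest.
cluster_ids <- sort(unique(clusters$cluster))
contrasts_per_cluster <- lapply(
  cluster_ids,
  function(k) make_contrast(clusters$cluster, k)
)
names(contrasts_per_cluster) <- cluster_ids

# Design matrix for cluster 1: an intercept plus an indicator column
# ("grpcluster") that is 1 for samples in the cluster of interest.
grp <- contrasts_per_cluster[["1"]]
design_1 <- model.matrix(~ grp)
```

A design matrix of this shape is the kind of input a per-cluster differential test would take; repeating it for each cluster ID gives one comparison per cluster.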

Tests of interest:
a. gene body only
b. promoter only
c. gene body + promoter

```
# intNMF derived clusters
../intNMF/results/intnmf_clusters.tsv
# methylation annotation file
../../data/v12/infinium.gencode.v39.probe.annotations.tsv.gz
# methylation m-values subsetted to cohort of interest
../data_preparation/data
└── methyl-m-values.rds
```

TSV file of all significant pathways (`FDR < 0.05`) for each cluster:

```
results
└── dmr_fgsea_output
    ├── hallmark
    │   ├── gene_body_fgsea_output_per_cluster.tsv
    │   ├── genebody_promoter_fgsea_output_per_cluster.tsv
    │   └── promoter_fgsea_output_per_cluster.tsv
    └── reactome
        ├── gene_body_fgsea_output_per_cluster.tsv
        ├── genebody_promoter_fgsea_output_per_cluster.tsv
        └── promoter_fgsea_output_per_cluster.tsv
```

Barplots of top 50 pathways (`FDR < 0.05`) enriched in each cluster:
```
plots
└── dmr_fgsea_output
    ├── hallmark
    │   ├── gene_body_fgsea_pathways.pdf
    │   ├── genebody_promoter_fgsea_pathways.pdf
    │   └── promoter_fgsea_pathways.pdf
    └── reactome
        ├── gene_body_fgsea_pathways.pdf
        ├── genebody_promoter_fgsea_pathways.pdf
        └── promoter_fgsea_pathways.pdf
```

### Integration of MethReg output with DIABLO

`06-methreg_diablo_integration.R`: This script takes the triplet information from the MethReg analysis (i.e. prioritization of differentially methylated CpG sites) and computes the intersection between DIABLO features with non-zero loadings and the functional probes that pass the threshold `RLM_DNAmGroup:TF_fdr` < 0.05 or `RLM_DNAmGroup_fdr` < 0.05.
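
As a rough illustration of the intersection step, the base R sketch below filters a toy triplet table on the two FDR columns named above and keeps the probes that also have a non-zero loading. The probe IDs, the `probeID` column name, the loading values, and the reading of the threshold as "either column below 0.05" are all assumptions made for the example; the script may structure this differently.

```r
# Hypothetical MethReg triplet table. Only the two FDR column names come from
# the README text; `probeID` and all values are made up for illustration.
triplets <- data.frame(
  probeID                = c("cg01", "cg02", "cg03", "cg04"),
  `RLM_DNAmGroup:TF_fdr` = c(0.01, 0.20, 0.30, 0.04),
  RLM_DNAmGroup_fdr      = c(0.50, 0.03, 0.40, 0.60),
  check.names = FALSE,
  stringsAsFactors = FALSE
)

# Hypothetical DIABLO loadings for one component of the methylation block.
loadings <- c(cg01 = 0.4, cg02 = 0, cg03 = -0.2, cg04 = 0, cg05 = 0.1)

# Functional probes: either FDR column is below 0.05 (assumed interpretation).
passes_fdr <- triplets[["RLM_DNAmGroup:TF_fdr"]] < 0.05 |
  triplets[["RLM_DNAmGroup_fdr"]] < 0.05
functional_probes <- triplets$probeID[passes_fdr]

# Probes that DIABLO selected on this component (non-zero loading).
selected_probes <- names(loadings)[loadings != 0]

# Keep only the triplets whose probe appears in both sets.
shared_probes <- intersect(functional_probes, selected_probes)
intersection <- triplets[triplets$probeID %in% shared_probes, ]
```

Running the same filter once per DIABLO component would yield one table per component, which matches the `comp{component_number}` naming of the output files.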

#### Inputs

```
# formatted methylation matrix for MethReg analysis
results/methreg_output
└── methylation_matrix_methreg.rds
# formatted expression matrix for MethReg analysis
results/methreg_output
└── expression_matrix_methreg.rds
# output of functionally relevant triplets
results/methreg_output
└── triplet_nearest_gene_interactions_stratified_model.tsv
# output of DIABLO descriptive analysis
../diablo/results/intnmf/descriptive
└── diablo_MB.rds
```

#### Outputs

```
# tsv file of triplets intersecting with DIABLO
results
└── methreg_output
    └── diablo_integration
        └── comp{component_number}_diablo_methreg_intersection.tsv
# interaction model for probes intersecting with DIABLO
plots
└── methreg_output
    └── diablo_integration
        └── comp{component_number}_diablo_methreg_intersection.pdf
```

### Integration of Limma output with DIABLO

`06-limma_diablo_integration.R`: This script takes the differentially methylated CpG sites obtained from `01-limma_analysis.R` and computes their intersection with DIABLO features that have non-zero loadings.

#### Inputs

```
# output of differentially methylated CpG sites
results/limma_output
└── genebody_promoter_diffexpr_probes_per_cluster.tsv
# output of DIABLO descriptive analysis
../diablo/results/intnmf/descriptive
└── diablo_MB.rds
```

#### Outputs

```
# tsv file of differential CpGs intersecting with DIABLO
results
└── limma_output
    └── diablo_integration
        └── comp{component_number}_diablo_limma_intersection.tsv
```
