Scripts for the 10X multiome analysis (RNA + ATAC) in trisomy 21 and disomy foetal liver samples, as related to the following manuscript:
Marderstein, A.R. et al. (2024). Single-cell multi-omics map of human foetal blood in Down's Syndrome. Nature.
- SCAVENGE (includes the primary script for running SCAVENGE)
- SCENT (includes any scripts relevant to the SCENT analysis)
- MIRA (Python notebooks detailing the MIRA analysis led by Jon Bezney)
- etc (includes misc scripts)
- andrew (draft scripts, currently kept for housekeeping purposes)
- Introduction
- Scripts
  - MIRA
  - Motifs
  - SCENT
    - Preprocessing for SCENT Input
    - Running SCENT - Bash Parallel Submission
    - Running SCENT - R Parallelization
    - Running SCENT - R Function
    - Postprocessing of SCENT Output
    - Peak-gene Links - Chromosomal Distribution (Histogram)
    - Peak-gene Links - Number of Discoveries (Barplot)
    - Accessibility-by-Trisomy Interaction - Number of Discoveries (Barplot)
    - Summary Table
    - Peak-gene Links + RBC GWAS Enrichments (Forest Plot)
    - Misc Script for Analyses
  - SCAVENGE
  - Somatic
  - Other
    - QC-related
    - Multiome Celltype Freq - Differential Abundance
    - Multiome Celltype Freq - Comparison with scRNA
    - GATA1 Accessibility
    - Plot Peak Accessibility + Gene Expression Tracks for TFR2, TSPAN32
    - Pseudobulk Expression of HSCs in Large scRNA-seq Data
    - Intersect Peaks with Fine-Mapped GWAS SNPs
    - TFR2 Overexpression Experiments (Statistical Analysis)
    - Anndata to Seurat
  - Etc
Input data for the scripts are based on the following datasets, which have been deposited on ArrayExpress:
- scRNA-seq FASTQ raw data and CellRanger count matrices (accession number E-MTAB-13067)
- 10x Visium FASTQ raw data, SpaceRanger count matrices, run summary metrics, and spatiality outputs (E-MTAB-13062)
- Multiome snRNA-seq and snATAC-seq FASTQ raw data, CellRanger ARC count matrices, and ATAC fragment files (E-MTAB-13070).
Before running a script, install the packages listed in its header.
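As a rough aid for this step, the sketch below lists the packages an R script loads near its top. It is not part of this repository: the function name `r_packages_in_header`, the 50-line header cutoff, and the example script text are all assumptions made for illustration, and it only recognises plain `library()`, `require()`, and `requireNamespace()` calls.

```python
import re

# Matches library(pkg), library("pkg"), require(pkg), requireNamespace("pkg").
# Only the package name (first argument) is captured.
_PKG_CALL = re.compile(
    r"""\b(?:library|require|requireNamespace)\(\s*["']?([A-Za-z][A-Za-z0-9._]*)"""
)


def r_packages_in_header(script_text, max_lines=50):
    """Return package names loaded in the first `max_lines` lines of an R script.

    `max_lines` is an arbitrary cutoff (assumption); names are returned in
    order of first appearance, without duplicates.
    """
    found = []
    for line in script_text.splitlines()[:max_lines]:
        code = line.split("#", 1)[0]  # drop R comments before matching
        for name in _PKG_CALL.findall(code):
            if name not in found:
                found.append(name)
    return found


# Hypothetical script header, used only to demonstrate the function.
example = """\
# hypothetical header
library(Seurat)
library("Signac")
suppressPackageStartupMessages(require(data.table))
# library(unused)
x <- 1
"""

print(r_packages_in_header(example))
```

To apply this to a real script, read the file contents (for example with `pathlib.Path(path).read_text()`) and pass the text to the function. Packages loaded through other mechanisms, such as `pkg::fun` calls, would not be detected.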
Can't find code relevant to the analysis that you are interested in? Please look here first:
- GitLab repository for spatial transcriptomics and other scRNA-seq analyses
- GitHub repository for 10X multiome analyses
Please reach me at [email protected] if there are questions about the analysis.
This folder contains four Jupyter notebooks covering all of the analysis related to the MIRA and CellRank trajectories.
This folder contains scripts relevant to TF motif analyses.
- pre_scent.R: Uses assemble_parallel_files.R to create the SCENT input
- run_scent_wrapper.sh: Wrapper for running SCENT (using run_scent.R, which in turn uses SCENT.R)
- run_scent.R
- SCENT.R
- post_scent.R: Compiles all SCENT output into a single file
- peak_gene_summarize_plot.R
- multiome_t21_v_h_barplot.R
- interaction_barplot.R
- create_scent_supp_table.R
- peak_gene_enrichment_plot.R
- post_int.R
- SCAVENGE.R
- finemap_liftover_scavenge.R
- scavenge_statistical_analysis.R
- TRS_by_branch_plots.R
- hasaart_liftover.sh
- qc_metrics_supptab.R, qc_metrics.R
- multiome_prop.R
- multiome_scrna_comp.R
- GATA1_geneactivity.R
- plot_region_of_interest.R
- HSC_pb.R
- rbc_peaks_overlap.sh
- TFR2_experiment.R
- anndata_to_seurat.R
The etc directory contains scripts for various other analyses, such as somatic enrichments, differential analyses, and UMAPs.