Bender et al 2024 EMBO

Code documentation for our paper Bender et al (2024) EMBO, Redistribution of PU.1 partner transcription factor binding secures cell survival in leukemogenesis.

Content

00_packageVersions.Rmd Lists all R package and command line tool versions used in the Docker image we used for all analysis.
See rendered code here.
01_dataRetrieval.Rmd Uses biomaRt to fetch a lookup table of human-to-mouse orthologs necessary for some analysis scripts.
See rendered code here.
02_scRNAseq.Rmd All scRNA-seq analysis. Covers Figures 1A-F, 2A-C, and Appendix Figures S1B-J and S4B.
See rendered code here.
03_RNAseq_Celllines.Rmd Bulk RNA-seq of cell lines. Covers Appendix Figure S2D.
See rendered code here.
04_shRNAscreen.Rmd The shRNA screen analysis. Covers Figures 3A, 3B, 3C and Appendix Figure S3B.
See rendered code here.
05_BeatAMLcohort.Rmd Integration of the BeatAML cohort. Covers Figures 4A and 4B.
See rendered code here.
06_proteome.Rmd Proteome analysis from Hox cell lines. Covers Figures 2D, 2E and 5A.
See rendered code here.
07_atacseq.Rmd ATAC-seq analysis and motif search. Covers Figures 6A-D and 6I/J, plus data generation for the manually drawn Figures 6E-H.
See rendered code here.
08_chipseq.Rmd RUNX1 ChIP-seq. Covers Figures 7A-E and Appendix Figure S5A/B.
See rendered code here.
09_other.Rmd The autophagy genes from RNA-seq in ex vivo murine cells and the analysis of the low-throughput experiments. Covers Figures 3D, 4C, 5B-H, 7F/G and Appendix Figure S3F.
See rendered code here.

Run Code

Follow these steps to reproduce the figures:

Get source data

Download all preprocessed omics data (counts and metadata) from GEO and the numeric source data submitted to EMBO. The proteome dataset is provided via this repository instead, because in the version submitted to EMBO some gene names were converted to dates by Excel during publication preparation.

mkdir -p source_data && cd source_data

# scRNA-seq
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250629/suppl/GSE250629%5Fscrnaseq%5FrawCounts%5Funfiltered.mtx.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250629/suppl/GSE250629%5Fscrnaseq%5Fcoldata%5Funfiltered.tsv.gz
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250629/suppl/GSE250629%5Fscrnaseq%5Frowdata%5Funfiltered.tsv.gz

# bulk RNA-seq from cell lines (don't be confused, B22 is the URE-AML cells, I forgot to relabel this before GEO submission)
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250620/suppl/GSE250620%5Frnaseq%5Fcelllines%5FrawCounts.tsv.gz

# bulk RNA-seq from ex vivo cells
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250622/suppl/GSE250622%5Frnaseq%5FexVivo%5FrawCounts.tsv.gz

# shRNA screen
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250630/suppl/GSE250630%5Fshrna%5Fscreen%5FrawCounts.tsv.gz

# ATAC-seq
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE250nnn/GSE250619/suppl/GSE250619%5Fatacseq%5FrawCounts.tsv.gz

# ChIP-seq
wget https://ftp.ncbi.nlm.nih.gov/geo/series/GSE251nnn/GSE251672/suppl/GSE251672%5Fchipseq%5Frunx1%5FrawCounts.tsv.gz

# Source data for Figures 3, 4, 5, 7 and the appendix figures
wget https://www.ebi.ac.uk/biostudies/files/S-SCDT-10_1038-S44318-024-00295-Y/Source_Data_Figure_3.zip
wget https://www.ebi.ac.uk/biostudies/files/S-SCDT-10_1038-S44318-024-00295-Y/Source_Data_Figure_4.zip
wget https://www.ebi.ac.uk/biostudies/files/S-SCDT-10_1038-S44318-024-00295-Y/Source_Data_Figure_5.zip
wget https://www.ebi.ac.uk/biostudies/files/S-SCDT-10_1038-S44318-024-00295-Y/Source_Data_Figure_7.zip
wget https://www.ebi.ac.uk/biostudies/files/S-SCDT-10_1038-S44318-024-00295-Y/Source_Data_Appendix.zip

# Unzip and only keep unzipped data
for p in Source_Data_*.zip; do unzip "$p"; done
rm Source_Data_*.zip

# Proteome
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/Dataset_EV_7.xlsx

# ChIP-seq peaks from published, reanalyzed datasets. The code used to create them is in the preprocessing documentation.
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/LSK_PU1_IDR.txt.gz
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/GMP_PU1_IDR.txt.gz
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/GMP_CEBPA_IDR.txt.gz

# The tx2gene map that also contains the genomic coordinates of all TSS, made from the mouse GENCODE GTF file from version vM25 (Ensembl v100)
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/tx2gene.txt.gz

# The subset of all relevant bigwig files for Figure 7D as RDS file, because the actual bigwigs are too big for easy sharing. 
# Email me if you need the bigwigs; I can provide them.
wget https://github.com/ATpoint/bender_et_al_2024/raw/refs/heads/main/source_data/bigwig_signals.rds.xz
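
A quick way to confirm that the downloads completed is to check for each expected file. The sketch below is not part of the repository: the function name check_source_data is hypothetical, the file names are taken from the wget commands above (assuming wget saves %5F as an underscore in the local file name), and the unzipped Source_Data archives are not covered because their contents are not listed here.

```shell
# Hypothetical helper: report which expected files are missing or empty
# in a source_data folder (default: ./source_data).
check_source_data() {
  local dir="${1:-source_data}"
  local missing=0
  local f
  for f in \
    GSE250629_scrnaseq_rawCounts_unfiltered.mtx.gz \
    GSE250629_scrnaseq_coldata_unfiltered.tsv.gz \
    GSE250629_scrnaseq_rowdata_unfiltered.tsv.gz \
    GSE250620_rnaseq_celllines_rawCounts.tsv.gz \
    GSE250622_rnaseq_exVivo_rawCounts.tsv.gz \
    GSE250630_shrna_screen_rawCounts.tsv.gz \
    GSE250619_atacseq_rawCounts.tsv.gz \
    GSE251672_chipseq_runx1_rawCounts.tsv.gz \
    Dataset_EV_7.xlsx \
    LSK_PU1_IDR.txt.gz \
    GMP_PU1_IDR.txt.gz \
    GMP_CEBPA_IDR.txt.gz \
    tx2gene.txt.gz \
    bigwig_signals.rds.xz
  do
    # -s is true only for files that exist and are not empty,
    # so interrupted downloads that left a 0-byte file are also reported.
    if [ ! -s "${dir}/${f}" ]; then
      echo "missing or empty: ${f}"
      missing=$((missing + 1))
    fi
  done
  echo "${missing} file(s) missing"
  # Non-zero exit status if anything is missing.
  [ "${missing}" -eq 0 ]
}
```

Called from the directory that contains source_data, `check_source_data source_data` prints one line per missing file and returns a non-zero exit status if any file is missing.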

Run the Rmarkdown documents

We provide a Docker image that contains the exact software versions (R and command line applications) we used for the analysis, which allows the figures to be reproduced. Provided that Docker is installed and in PATH, run the commands below in your terminal. Here, DIR is the full path to the directory that contains the source_data folder created above as well as the Rmarkdown and R scripts from this repository:

DIR="/home/atpoint/bender_et_al_2024/"
IMAGE="atpoint/phd_project:1.9.5"
docker pull "$IMAGE" # takes some time, it's a big one due to legacy burden over many years...
docker run -d -p 8787:8787 -v "${DIR}":/projectdir -e PASSWORD=aVeryComplexPassword -e ROOT=TRUE -e IMAGE="$IMAGE" "$IMAGE"

Then open localhost:8787 in your web browser and log in with the username "rstudio" and the password set above to access the interactive RStudio Server session. Be sure to render the documents in numerical order (00 to 09), since some scripts depend on output from earlier ones.
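
As an alternative to knitting each document by hand in RStudio, the numbered documents could be rendered in sequence from a terminal inside the container. This is a sketch under assumptions, not the authors' workflow: the function names are hypothetical, and it presumes that Rscript, the rmarkdown package, and pandoc are usable from that terminal.

```shell
# Hypothetical helper: print the numbered Rmarkdown documents of a directory
# in the order they should be rendered (00_, 01_, ..., 09_).
list_rmd_in_order() {
  local dir="${1:-.}"
  local f
  # Glob expansion is sorted, so the two-digit prefixes define the order.
  for f in "${dir}"/[0-9][0-9]_*.Rmd; do
    # Skip the unexpanded pattern when no file matches.
    if [ -e "$f" ]; then
      echo "$f"
    fi
  done
}

# Hypothetical helper: render every numbered document in order and
# stop at the first one that fails.
render_all() {
  local dir="${1:-.}"
  local f
  for f in "${dir}"/[0-9][0-9]_*.Rmd; do
    if [ ! -e "$f" ]; then
      continue
    fi
    echo "rendering ${f}"
    # The file path is passed as a trailing argument to avoid quoting issues.
    Rscript -e 'rmarkdown::render(commandArgs(trailingOnly = TRUE)[1])' "$f" || return 1
  done
}
```

`list_rmd_in_order /projectdir` shows the order without running anything, and `render_all /projectdir` would render the documents in that order; /projectdir is the mount point from the docker run command above.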

If something does not work, or if you have any detailed questions, feel free to email me at a.bender<guesswhat>uni-muenster.de or open an issue.
