Primary T-cell screens project code

This README describes each analysis code file used to analyze the data for the paper "linking candidate causal autoimmune variants to T-cell networks using genetic and epigenetic screens".

20240913_eliminate_SNP.Rmd

This .Rmd removes a SNP that appeared in the results but should not have, because it was not in the sequencing library.
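A minimal sketch of this kind of filter, written in Python rather than the R used in the .Rmd. The column name `SNP`, the `log_skew` field, and the rsIDs are hypothetical placeholders, not values taken from the repository.

```python
# Hypothetical results rows; in practice these would be read from the MPRA results file.
results = [
    {"SNP": "rs0000001", "log_skew": 0.42},
    {"SNP": "rs0000002", "log_skew": -0.10},
    {"SNP": "rs0000003", "log_skew": 1.05},
]

# Placeholder ID standing in for the variant that was not in the sequencing library.
SNP_TO_REMOVE = "rs0000002"


def drop_snp(rows, snp_id, key="SNP"):
    """Return the rows whose ID column does not match snp_id."""
    return [row for row in rows if row[key] != snp_id]


filtered = drop_snp(results, SNP_TO_REMOVE)
print(len(results), "->", len(filtered))
```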

20240913_mpra_merge_creation_FINAL.Rmd

This Rmd takes the basic MPRA information and contextualizes it with linkage disequilibrium, epigenetic, and transcription factor binding data. This creates the expanded table called MPRA merge.
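The merge can be pictured as a series of left joins onto the MPRA table, so that every tested variant is kept whether or not it has an annotation. The sketch below is in Python with made-up rows; the key `SNP` and the columns `ld_r2` and `in_dhs` are assumptions for illustration and are not the column names used in the .Rmd.

```python
def left_join(base_rows, annotation_rows, key):
    """Left-join annotation_rows onto base_rows by key.

    Base rows without a match get None in every annotation column.
    """
    lookup = {row[key]: row for row in annotation_rows}
    # Collect annotation column names from every annotation row, minus the key.
    columns = []
    for row in annotation_rows:
        for name in row:
            if name != key and name not in columns:
                columns.append(name)
    merged = []
    for row in base_rows:
        match = lookup.get(row[key], {})
        new_row = dict(row)
        for name in columns:
            new_row[name] = match.get(name)
        merged.append(new_row)
    return merged


# Hypothetical inputs.
mpra = [{"SNP": "rs1", "log_skew": 0.8}, {"SNP": "rs2", "log_skew": -0.2}]
ld = [{"SNP": "rs1", "ld_r2": 0.95}]
dhs = [{"SNP": "rs1", "in_dhs": True}, {"SNP": "rs2", "in_dhs": False}]

# Chain the joins: MPRA + LD, then + DHS overlap.
mpra_merge = left_join(left_join(mpra, ld, "SNP"), dhs, "SNP")
for row in mpra_merge:
    print(row)
```

A left join is used here so that variants with no annotation stay in the table with empty values instead of being dropped.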

20240913_mpra_hg_19_to_38_final.Rmd

After the MPRA merge table is created, I incorporated human genome liftover data to produce separate hg19, hg38, and combined hg19 and hg38 tables. I have already done this and both columns appear in the final table, so you don't need to do this again.

20240916_mouri_et_al_replication_code_in_line.Rmd

This .Rmd contains code to replicate the MPRA analysis of the previously published Jurkat T-cell line data (Mouri, K., Guo, M.H., de Boer, C.G. et al. Prioritization of autoimmune disease-associated genetic variants that perturb regulatory element activity in T cells. Nat Genet 54, 603–612 (2022)), as well as some plots of our own. The analyses include:

Comparisons between the Jurkat and primary T-cell data using venn diagrams, tables and plotting allelic bias

Enrichments for epigenetic data including DHS, ATAC-seq, caQTL, histone marks, etc.

A plot describing the enrichment of MPRA emVars for PICS fine-mapped variants

An initial motifbreakR analysis describing the enrichment for transcription factor binding sites
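As an illustration of the first comparison in the list above, the sketch below counts the three regions of a two-set Venn diagram and checks how often shared variants show allelic bias in the same direction. It is written in Python with invented rsIDs and skew values; the direction check is an assumption about what comparing allelic bias could look like, not a description of the plots in the .Rmd.

```python
def venn_counts(set_a, set_b):
    """Sizes of the three regions of a two-set Venn diagram."""
    a, b = set(set_a), set(set_b)
    return {"a_only": len(a - b), "shared": len(a & b), "b_only": len(b - a)}


def same_direction_fraction(skew_a, skew_b):
    """Fraction of shared variants whose allelic skew has the same sign in both datasets."""
    shared = skew_a.keys() & skew_b.keys()
    if not shared:
        return None
    agree = sum(1 for snp in shared if skew_a[snp] * skew_b[snp] > 0)
    return agree / len(shared)


# Hypothetical emVar calls and log skew values.
jurkat_emvars = {"rs1", "rs2", "rs3"}
tcell_emvars = {"rs2", "rs3", "rs4", "rs5"}
jurkat_skew = {"rs2": 0.5, "rs3": -0.7}
tcell_skew = {"rs2": 0.9, "rs3": 0.3, "rs4": 1.1}

counts = venn_counts(jurkat_emvars, tcell_emvars)
concordance = same_direction_fraction(jurkat_skew, tcell_skew)
print(counts, concordance)
```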

20240820_DHS_precision_recall_grid.Rmd

This Rmd contains the grid search used to estimate the cut-offs for high-activity variants (p-CREs) and allele-specific expression variants (emVars).
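A rough sketch of a precision-recall grid search, in Python. It assumes that DHS overlap serves as the positive label (suggested by the file name, but not confirmed here) and that a variant is called when it passes both an activity cut-off and an absolute allelic skew cut-off. The field names `activity`, `skew`, and `in_dhs` and all values are hypothetical.

```python
from itertools import product


def precision_recall(called, positive):
    """Precision and recall of boolean calls against boolean labels."""
    true_pos = sum(1 for c, p in zip(called, positive) if c and p)
    n_called = sum(called)
    n_positive = sum(positive)
    precision = true_pos / n_called if n_called else 0.0
    recall = true_pos / n_positive if n_positive else 0.0
    return precision, recall


def grid_search(variants, activity_cutoffs, skew_cutoffs):
    """Evaluate every (activity, skew) cut-off pair against the DHS labels."""
    positive = [v["in_dhs"] for v in variants]
    grid = []
    for a_cut, s_cut in product(activity_cutoffs, skew_cutoffs):
        called = [
            v["activity"] >= a_cut and abs(v["skew"]) >= s_cut for v in variants
        ]
        precision, recall = precision_recall(called, positive)
        grid.append(
            {
                "activity_cutoff": a_cut,
                "skew_cutoff": s_cut,
                "precision": precision,
                "recall": recall,
            }
        )
    return grid


# Hypothetical variants.
variants = [
    {"activity": 2.0, "skew": 0.9, "in_dhs": True},
    {"activity": 1.5, "skew": 0.2, "in_dhs": True},
    {"activity": 0.5, "skew": 0.8, "in_dhs": False},
    {"activity": 0.2, "skew": 0.1, "in_dhs": False},
]

grid = grid_search(variants, activity_cutoffs=[1.0, 1.8], skew_cutoffs=[0.0, 0.5])
for row in grid:
    print(row)
```

How the final cut-offs are picked from the grid (for example by a target precision or by a combined score) is left open, since the README does not state the criterion.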

20240913_motifbeakr_enrichment_analysis_FINAL.Rmd

This Rmd contains the transcription factor binding analysis of the MPRA data. The steps to this analysis include:

1. Create a GRanges BED file of the variants tested in the MPRA
2. Run the motifbreakR function to generate TF binding data on the MPRA variants
3. Merge the motifbreakR and MPRA data
4. Run a t-test of primary T-cell MPRA expression between variants which do and do not bind to each TF
5. Repeat step 4 with the Jurkat MPRA expression data
6. Run the t-test analysis for variants fine-mapped to each disease
7. Merge the primary T-cell and unstimulated Jurkat data
8. Compare the results of Jurkat and primary T cells
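The per-TF t-test step can be sketched as follows, in Python. The sketch uses Welch's unequal-variance form, which is an assumption (it is the default of R's `t.test`, but the README does not say which form is used). It computes only the statistic and degrees of freedom; a p-value needs a t distribution, for example from `scipy.stats.ttest_ind(..., equal_var=False)`. The field names and values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(x, y):
    """Welch's t statistic and Welch-Satterthwaite degrees of freedom.

    Returns None when both groups have zero variance.
    """
    vx = variance(x) / len(x)
    vy = variance(y) / len(y)
    if vx + vy == 0:
        return None
    t = (mean(x) - mean(y)) / sqrt(vx + vy)
    df = (vx + vy) ** 2 / (vx**2 / (len(x) - 1) + vy**2 / (len(y) - 1))
    return t, df


def ttest_by_tf(rows, min_group_size=2):
    """Compare log skew of variants that do and do not disrupt each TF motif."""
    all_tfs = sorted(set().union(*(row["tfs"] for row in rows)))
    results = {}
    for tf in all_tfs:
        bound = [r["log_skew"] for r in rows if tf in r["tfs"]]
        unbound = [r["log_skew"] for r in rows if tf not in r["tfs"]]
        # Skip TFs where either group is too small to estimate a variance.
        if len(bound) < min_group_size or len(unbound) < min_group_size:
            continue
        results[tf] = welch_t(bound, unbound)
    return results


# Hypothetical merged motifbreakR + MPRA rows.
rows = [
    {"SNP": "rs1", "log_skew": 1.0, "tfs": {"TF_A"}},
    {"SNP": "rs2", "log_skew": 2.0, "tfs": {"TF_A"}},
    {"SNP": "rs3", "log_skew": 3.0, "tfs": {"TF_A", "TF_B"}},
    {"SNP": "rs4", "log_skew": 0.0, "tfs": set()},
    {"SNP": "rs5", "log_skew": 1.0, "tfs": set()},
    {"SNP": "rs6", "log_skew": 2.0, "tfs": {"TF_B"}},
]

tf_results = ttest_by_tf(rows)
for tf, result in tf_results.items():
    print(tf, result)
```

Running the same function on a Jurkat table and joining the two result dictionaries by TF name would give the side-by-side comparison described in the last two steps.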

20240914_tf_columns_mpra_merge.Rmd

After the TF data are created in the previous .Rmd, this .Rmd creates the TF columns which are used in MPRA merge. This .Rmd incorporates data from two TF binding site programs, motifbreakR and ANANASTRA.
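One way to picture this step is collapsing per-variant TF hits from each program into a single text column. The Python sketch below does this with invented hits; the column names `motifbreakr_tfs` and `ananastra_tfs` and the `;` separator are assumptions, not the names used in MPRA merge.

```python
from collections import defaultdict


def collapse_hits(hits):
    """Collapse (SNP, TF) pairs into one sorted, ';'-joined string per SNP."""
    by_snp = defaultdict(set)
    for snp, tf in hits:
        by_snp[snp].add(tf)
    return {snp: ";".join(sorted(tfs)) for snp, tfs in by_snp.items()}


def add_tf_columns(rows, motifbreakr_hits, ananastra_hits, key="SNP"):
    """Add one TF column per program; variants with no hit get an empty string."""
    mb = collapse_hits(motifbreakr_hits)
    an = collapse_hits(ananastra_hits)
    return [
        {
            **row,
            "motifbreakr_tfs": mb.get(row[key], ""),
            "ananastra_tfs": an.get(row[key], ""),
        }
        for row in rows
    ]


# Hypothetical inputs.
mpra_rows = [{"SNP": "rs1"}, {"SNP": "rs2"}, {"SNP": "rs3"}]
motifbreakr_hits = [("rs1", "TF_A"), ("rs1", "TF_B"), ("rs2", "TF_A")]
ananastra_hits = [("rs1", "TF_B"), ("rs3", "TF_C")]

with_tfs = add_tf_columns(mpra_rows, motifbreakr_hits, ananastra_hits)
for row in with_tfs:
    print(row)
```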

20240914_UK_biobank_finemapping_enrichment.Rmd

This Rmd contains the enrichment of MPRA emVars for variants fine-mapped in the UK Biobank (UKBB) fine-mapping data. The steps of this analysis include:

Import the UKBB data and merge with MPRA data.

Create a table with MPRA variants and the UKBB data for the paper.

Create the enrichment plots for MPRA emVars in UKBB data.
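The enrichment step can be illustrated with a 2x2 table, a fold enrichment, and a one-sided Fisher's exact test. This Python sketch is not the method in the .Rmd: the posterior inclusion probability (PIP) threshold of 0.1, the field names, and the choice of Fisher's test are all assumptions made for the example.

```python
from math import comb


def two_by_two(rows, pip_threshold):
    """Counts: (emVar & fine-mapped, emVar & not, non-emVar & fine-mapped, non-emVar & not)."""
    a = b = c = d = 0
    for row in rows:
        fine_mapped = row["pip"] >= pip_threshold
        if row["is_emvar"] and fine_mapped:
            a += 1
        elif row["is_emvar"]:
            b += 1
        elif fine_mapped:
            c += 1
        else:
            d += 1
    return a, b, c, d


def fold_enrichment(a, b, c, d):
    """Fine-mapped fraction among emVars divided by the fraction among non-emVars."""
    if a + b == 0 or c + d == 0:
        return None
    if c == 0:
        return float("inf")
    return (a / (a + b)) / (c / (c + d))


def fisher_greater(a, b, c, d):
    """One-sided Fisher's exact p-value for seeing at least `a` in the top-left cell."""
    n, row1, col1 = a + b + c + d, a + b, a + c
    tail = sum(
        comb(row1, k) * comb(n - row1, col1 - k)
        for k in range(a, min(row1, col1) + 1)
    )
    return tail / comb(n, col1)


# Hypothetical merged MPRA + UKBB rows.
rows = [
    {"is_emvar": True, "pip": 0.9},
    {"is_emvar": True, "pip": 0.5},
    {"is_emvar": True, "pip": 0.2},
    {"is_emvar": True, "pip": 0.01},
    {"is_emvar": False, "pip": 0.3},
    {"is_emvar": False, "pip": 0.0},
    {"is_emvar": False, "pip": 0.02},
    {"is_emvar": False, "pip": 0.05},
]

table = two_by_two(rows, pip_threshold=0.1)
fold = fold_enrichment(*table)
p_value = fisher_greater(*table)
print(table, fold, p_value)
```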

20240913_mpra_supplementary_tables.Rmd

Finally, using all the MPRA-related tables created so far, I put the tables into the final format which appears in the paper. Here are all the tables created in this file:

Note: this is not the order of the actual supplementary tables.

Tcell MPRA results

Jurkat MPRA results

PICS enrichment all loci

PICS enrichment emvars loci

UK biobank enrichment all loci

UK biobank enrichment emvars loci

tcell motifbreakr mpra combined

tcell motifbreakr logskew ttest

jurkat motifbreakr mpra combined

jurkat motifbreakr logskew ttest

ChromHMM enrich

Histone CAGE DHS enr

T cell MPRA functional annotations

PICS by MPRA

UKBB by MPRA

Jurkat MPRA functional annotations

Encode DHS Enrichment

T-cell DHS Grid Search

Jurkat DHS Grid Search

Tcell TF ttest by disease

Raylab Analysis V2G Jupyter Notebook

This Jupyter notebook generates variant-to-gene (V2G) mapping for rsIDs of interest. Key steps include:

Converting rsIDs to variant IDs using genopyc
Mapping variants to genes with the V2G otargen pipeline
Processing T cell expression data from the DICE database
Filtering V2G output based on cell-specific expression
Creating background and foreground datasets for network analysis
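The last two steps above can be sketched in plain Python. The sketch assumes the V2G output and the DICE expression values are already loaded; the field names, the 1.0 TPM cut-off, and the definitions of foreground (expressed genes linked to a variant) and background (all expressed genes) are assumptions for illustration, and no genopyc or otargen calls are shown.

```python
def filter_v2g_by_expression(v2g_rows, expression, min_tpm):
    """Keep variant-gene pairs whose gene is expressed at or above min_tpm."""
    return [
        row for row in v2g_rows if expression.get(row["gene"], 0.0) >= min_tpm
    ]


def gene_sets(filtered_rows, expression, min_tpm):
    """Foreground: expressed genes linked to a variant. Background: all expressed genes."""
    foreground = {row["gene"] for row in filtered_rows}
    background = {gene for gene, tpm in expression.items() if tpm >= min_tpm}
    return foreground, background


# Hypothetical V2G output and expression values (TPM) for one T cell type.
v2g_rows = [
    {"rsid": "rs1", "gene": "GENE_A", "v2g_score": 0.4},
    {"rsid": "rs1", "gene": "GENE_B", "v2g_score": 0.3},
    {"rsid": "rs2", "gene": "GENE_C", "v2g_score": 0.6},
    {"rsid": "rs3", "gene": "GENE_D", "v2g_score": 0.2},
]
expression = {"GENE_A": 25.0, "GENE_B": 0.3, "GENE_C": 8.0, "GENE_E": 12.0}

MIN_TPM = 1.0  # placeholder threshold
filtered_v2g = filter_v2g_by_expression(v2g_rows, expression, MIN_TPM)
foreground, background = gene_sets(filtered_v2g, expression, MIN_TPM)
print(sorted(foreground), sorted(background))
```

Genes missing from the expression table are treated as not expressed here, which drops `GENE_D` in the example.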

Requires Python (pandas, genopyc, polars) and R (otargen, purrr, dplyr, readr) libraries. Outputs include filtered V2G data and gene sets for further analysis.

CRISPR_screen_analysis.Rmd

This R Markdown file uses Seurat and SCEPTRE to analyze the single-cell CRISPR screen data.

Repository: BenaroyaResearch/primary_T_MPRA
