nicwulab/COV107-23_fitness_landscape

Analyzing the fitness landscape of COV107-23, a SARS-CoV-2 spike antibody

Code to analyze deep sequencing files of COV107-23 combinatorial mutations, and ddG simulations of COV107-23 mutations

Analysis of deep sequencing data following fluorescence-activated cell sorting of COV107-23 combinatorial mutants

Dependencies

Input files

The fastq files are too large to upload to GitHub, so they are not included in this repository. Create a /fastq/ folder, download the fastq files from NCBI, and move them into that folder.

I. Calculate counts and fitness from fastq files

  1. Calculate read counts and fitness from the fastq files. All downloaded fastq files must be stored in the /fastq/ folder.

python scripts/COV107SHM_fq2fit.py

  2. Filter results

python scripts/COV107SHM_filter_result.py

  3. Plot the correlation of expression fitness between two independent experimental replicates

Rscript scripts/COV107_filtered.R
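
As a rough illustration of what the replicate comparison in step 3 measures, the sketch below computes a Pearson correlation between per-variant expression fitness values from two replicates. It is not the code in scripts/COV107_filtered.R. The function name `pearson_r`, the variant labels, and the fitness values are all hypothetical and chosen only for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical fitness values keyed by variant label; not real data.
rep1 = {"variant_A": 1.00, "variant_B": 0.82, "variant_C": 1.10, "variant_D": 0.65}
rep2 = {"variant_A": 1.00, "variant_B": 0.79, "variant_C": 1.15, "variant_D": 0.70}

# Compare only variants observed in both replicates.
shared = sorted(set(rep1) & set(rep2))
r = pearson_r([rep1[v] for v in shared], [rep2[v] for v in shared])
print(f"Pearson r across {len(shared)} variants: {r:.3f}")
```

Restricting the comparison to variants present in both replicates avoids pairing a value with a missing measurement.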

Analysis of ddG simulations of COV107-23 mutations

Dependencies

Input files

I. Renumber PDB file to prepare for mutagenesis

  1. Using PyMOL and the pdb file (PDB: 7LKA), remove the solvent by running, in the PyMOL terminal

remove solvent

  2. After removing the solvent, keep only chain A (antibody heavy chain) and chain B (antibody light chain) by removing all other chains. Run, in the PyMOL terminal

remove chain C+D+E+F+H+L

  3. Export the resulting molecule as COV107.pdb, retaining atom IDs.

  4. Renumber the residues in COV107.pdb with pdb_reres.py from pdb-tools by running, in the terminal

python pdb_reres.py COV107.pdb > COV107_renum.pdb

COV107_renum.pdb is now ready to be used as input for ddG prediction using Rosetta.
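
To make the renumbering step concrete, here is a minimal sketch of sequential residue renumbering on fixed-column PDB records. It is not the pdb-tools implementation and is not meant to replace pdb_reres.py. The names `renumber_residues` and `atom_line` are hypothetical, only ATOM and HETATM records are handled, and blanking the insertion code is a choice made for this sketch.

```python
def renumber_residues(pdb_lines, start=1):
    """Renumber residues sequentially in ATOM/HETATM records.

    A new number is assigned each time the residue name, chain ID,
    residue number, or insertion code differs from the previous record.
    Other record types are returned unchanged.
    """
    renumbered = []
    previous_key = None
    resid = start - 1
    for raw in pdb_lines:
        line = raw.rstrip("\n")
        if line.startswith(("ATOM", "HETATM")):
            line = line.ljust(27)
            key = line[17:27]  # resName, chainID, resSeq, iCode columns
            if key != previous_key:
                previous_key = key
                resid += 1
            # Write the new number into columns 23-26 and blank the iCode.
            line = line[:22] + str(resid).rjust(4) + " " + line[27:]
        renumbered.append(line)
    return renumbered


def atom_line(serial, name, resname, chain, resseq, icode=" "):
    """Build a minimal ATOM record for the demo (coordinates are dummies)."""
    return (f"ATOM  {serial:>5} {name:<4} {resname:>3} {chain}"
            f"{resseq:>4}{icode}   {0.0:8.3f}{0.0:8.3f}{0.0:8.3f}")


# Made-up residues, including one with an insertion code (100A).
demo = [
    atom_line(1, " N", "PHE", "A", 27),
    atom_line(2, " CA", "PHE", "A", 27),
    atom_line(3, " N", "TYR", "A", 100),
    atom_line(4, " N", "GLY", "A", 100, "A"),
]
for line in renumber_residues(demo):
    print(line[17:27])
```

Residues that share a number but differ in insertion code (100 and 100A in the demo) each receive their own integer, which is the kind of irregular numbering that sequential renumbering removes.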

II. Predicting ddG using a modified high-resolution protocol of the ddG_monomer application in Rosetta

Link to ddG_monomer documentation: https://www.rosettacommons.org/docs/latest/application_documentation/analysis/ddg-monomer

The protocol was modified to perform 30 iterations instead of 50.

  1. Pre-minimize the input structure COV107_renum.pdb

nohup /path/to/rosetta/main/source/bin/minimize_with_cst.static.linuxgccrelease -s /path/to/COV107_renum.pdb -in:file:fullatom -ignore_zero_occupancy false -ignore_unrecognized_res -fa_max_dis 9.0 -database /path/to/rosetta/main/database/ -ddg::harmonic_ca_tether 0.5 -score:weights /path/to/rosetta/main/database/scoring/weights/pre_talaris_2013_standard.wts -restore_pre_talaris_2013_behavior -ddg::constraint_weight 1.0 -ddg::out_pdb_prefix min_cst_0.5 -ddg::sc_min_only false -score:patch /path/to/rosetta/main/database/scoring/weights/score12.wts_patch > mincst.log 2>&1 </dev/null &

  2. Convert the .log file to a .cst file

bash /path/to/rosetta/main/source/src/apps/public/ddg/convert_to_cst_file.sh ./mincst.log > ./Constraint.cst

  3. Run the ddG prediction in the background. The command below is for mutation F27I, replicate 1, as indicated by F27I.resfile and the F27I_rep1 output path. Perform 3 independent replicates.

nohup /path/to/rosetta/main/source/bin/ddg_monomer.static.linuxgccrelease -in:file:s /path/to/min_cst_0.5.COV107_renum_0001.pdb -ignore_zero_occupancy false -resfile F27I.resfile -ddg:weight_file soft_rep_design -ddg:minimization_scorefunction /path/to/rosetta/main/database/scoring/weights/pre_talaris_2013_standard.wts -restore_pre_talaris_2013_behavior -ddg::minimization_patch /path/to/rosetta/main/database/scoring/weights/score12.wts_patch -database /path/to/rosetta/main/database/ -fa_max_dis 9.0 -ddg::iterations 30 -ddg::dump_pdbs true -ignore_unrecognized_res -ddg::local_opt_only false -ddg::min_cst true -constraints::cst_file /path/to/Constraint.cst -ddg::suppress_checkpointing true -in::file::fullatom -ddg::mean false -ddg::min true -ddg::sc_min_only false -ddg::ramp_repulsive true -unmute core.optimization.LineMinimizer -ddg::output_silent false -out:path:all /path/to/F27I_rep1/ 2>&1 </dev/null &

  4. Compile total scores for all mutations.
  • Input file
    • Scores from ddg_predictions.out
  • Output file
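
The compilation step could look roughly like the sketch below, which collects the total score per mutation from the text of several ddg_predictions.out files. It is not the repository's script. It assumes each data line has the layout `ddG: <mutation> <total> <score terms...>` with a header line whose second field is `description`; check this against your own output files. The function names and the numbers in the example are hypothetical, and averaging the replicates is only one possible way to summarize them.

```python
from collections import defaultdict
from statistics import mean


def read_total_scores(text):
    """Return {mutation: total score} from one ddg_predictions.out text.

    Assumed layout per data line: "ddG: <mutation> <total> <terms...>".
    """
    totals = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3 or fields[0] != "ddG:":
            continue  # blank or unrelated line
        if fields[1] == "description":
            continue  # header line
        totals[fields[1]] = float(fields[2])
    return totals


def compile_totals(replicate_texts):
    """Return {mutation: [total from each replicate that reports it]}."""
    compiled = defaultdict(list)
    for text in replicate_texts:
        for mutation, total in read_total_scores(text).items():
            compiled[mutation].append(total)
    return dict(compiled)


# Made-up file contents standing in for three replicate outputs.
header = "ddG: description total fa_atr fa_rep\n"
replicates = [
    header + "ddG: F27I 1.50 -0.20 0.10\n",
    header + "ddG: F27I 2.50 -0.30 0.20\n",
    header + "ddG: F27I 2.00 -0.25 0.15\n",
]

compiled = compile_totals(replicates)
for mutation, totals in compiled.items():
    # Averaging across replicates is an assumption of this sketch.
    print(mutation, totals, f"mean={mean(totals):.2f}")
```

In practice each string would come from reading the ddg_predictions.out file in a replicate's output directory, for example the /path/to/F27I_rep1/ directory used above.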
