investigate lens and find out what tools can be integrated into version 2 #22
Comments
Investigate how it works with indels.
Currently we are comparing these two scripts to see what we are doing differently: https://github.com/taylor-lab/neoantigen-dev/blob/master/neoantigen.py https://github.com/spvensko/lenstools/blob/main/lenstools.py
Comparing how peptides are generated between lenstools and our neoantigen script, our script is a lot more elegant. Allison had asked me how we generate the peptides compared to lenstools, primarily whether they translate out the entire peptide or not. Neither script passes the entire peptide to netMHC; both add padding, so additional AAs are included upstream and downstream of the indel.

Lenstools uses this workflow: lenstools.nf (readme, called by: lens.nf -> neos.nf -> seq_variation.nf) to generate var_tx_seqs. Var_tx_seqs are individual FA files PER MUTATION, generated from bcftools. Command: Actual generation of peptides is here: make_indel_peptides_context

The Taylor lab script pulls the CDS and cDNA sequences from the provided reference files. Then, using the MAF's HGVSc column, it alters the CDS sequence at the stated position, either inserting the insertion sequence or removing the reference allele for deletions. Once the WT and MUT sequences are built, the script translates each entire sequence to amino acids itself. It then compares the WT and MUT amino acid sequences to find where the AAs do not match, and returns the altered amino acids plus 10 AAs before and after the altered region.

Result: lenstools and the Taylor lab script both return peptides with a padding of 10 AAs. The primary difference appears to be that lenstools flips the sequence when the strand is negative; however, I am unsure whether that is due to the reference FASTA that lenstools takes as input. It may not be a CDS sequence like the one we are inputting.
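To make the comparison concrete, here is a minimal sketch of the Taylor lab style flow described above: edit the CDS, translate WT and MUT in full, then return the altered region with flanking AAs. This is not the code from neoantigen.py. The function names (`apply_insertion`, `apply_deletion`, `mutant_window`) and the simplified 1-based position arguments are assumptions for illustration, and HGVSc parsing is left out.

```python
# Standard genetic code, built from the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def translate(cds):
    """Translate a CDS codon by codon, stopping at the first stop codon."""
    protein = []
    for i in range(0, len(cds) - 2, 3):
        aa = CODON_TABLE.get(cds[i:i + 3].upper(), "X")
        if aa == "*":
            break
        protein.append(aa)
    return "".join(protein)


def apply_insertion(cds, after_pos, seq):
    """Insert seq after the 1-based CDS position after_pos (hypothetical helper)."""
    return cds[:after_pos] + seq + cds[after_pos:]


def apply_deletion(cds, start, end):
    """Delete 1-based inclusive CDS positions start..end (hypothetical helper)."""
    return cds[:start - 1] + cds[end:]


def mutant_window(wt, mut, pad=10):
    """Return the altered part of mut plus up to pad AAs on each side."""
    if wt == mut:
        return ""
    shortest = min(len(wt), len(mut))
    # First position where WT and MUT disagree.
    start = 0
    while start < shortest and wt[start] == mut[start]:
        start += 1
    # Length of the shared C-terminal tail (zero for most frameshifts).
    suffix = 0
    while suffix < shortest - start and wt[-1 - suffix] == mut[-1 - suffix]:
        suffix += 1
    end = len(mut) - suffix
    return mut[max(0, start - pad):end + pad]


wt_cds = "ATGGCCAAAGGTCTGTGGGAACGTTAA"
wt_pep = translate(wt_cds)

# In-frame insertion of one codon after CDS position 12.
ins_pep = translate(apply_insertion(wt_cds, 12, "CAT"))
print(mutant_window(wt_pep, ins_pep, pad=2))

# Single-base deletion at CDS position 7, which shifts the frame.
fs_pep = translate(apply_deletion(wt_cds, 7, 7))
print(mutant_window(wt_pep, fs_pep, pad=2))
```

For a frameshift the WT and MUT tails no longer match, so the window runs from the padding before the first changed AA to the end of the mutant peptide.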
Discussed with Mark: the FA file is a genomic FASTA file. The strand flip is likely being done because germline variants are integrated before the indel is applied. This may be something to look into further when we add germline variants to our pipeline. Will continue investigating lenstools.
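For reference, a small sketch of why a genomic FASTA would need the strand flip while a CDS input would not. Genomic sequence is stored on the forward strand, so a transcript on the negative strand has to be reverse-complemented before translation. The helper names here are hypothetical and are not taken from lenstools.

```python
# Complement table covering upper- and lower-case bases plus N.
COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a nucleotide sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def coding_orientation(genomic_seq, strand):
    """Orient a forward-strand genomic sequence to the transcript's strand."""
    return reverse_complement(genomic_seq) if strand == "-" else genomic_seq


# A negative-strand gene reads as ATG...TAA only after flipping.
print(coding_orientation("TTACAT", "-"))
```

A CDS FASTA is already written in transcript orientation, which would explain why our script has no equivalent step.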
No description provided.