Using SVision to detect variations in tandem repeats #29

Open
CSU-KangHu opened this issue Jan 28, 2024 · 1 comment

@CSU-KangHu

Hello @jiadong324,

Thank you for developing such an excellent tool. I have used SVision to detect tandem repeat variations in the HG002 sample with the GRCh38 reference. I employed the default command provided in the demo section:

SVision -o ${pathTo}/run_svision -b ${pathTo}/HG002.m84011_220902_175841_s1.GRCh38.chr1.bam -m ./svision_model/svision-cnn-model.ckpt -g ${pathTo}/GCA_000001405.15_GRCh38_no_alt_analysis_set.fasta -n HG002 -s 5 --graph --qname

Then, I used Truvari to evaluate the results with the following commands:

truvari bench -b ${pathTo}/HG002_GRCh38_TandemRepeats_v1.0.chr1.vcf.gz -c ${pathTo}/HG002.svision.s5.graph.sorted.vcf.gz --includebed ${pathTo}/HG002_GRCh38_TandemRepeats_v1.0.chr1.bed.gz --sizemin 5 --pick ac -o bench_result/
truvari refine --use-original-vcfs --reference ${pathTo}/hg38_chr1.fa bench_result/

Here, HG002_GRCh38_TandemRepeats_v1.0.chr1.vcf.gz and HG002_GRCh38_TandemRepeats_v1.0.chr1.bed.gz are the gold-standard tandem repeat variant calls and regions, downloaded from https://ftp-trace.ncbi.nlm.nih.gov/ReferenceSamples/giab/release/AshkenazimTrio/HG002_NA24385_son/TandemRepeats_v1.0/GRCh38/.

The results are displayed in the figure below:
image

The recall is low. I have also checked some tandem repeat variation regions that were not detected by SVision. The IGV figure is shown below:
image

Many such regions were not detected by SVision. I wonder if SVision is not designed to detect tandem repeat variation regions?

I also performed the same test using TRGT, and the experimental results can be seen at ACEnglish/adotto#5.

@songbowang125
Collaborator

Hi, I looked through your ground-truth set (HG002_GRCh38_TandemRepeats_v1.0) and found that most of the variants are small variants, such as indels shorter than 50 bp. The variant in your example IGV screenshot is also a small variant. SVision detects (complex) structural variants rather than small variants, so it is not expected to recover most of the variants in this set.
