Error about dbsnp #22

Open
solivehong opened this issue Jan 24, 2019 · 6 comments

Comments

@solivehong

I have data from 13 lung cancer samples, some of which have no matched normal, and I hope to use this software to find somatic mutations in them.
When I ran the program, the following error occurred. I could not tell from the README whether I need to re-download dbSNP 142 or use your dbSNP download link.

perl ${ISOWN_HOME}/bin/database_annotation.pl PM2018122708.sofstric.watson.vcf PM2018122708.sofstric.watson.vcf.test.vcf

annotating input file with ANNOVAR ...NOTICE: Output files were written to PM2018122708.sofstric.watson.vcf.test.vcf.temp.annovar.vcf.temp.convert2annovar.variant_function, PM2018122708.sofstric.watson.vcf.test.vcf.temp.annovar.vcf.temp.convert2annovar.exonic_variant_function
NOTICE: Reading gene annotation from /gpfs/home//software/ISOWN/bin/../external_tools/annovar_2012-03-08/humandb/hg19_refGene.txt ... Done with 52068 transcripts (including 11837 without coding sequence annotation) for 26464 unique genes
NOTICE: Processing next batch with 2995 unique variants in 2995 input lines
NOTICE: Reading FASTA sequences from /gpfs/home/zhaohongqiang/software/ISOWN/bin/../external_tools/annovar_2012-03-08/humandb/hg19_refGeneMrna.fa ... Done with 1579 sequences
WARNING: A total of 356 sequences will be ignored due to lack of correct ORF annotation

The dbSNP 142 file is not found. Please correct the path in /gpfs/home/software/ISOWN//bin/database_annotation.pl and try again - see path below:

    /gpfs/home/software/ISOWN//bin/../external_databases/dbSNP142_All_20141124.vcf.gz.modified.vcf.gz
@solivehong
Author

I used ANNOVAR directly to annotate the VCF and then ran ISOWN on the result:
perl ${ISOWN_HOME}/bin/run_isown.pl test/RD2018080114.sofstric.watson.vcf.hg19_multianno.vcf test.output.txt " -trainingSet ${ISOWN_HOME}/training_data/BRCA_100_TrainSet.arff -sanityCheck false -classifier nbc"

Reformat files in '/gpfs/home/zhaohongqiang/software/ISOWN/test/RD2018080114.sofstric.watson.vcf.hg19_multianno.vcf' to emaf ...

Exception in thread "main" java.lang.NullPointerException
at com.Processing.processVcf(Processing.java:39)
at com.runReformating.main(runReformating.java:39)

Running prediction using file 'test.output.txt.emaf' ...

...
Your working directory is /gpfs/home/zhaohongqiang/software/ISOWN
...
This file was chosen for classifier training: /gpfs/home/zhaohongqiang/software/ISOWN//training_data/BRCA_100_TrainSet.arff
...
Exception in thread "main" java.lang.NullPointerException
at helper.Headers.<init>(Headers.java:41)
at main.Prediction.getVariant2samples(Prediction.java:347)
at main.Prediction.loadVariants(Prediction.java:28)
at main.runISOWN.main(runISOWN.java:85)

Done

@solivehong
Author

This is my ANNOVAR result:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT NORMAL TUMOR
chr1 12783 . G A . PASS WfG_variant_origin=somatic;ANNOVAR_DATE=2016-02-01;Func.refGene=ncRNA_intronic;Gene.refGene=DDX11L1;GeneDetail.refGene=.;ExonicFunc.refGene=.;AAChange.refGene=.;cosmic70=.;avsnp147=rs62635284;esp6500siv2_all=.;ExAC_ALL=.;ExAC_AFR=.;ExAC_AMR=.;ExAC_EAS=.;ExAC_FIN=.;ExAC_NFE=.;ExAC_OTH=.;ExAC_SAS=.;1000g2015aug_all=.;1000g2015aug_eas=.;CLINSIG=.;CLNDBN=.;CLNACC=.;CLNDSDB=.;CLNDSDBID=.;SIFT_score=.;SIFT_pred=.;Polyphen2_HDIV_score=.;Polyphen2_HDIV_pred=.;Polyphen2_HVAR_score=.;Polyphen2_HVAR_pred=.;LRT_score=.;LRT_pred=.;MutationTaster_score=.;MutationTaster_pred=.;MutationAssessor_score=.;MutationAssessor_pred=.;FATHMM_score=.;FATHMM_pred=.;PROVEAN_score=.;PROVEAN_pred=.;VEST3_score=.;CADD_raw=.;CADD_phred=.;DANN_score=.;fathmm-MKL_coding_score=.;fathmm-MKL_coding_pred=.;MetaSVM_score=.;MetaSVM_pred=.;MetaLR_score=.;MetaLR_pred=.;integrated_fitCons_score=.;integrated_confidence_value=.;GERP++_RS=.;phyloP7way_vertebrate=.;phyloP20way_mammalian=.;phastCons7way_vertebrate=.;phastCons20way_mammalian=.;SiPhy_29way_logOdds=.;ALLELE_END GT:DP:AF:VD:ALD 0/0:59:0.2712:16:7,9 0/1:59:0.2712:16:7,9

@ikalatskaya
Owner

Hello,

> did not understand the readme instructions need to re-download dbsno142 or use your dbsnp download link,

You have to download the dbSNP file and reformat it.

In the INSTALLATION INSTRUCTIONS:
Download dbSNP from NCBI:
wget ftp://ftp.ncbi.nlm.nih.gov/snp/organisms/human_9606/VCF/00-All.vcf.gz --no-passive-ftp
Reformat and index dbSNP using the following script:
perl ${ISOWN_HOME}/bin/ncbi_dbSNP_format_index.pl 00-All.vcf.gz 00-All.modified.vcf

The VCF reformatting is most likely failing because the annotation step did not complete.

Let me know if you have other issues.
Irina
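
The error message in the first comment prints the exact path that database_annotation.pl looks for. As a minimal sketch, assuming that path is unchanged in your copy of the script, the check below reports whether a non-empty file exists there. The function names `dbsnp_path` and `check_dbsnp` are hypothetical helpers, not part of ISOWN, and the thread does not say whether the reformatting script writes its output to this location by itself.

```shell
#!/usr/bin/env bash
# Sketch only: dbsnp_path and check_dbsnp are hypothetical helper names.
# The file name below is copied from the error message in this thread.

# Print the path where the annotation script expects the reformatted dbSNP file.
dbsnp_path() {
    # ${1%/} drops one trailing slash so the path has no doubled separator.
    printf '%s/external_databases/dbSNP142_All_20141124.vcf.gz.modified.vcf.gz\n' "${1%/}"
}

# Report whether a non-empty file exists at that path.
check_dbsnp() {
    local path
    path="$(dbsnp_path "$1")"
    if [ -s "$path" ]; then
        echo "found: $path"
        return 0
    fi
    echo "not found: $path"
    return 1
}
```

Running `check_dbsnp "${ISOWN_HOME}"` before database_annotation.pl shows whether the reformatted file is where the script will look.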

@gprashant17

I tried reformatting the dbSNP file, but the following error occurred. What should I do?

perl bin/ncbi_dbSNP_format_index.pl 00-All.vcf.gz dbSNP142_All_20141124.vcf.gz.modified.vcf.gz

Reformat 00-All.vcf.gz ...
gzip: 00-All.vcf.gz: invalid compressed data--crc error

gzip: 00-All.vcf.gz: invalid compressed data--length error

Compressing dbSNP142_All_20141124.vcf.gz.modified.vcf.gz ... [ti_index_core] the file out of order at line 10276162

Done

@ikalatskaya
Owner

ikalatskaya commented Jun 28, 2019 via email

@peishimei

> I tried reformatting dbsnp file but the following error occured. What should i do ?
>
> perl bin/ncbi_dbSNP_format_index.pl 00-All.vcf.gz dbSNP142_All_20141124.vcf.gz.modified.vcf.gz
>
> Reformat 00-All.vcf.gz ...
> gzip: 00-All.vcf.gz: invalid compressed data--crc error
>
> gzip: 00-All.vcf.gz: invalid compressed data--length error
>
> Compressing dbSNP142_All_20141124.vcf.gz.modified.vcf.gz ... [ti_index_core] the file out of order at line 10276162
>
> Done

Hi,
Try gunzipping your dbSNP file before reformatting.
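
The crc and length errors quoted above are reported by gzip itself, which suggests the downloaded archive may be incomplete or damaged. As a sketch, assuming `gzip` is on the PATH, the archive can be tested with `gzip -t` before it is passed to the reformatting script. The function name `check_gz` is a hypothetical helper, not part of ISOWN.

```shell
#!/usr/bin/env bash
# Sketch only: check_gz is a hypothetical helper name.

# Test a gzip archive without writing the decompressed data anywhere.
# Returns 0 if the archive is intact, 1 if gzip rejects it, 2 if it is missing.
check_gz() {
    local file="$1"
    if [ ! -f "$file" ]; then
        echo "missing: $file"
        return 2
    fi
    # gzip -t reads the whole archive and verifies its integrity.
    if gzip -t "$file" 2>/dev/null; then
        echo "ok: $file"
        return 0
    fi
    echo "corrupt: $file (re-download it before reformatting)"
    return 1
}
```

For example, `check_gz 00-All.vcf.gz && perl ${ISOWN_HOME}/bin/ncbi_dbSNP_format_index.pl 00-All.vcf.gz 00-All.modified.vcf` runs the reformatting step from the installation instructions only when the archive passes the test.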
