make-gnotate #44

Closed
jeffverboon opened this issue Oct 21, 2019 · 9 comments

jeffverboon commented Oct 21, 2019

Hi Brent,

I can never seem to get gnotate files that I make myself to work; I always get this error:

```
[slivar] 349 samples matched in VCF and PED to be evaluated
[slivar] error opening ../../burden/gnomad.v3.genomes.zip for annotation
[slivar] failed to open gnotate file. please check path
```

From this code:
```
$slivar expr \
	--gnotate ../../burden/gnomad.v3.genomes.zip \
	--out-vcf $bcf \
	-v $vcf_out
```

The path is not the issue:

Archive:  ../../burden/gnomad.v3.genomes.zip
Zip file size: 5804417638 bytes, number of entries: 196
-rw----     0.0 fat 431614664 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-variant.
-rw----     0.0 fat 175106697 bl defN 19-Oct-19 03:06 sli.var/1/long-alleles.txt
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 215807332 bl defN 19-Oct-19 03:06 sli.var/1/gnotate-gnomad_A
-rw----     0.0 fat 458608368 bl defN 19-Oct-19 03:52 sli.var/2/gnotate-variant.
-rw----     0.0 fat 176878802 bl defN 19-Oct-19 03:51 sli.var/2/long-alleles.txt
-rw----     0.0 fat 229304184 bl defN 19-Oct-19 03:52 sli.var/2/gnotate-gnomad_A
...

And this is the code I used to make the gnotate file:
```
$slivar make-gnotate \
	--field AC:gnomad_AC \
	--field AN:gnomad_AN \
	--field AC_eas:gnomad_AC_eas \
	--field AN_eas:gnomad_AN_eas \
	--field AC_sas:gnomad_AC_sas \
	--field AN_sas:gnomad_AN_sas \
	--prefix gnomad.v3.genomes \
	gnomad.genomes.r3.0.sites.vcf.gz
```

Also, make-gnotate produced a few .bin files, which I did not expect:

-bash:uger-c011:/broad/hptmp/jverboon/Thai_burden/code 1060 $  ls -lah ../../burden/gnomad.v3.genomes.*
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AC.bin
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AC_eas.bin
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AC_sas.bin
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AN.bin
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AN_eas.bin
-rw-rw-r-- 1 jverboon root 5.1M Oct 19 11:32 ../../burden/gnomad.v3.genomes.gnotate-gnomad_AN_sas.bin
-rw-rw-r-- 1 jverboon root 5.5G Oct 19 11:32 ../../burden/gnomad.v3.genomes.zip


Thanks!

brentp commented Oct 21, 2019

hi, thanks for reporting. there was a limit of 4.2 GB on the size of the zip file. I think this was fixed here, but I have not yet made a release. I'll make a zip for the gnomad genomes, verify all's well, and then make a new release.
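
A quick way to tell whether an archive is over that limit, as a minimal sketch. It assumes the 4.2 GB figure corresponds to the classic 32-bit ZIP boundary of 2^32 - 1 bytes, which the thread does not state explicitly; the function names are hypothetical and not part of slivar.

```shell
# Assumption: the "4.2 GB" limit is the 32-bit ZIP boundary (2^32 - 1 bytes).
ZIP32_MAX_BYTES=4294967295

# Succeeds (exit 0) when the given byte count is above the assumed limit.
exceeds_zip32_limit() {
    [ "$1" -gt "$ZIP32_MAX_BYTES" ]
}

# Same check for a file on disk; returns 2 if the file does not exist.
zip_file_exceeds_limit() {
    [ -f "$1" ] || return 2
    # Arithmetic expansion strips any padding that wc may print.
    size=$(( $(wc -c < "$1") ))
    exceeds_zip32_limit "$size"
}

# The size reported by zipinfo earlier in this thread:
if exceeds_zip32_limit 5804417638; then
    echo "archive is above the assumed 32-bit ZIP limit"
fi
```

For a file on disk, `zip_file_exceeds_limit ../../burden/gnomad.v3.genomes.zip` would perform the same comparison.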

jeffverboon commented

Thanks!

jeffverboon commented

I'm not sure that size is my issue, though, as I had similar problems in the past when making a gnotate file from the ClinVar VCF.


brentp commented Oct 21, 2019

I see the (additional) problem. Thanks for updating.


brentp commented Oct 21, 2019

if it's the problem I am seeing, I have a fix in place, but you can get around it by indexing your input file before sending to gnotate. If that's not it, please let me know how to recreate.

I did this:

wget ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh37/archive_2.0/2019/clinvar_20191007.vcf.gz
slivar make-gnotate -f ALLELEID:clinvar_a clinvar_20191007.vcf.gz  --prefix clinvar
slivar expr -g clinvar.zip -v clinvar_20191007.vcf.gz  -o clinvar_self.vcf.gz

brentp added a commit that referenced this issue Oct 21, 2019
slivar would fail on VCFs without index and without contigs in the header
jeffverboon commented

I can confirm that the code you posted above works for me as well, so it may just be the size issue.


brentp commented Oct 22, 2019

this is upstreamed here: nim-lang/zip#48

there was also a bug in slivar: when using gnotate with an input VCF that was neither indexed nor had contigs in the header, it would error. that will be fixed in the next release along with the 4.2 GB limit.

the above commands actually fail for me (until adding the index to the clinvar download).
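
The indexing workaround discussed in this thread could be scripted roughly as follows. This is a sketch, not slivar's own behavior: it assumes htslib's `tabix` is on the PATH, and the helper names `has_vcf_index` and `ensure_vcf_index` are hypothetical.

```shell
# Succeeds when a tabix (.tbi) or CSI (.csi) index sits next to the VCF.
has_vcf_index() {
    [ -f "$1.tbi" ] || [ -f "$1.csi" ]
}

# Build a tabix index only if none exists yet (assumes tabix is installed).
ensure_vcf_index() {
    has_vcf_index "$1" || tabix -p vcf "$1"
}

# Usage with the ClinVar file from the commands earlier in the thread:
#   ensure_vcf_index clinvar_20191007.vcf.gz
#   slivar make-gnotate -f ALLELEID:clinvar_a clinvar_20191007.vcf.gz --prefix clinvar
```

Checking for an existing index first avoids rebuilding one for large inputs such as the gnomAD sites file.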


brentp commented Nov 4, 2019

this is fixed in the new release. I have tested fairly extensively, but it would be good to get your verification that it works for your (previously broken) use case.

brentp closed this as completed Nov 4, 2019

jeffverboon commented Nov 4, 2019 via email
