Empty .dict files #7

OliverPStuart · 2021-02-07T23:49:00Z

I've been trying to run vargeno on non-human data and am running into problems at the indexing stage. No error is reported during the process, but both .dict files are empty, so the genotyping step fails.

I'm working with a fragmentary reference assembly of a grasshopper genome, so both the bioinformatic and biological properties of the data are not at all what vargeno was designed for.

Do you have any tips for troubleshooting? Attached (here) is a sample of the .vcf input. Since my data is not human and I'm obviously not working with dbSNP, it's a little unclear how to format this file properly. Variants were detected with freebayes in the first instance.
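
One generic sanity check on the .vcf formatting is to confirm that every contig named in the VCF also appears in the reference FASTA. The sketch below is only that, a minimal check under the assumption that a naming mismatch is worth ruling out; it is not known to be the cause of the empty .dict files, and it says nothing about what vargeno itself requires. The function names are hypothetical.

```python
def fasta_contigs(fasta_lines):
    """Collect sequence names: the first word after '>' on each header line."""
    return {
        line[1:].split()[0]
        for line in fasta_lines
        if line.startswith(">") and line[1:].strip()
    }


def vcf_contigs(vcf_lines):
    """Collect CHROM values (first tab-separated column) from VCF data lines."""
    names = set()
    for line in vcf_lines:
        # Skip meta-information lines, the header line, and blank lines.
        if line.startswith("#") or not line.strip():
            continue
        names.add(line.split("\t", 1)[0])
    return names


def unmatched_contigs(fasta_lines, vcf_lines):
    """Return the VCF contig names that are absent from the FASTA, sorted."""
    return sorted(vcf_contigs(vcf_lines) - fasta_contigs(fasta_lines))
```

Both arguments can be open file handles, for example packardii.sub.fa and snp.vcf opened for reading; an empty list means every contig in the VCF was found in the reference.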

Here is the terminal output:

$ vargeno index packardii.sub.fa snp.vcf test
[BloomFilter constructBfFromGenomeseq] bit vector: 755356701/9600000000
[BloomFilter constructBfFromGenomeseq] lite bit vector: 988176227/18400000000
[BloomFilter constructBfFromVCF] bit vector: 0/1120000000
SNP Dictionary
Total k-mers:        21626752
Unambig k-mers:      20575340
Ambig unique k-mers: 296062
Ambig total k-mers:  1051412
Ref Dictionary
Total k-mers:        1305711431
Unambig k-mers:      1130124620
Ambig unique k-mers: 36489256
Ambig total k-mers:  175586811

And here are the output files:

-rw-r--r--  1 oliver users   12348187 Feb  5 11:42 test.chrlens
-rw-r--r--  1 oliver users 1200000008 Feb  5 10:43 test.ref.bf
-rw-r--r--  1 oliver users 2300000008 Feb  5 10:43 test.ref.bf.lite.bf
-rw-r--r--  1 oliver users          0 Feb  5 14:47 test.ref.dict
-rw-r--r--  1 oliver users  140000008 Feb  5 11:41 test.snp.bf
-rw-r--r--  1 oliver users          0 Feb  5 11:42 test.snp.dict
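
Because the indexing step reports no error, the empty files are easy to miss until genotyping fails. A minimal sketch of a check that could be run straight after indexing, assuming only that all outputs share the prefix given on the command line (here "test"); the helper name is hypothetical.

```python
import glob
import os


def empty_index_files(prefix):
    """Return the sorted paths matching '<prefix>.*' that are zero bytes long."""
    # glob.escape keeps any special characters in the prefix literal.
    pattern = glob.escape(prefix) + ".*"
    return sorted(
        path
        for path in glob.glob(pattern)
        if os.path.isfile(path) and os.path.getsize(path) == 0
    )
```

Called as empty_index_files("test") in the output directory, it would list test.ref.dict and test.snp.dict for the files shown above.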

All of the test files (in /vargeno/test) run fine and reproduce the provided output files. I'm running on Ubuntu 18.04.5 in a conda environment with the following packages:

# packages in environment at /home/oliver/miniconda2/envs/vargeno:
#
# Name                    Version                   Build  Channel
_libgcc_mutex             0.1                 conda_forge    conda-forge
_openmp_mutex             4.5                       1_gnu    conda-forge
bioawk                    1.0                  hed695b0_5    bioconda
libgcc-ng                 9.3.0               h2828fa1_18    conda-forge
libgomp                   9.3.0               h2828fa1_18    conda-forge
libstdcxx-ng              9.3.0               h6de172a_18    conda-forge
seqtk                     1.3                  hed695b0_2    bioconda
vargeno                   1.0.3                hc9558a2_1    bioconda
zlib                      1.2.11            h516909a_1010    conda-forge

ldirocco mentioned this issue Apr 29, 2021

Indexing problem #8

Open
