Building Hash Table... never ends #235

Open

Innocent-L opened this issue Jan 7, 2025 · 1 comment

@Innocent-L
Hi,
I tried to use NOVOPlasty to assemble a primate mitochondrial genome.
I followed all the instructions, but it doesn't seem to work properly.
Can you help?
Run command:
perl /home/zl/software/NOVOplasty-master/NOVOPlasty4.3.1.pl -c config.txt
config.txt:
Project:

Project name = FJ559
Type = mito
Genome Range = 15000-17000
K-mer = 30
Max memory =
Extended log = 0
Save assembled reads = yes
Seed Input = /home/zl/spider/zlbz/snp_mul10/download_cleandata/fj559/sequence_cytb.fasta
Extend seed directly = no
Reference sequence = /home/zl/spider/zlbz/snp_mul10/download_cleandata/fj559/sequence.fasta
Variance detection =

Dataset 1:

Read Length = 150
Insert size = 300
Platform = illumina
Single/Paired = PE
Combined reads =
Forward reads = /home/zl/spider/zlbz/snp_mul10/download_cleandata/SRR5019559_clean_1.fastq.gz
Reverse reads = /home/zl/spider/zlbz/snp_mul10/download_cleandata/SRR5019559_clean_2.fastq.gz
Store Hash =

Heteroplasmy:

MAF =
HP exclude list =
PCR-free =

Optional:

Insert size auto = yes
Use Quality Scores = no
Reduce ambigious N's =
Output path = /home/zl/spider/zlbz/snp_mul10/download_cleandata/fj559/


The output log has been stuck at:
Reading Input......OK

Scan reference sequence......OK

Building Hash Table...
It's been running for 6 days now.
Thanks

@ndierckx
Owner

ndierckx commented Jan 7, 2025

You probably ran out of memory. Most systems will kill the script, but it seems some just leave it hanging.
6 days is very long; it should take anywhere from a few minutes to a few hours, depending on the size of the dataset.
If you have a machine with limited memory, it's best to use the Max memory option; make sure you give it a value a bit below the maximum available on your machine. It will subsample the dataset, but that's fine, WGS datasets usually have way too much coverage anyway... See the sketch below.
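As a minimal sketch, assuming a machine with 32 GB of RAM (an assumed figure; substitute a value a bit below your own total, in GB), the relevant line of config.txt would be:

Max memory = 30

With this set, NOVOPlasty subsamples the reads so the hash table fits in memory instead of hanging; since WGS datasets usually carry far more coverage than a mitochondrial assembly needs, the subsampled run should still succeed.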
