Error in rule run_checkm2: No DIAMOND annotation was generated #747

Open
bioinfogini opened this issue Dec 28, 2024 · 1 comment

bioinfogini commented Dec 28, 2024

Hello there,
I was running atlas binning, and with a good 94% of the job done it got stuck while running CheckM2.
The printout is the following:

[] INFO: Running CheckM2 version 1.0.2
[] INFO: Running quality prediction workflow with 70 threads.
[] INFO: Calling genes in 1 bins with 70 threads:
Finished processing 1 of 1 (100.00%) bins.
[] INFO: Calculating metadata for 1 bins with 70 threads:
Finished processing 1 of 1 (100.00%) bin metadata.
[] INFO: Annotating input genomes with DIAMOND using 70 threads
[] INFO: Processing DIAMOND output
[] ERROR: No DIAMOND annotation was generated.

As far as I can see, it ran this step twice, both times on the same sample, and got stuck.
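To check which samples hit this error, one option is to search each sample's CheckM2 log for the error message. The following is only a minimal sketch: it assumes every sample has its own folder directly under the working directory and that the log sits at logs/binning/vamb/checkm2.log inside it, as in the file listing further down in this thread. The function name and the glob pattern are my own, not part of Atlas or CheckM2.

```python
from pathlib import Path

# Message printed by CheckM2 in the log quoted above.
ERROR_TEXT = "No DIAMOND annotation was generated"


def find_failed_checkm2_logs(workdir, pattern="*/logs/binning/*/checkm2.log"):
    """Return the checkm2.log files under workdir that contain the error.

    The default pattern is an assumption about the folder layout:
    <sample>/logs/binning/<binner>/checkm2.log
    """
    failed = []
    for log in sorted(Path(workdir).glob(pattern)):
        # errors="replace" avoids crashing on odd bytes in a log file.
        text = log.read_text(errors="replace")
        if ERROR_TEXT in text:
            failed.append(log)
    return failed
```

Calling `find_failed_checkm2_logs(".")` from the working directory would list the affected logs, and the first folder in each path would be the sample name.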

I hope someone can help; otherwise I will update here with any solution I find.

UPDATE1: I checked my samples and realised that the one generating the error was the only one with a single fasta file in the /binning/vamb/bins folder. It would make sense that this lone bin is also low quality and DIAMOND cannot classify it, so I removed the sample (after making a copy somewhere else) and then tried to rerun the analysis. Will update.
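For anyone who wants to run the same check, here is a rough sketch that counts the fasta files in each sample's bins folder. It assumes the layout `<sample>/binning/vamb/bins/*.fasta` under the working directory; the helper names and the `max_bins` threshold are made up for illustration. Whether a single-bin sample is really the cause of the error is only my guess.

```python
from pathlib import Path


def count_bins(workdir, binner="vamb"):
    """Map each sample name to the number of .fasta files in its bins folder.

    Assumes the layout <workdir>/<sample>/binning/<binner>/bins/.
    """
    counts = {}
    for bins_dir in sorted(Path(workdir).glob(f"*/binning/{binner}/bins")):
        # parents[0] is <binner>, parents[1] is binning, parents[2] is <sample>.
        sample = bins_dir.parents[2].name
        counts[sample] = len(list(bins_dir.glob("*.fasta")))
    return counts


def samples_with_few_bins(counts, max_bins=1):
    """Return the samples whose bin count is at most max_bins."""
    return [sample for sample, n in counts.items() if n <= max_bins]
```

Something like `samples_with_few_bins(count_bins("."))` would then point at the samples worth a closer look.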

Update2: The analysis got stuck again because the marvellous Atlas was able to regenerate the "removed" sample out of I-don't-know-where. Pretty smart, but it creates an issue for me in this case. I will try removing the sample from the samples.tsv file and see if I can trick Atlas this way.
NOTE: Atlas gets stuck even when I add the --keep-going flag. It is about 96% done, 299 of 311 steps.
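Removing the sample from samples.tsv can be done by hand, but a small script avoids touching the other rows. This is only a sketch under assumptions: it treats the file as tab-separated with a header line and the sample name in the first column, and it writes a new file instead of overwriting the original. The function name and the file names in the usage note are hypothetical.

```python
import csv


def drop_sample(src, dst, sample):
    """Copy the TSV at src to dst, leaving out rows for the given sample.

    Assumes the first line is a header and the first column holds the
    sample name. Returns the number of rows that were left out.
    """
    removed = 0
    with open(src, newline="") as fin, open(dst, "w", newline="") as fout:
        reader = csv.reader(fin, delimiter="\t")
        writer = csv.writer(fout, delimiter="\t", lineterminator="\n")
        for i, row in enumerate(reader):
            # Row 0 is the header and is always kept.
            if i > 0 and row and row[0] == sample:
                removed += 1
                continue
            writer.writerow(row)
    return removed
```

For example, `drop_sample("samples.tsv", "samples_filtered.tsv", "S1")` with a made-up sample name S1; after checking the new file it can replace the original.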

@SilasK if you have a moment to jump in at any time, that would be great! It's starting to be a pretty fun adventure.

@bioinfogini bioinfogini changed the title No DIAMOND annotation was generated Error in rule run_checkm2: No DIAMOND annotation was generated Dec 28, 2024

bioinfogini commented Jan 9, 2025

It worked.

So, to be precise, I removed my sample from the samples.tsv file and also moved the following files somewhere else (just in case I was wrong):

Sample/
├── Assembly
│   └── fasta
│       └── .fasta
├── Intermediate
│   ├── cobinning
│   │   ├── All
│   │   │   ├── bams
│   │   │   │   └── .sorted.bam
│   │   │   └── vamb_output
│   │   │       └── bins
│   │   └── filtered_contigs
│   │       └── .fasta.gz
│   └── qc
│       └── decontamination
│           └──
│               ├── PhiX_R1.fastq.gz
│               ├── PhiX_R2.fastq.gz
│               ├── Sscrofa_R1.fastq.gz
│               └── Sscrofa_R2.fastq.gz
├── QC
│   └── reads
│       ├── _R1.fastq.gz
│       └── _R2.fastq.gz
└──
    ├── annotation
    │   └── predicted_genes
    │       ├── .faa
    │       ├── .fna
    │       ├── .gff
    │       └── .tsv
    ├── assembly
    │   ├── assembly_graph_after_simplification.gfa
    │   ├── assembly_graph.fastg
    │   ├── assembly_graph_with_scaffolds.gfa
    │   ├── before_rr.fasta
    │   ├── contigs.fasta
    │   ├── contigs.paths
    │   ├── contig_stats
    │   │   ├── final_contig_stats.txt
    │   │   ├── postfilter_coverage_binned.txt
    │   │   ├── postfilter_coverage_histogram.txt
    │   │   └── postfilter_coverage_stats.txt
    │   ├── dataset.info
    │   ├── first_pe_contigs.fasta
    │   ├── input_dataset.yaml
    │   ├── K21
    │   │   ├── configs
    │   │   │   ├── careful_mda_mode.info
    │   │   │   ├── careful_mode.info
    │   │   │   ├── config.info
    │   │   │   ├── construction.info
    │   │   │   ├── detail_info_printer.info
    │   │   │   ├── distance_estimation.info
    │   │   │   ├── hmm_mode.info
    │   │   │   ├── isolate_mode.info
    │   │   │   ├── large_genome_mode.info
    │   │   │   ├── mda_mode.info
    │   │   │   ├── meta_mode.info
    │   │   │   ├── metaplasmid_mode.info
    │   │   │   ├── metaviral_mode.info
    │   │   │   ├── pe_params.info
    │   │   │   ├── plasmid_mode.info
    │   │   │   ├── rna_mode.info
    │   │   │   ├── rnaviral_mode.info
    │   │   │   ├── sewage_mode.info
    │   │   │   ├── simplification.info
    │   │   │   └── toy.info
    │   │   ├── final.lib_data
    │   │   └── simplified_contigs
    │   │       ├── contigs_info
    │   │       ├── contigs.off
    │   │       └── contigs.seq
    │   ├── K33
    │   │   ├── configs
    │   │   │   ├── careful_mda_mode.info
    │   │   │   ├── careful_mode.info
    │   │   │   ├── config.info
    │   │   │   ├── construction.info
    │   │   │   ├── detail_info_printer.info
    │   │   │   ├── distance_estimation.info
    │   │   │   ├── hmm_mode.info
    │   │   │   ├── isolate_mode.info
    │   │   │   ├── large_genome_mode.info
    │   │   │   ├── mda_mode.info
    │   │   │   ├── meta_mode.info
    │   │   │   ├── metaplasmid_mode.info
    │   │   │   ├── metaviral_mode.info
    │   │   │   ├── pe_params.info
    │   │   │   ├── plasmid_mode.info
    │   │   │   ├── rna_mode.info
    │   │   │   ├── rnaviral_mode.info
    │   │   │   ├── sewage_mode.info
    │   │   │   ├── simplification.info
    │   │   │   └── toy.info
    │   │   ├── final.lib_data
    │   │   └── simplified_contigs
    │   │       ├── contigs_info
    │   │       ├── contigs.off
    │   │       └── contigs.seq
    │   ├── K55
    │   │   ├── assembly_graph_after_simplification.gfa
    │   │   ├── assembly_graph.fastg
    │   │   ├── assembly_graph_with_scaffolds.gfa
    │   │   ├── before_rr.fasta
    │   │   ├── configs
    │   │   │   ├── careful_mda_mode.info
    │   │   │   ├── careful_mode.info
    │   │   │   ├── config.info
    │   │   │   ├── construction.info
    │   │   │   ├── detail_info_printer.info
    │   │   │   ├── distance_estimation.info
    │   │   │   ├── hmm_mode.info
    │   │   │   ├── isolate_mode.info
    │   │   │   ├── large_genome_mode.info
    │   │   │   ├── mda_mode.info
    │   │   │   ├── meta_mode.info
    │   │   │   ├── metaplasmid_mode.info
    │   │   │   ├── metaviral_mode.info
    │   │   │   ├── pe_params.info
    │   │   │   ├── plasmid_mode.info
    │   │   │   ├── rna_mode.info
    │   │   │   ├── rnaviral_mode.info
    │   │   │   ├── sewage_mode.info
    │   │   │   ├── simplification.info
    │   │   │   └── toy.info
    │   │   ├── final_contigs.fasta
    │   │   ├── final_contigs.paths
    │   │   ├── final.lib_data
    │   │   ├── first_pe_contigs.fasta
    │   │   ├── path_extend
    │   │   ├── scaffolds.fasta
    │   │   ├── scaffolds.paths
    │   │   └── strain_graph.gfa
    │   ├── misc
    │   │   └── broken_scaffolds.fasta
    │   ├── old2new_contig_names.tsv
    │   ├── params.txt
    │   ├── pipeline_state
    │   │   ├── stage_0_before_start
    │   │   ├── stage_1_as_start
    │   │   ├── stage_2_k21
    │   │   ├── stage_3_k33
    │   │   ├── stage_4_k55
    │   │   ├── stage_5_copy_files
    │   │   ├── stage_6_as_finish
    │   │   ├── stage_7_bs
    │   │   └── stage_8_terminate
    │   ├── run_spades.sh
    │   ├── run_spades.yaml
    │   ├── _final_contigs.fasta
    │   ├── _prefilter_contigs.fasta
    │   ├── scaffolds.fasta
    │   ├── scaffolds.paths
    │   ├── spades.log
    │   ├── strain_graph.gfa
    │   └── tmp
    ├── binning
    │   └── vamb
    │       ├── bins
    │       │   └── _vamb_1.fasta
    │       ├── checkm2
    │       │   ├── checkm2.log
    │       │   ├── diamond_output
    │       │   └── protein_files
    │       │       └── vamb_1.faa
    │       ├── cluster_attribution.tsv
    │       └── genome_stats.tsv
    ├── finished_assembly
    ├── logs
    │   ├── assembly
    │   │   ├── calculate_coverage
    │   │   │   ├── align_reads_from_to_filtered_contigs.log
    │   │   │   └── pilup_final_contigs.log
    │   │   ├── post_process
    │   │   │   ├── contig_stats_final.log
    │   │   │   └── rename_and_filter_size.log
    │   │   ├── pre_process
    │   │   │   ├── error_correction_QC.log
    │   │   │   └── merge_pairs_QC.errorcorr.log
    │   │   └── spades.log
    │   ├── binning
    │   │   ├── get_bins_vamb.log
    │   │   └── vamb
    │   │       └── checkm2.log
    │   ├── gene_annotation
    │   │   └── prodigal.txt
    │   ├── QC
    │   │   ├── decontamination.err
    │   │   ├── decontamination.log
    │   │   ├── deduplicate.err
    │   │   ├── deduplicate.log
    │   │   ├── init.log
    │   │   ├── quality_filter.err
    │   │   ├── quality_filter.log
    │   │   ├── read_stats
    │   │   │   ├── clean.log
    │   │   │   ├── deduplicated.log
    │   │   │   ├── filtered.log
    │   │   │   ├── QC.log
    │   │   │   └── raw.log
    │   │   └── stats
    │   │       └── calculate_insert_size.log
    │   └── _quality_filtering_stats.txt
    ├── sequence_alignment
    │   └── .bam
    └── sequence_quality_control
        ├── finished_QC
        ├── read_stats
        │   ├── clean.zip
        │   ├── deduplicated.zip
        │   ├── filtered.zip
        │   ├── QC_insert_size_hist.txt
        │   ├── QC_read_length_hist.txt
        │   ├── QC.zip
        │   ├── raw.zip
        │   └── read_counts.tsv
        └── _decontamination_reference_stats.txt

51 directories, 166 files.
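In case it helps someone repeat this, moving files aside while keeping their relative paths (so they can be put back later) could look roughly like the sketch below. It is not what Atlas does internally, just a plain file move. The function name is made up, and the paths in the usage note use a placeholder sample name S1 that is not from my data.

```python
import shutil
from pathlib import Path


def move_aside(workdir, backup_dir, relative_paths):
    """Move files or folders from workdir into backup_dir.

    Each entry of relative_paths is relative to workdir and keeps the same
    relative location under backup_dir. Entries that do not exist are
    skipped. Returns the list of entries that were actually moved.
    """
    workdir, backup_dir = Path(workdir), Path(backup_dir)
    moved = []
    for rel in relative_paths:
        src = workdir / rel
        if not src.exists():
            continue
        dst = backup_dir / rel
        # Create the parent folders in the backup location if needed.
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.move(str(src), str(dst))
        moved.append(rel)
    return moved
```

A call could be `move_aside(".", "../atlas_backup", ["S1", "Assembly/fasta/S1.fasta", "QC/reads/S1_R1.fastq.gz", "QC/reads/S1_R2.fastq.gz"])`, where both S1 and the backup folder name are hypothetical.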

I am not closing the issue, just in case @SilasK would like to say whether this is completely wrong or not 😆
