No bases were counted for orf #57

Open
simone-pignotti opened this issue Mar 28, 2022 · 1 comment

Comments

@simone-pignotti

Hello, I occasionally run into this issue when running PhiSpy:

2022-03-28 12:14:53 INFO     Welcome to PhiSpy.py version 4.2.21
2022-03-28 12:14:53 INFO     Starting PhiSpy.py with the following arguments
Namespace(infile='input/genomic.gbff', output_dir='/home/ec2-user/physpy', make_training_data=None, training_set='data/trainSet_genericAll.txt', list=False, file_prefix='test', evaluate=False, number=5, min_contig_size=5000, window_size=30, nonprophage_genegaps=10, phage_genes=1, metrics=['orf_length_med', 'shannon_slope', 'at_skew', 'gc_skew', 'max_direction'], randomforest_trees=500, expand_slope=False, kmers_type='all', phmms='/home/ec2-user/VOGs.hmms', include_annotations=True, ignore_annotations=False, color=True, threads=4, output_choice=512, include_all_repeats=False, keep_dropped_predictions=False, extra_dna=2000, min_repeat_len=10, log='/home/ec2-user/physpy/test_phispy.log', quiet=False, keep=False, logger=<Logger PhiSpy (Level 5)>)
2022-03-28 12:14:54 INFO     Processing 14 contigs
2022-03-28 12:14:54 INFO     Running HMM profiles against /home/ec2-user/VOGs.hmms
2022-03-28 12:14:54 INFO     hmmsearch: writing the amino acids to temporary file /home/ec2-user/physpy/tmpio1svuew
2022-03-28 12:14:54 INFO     Searching 2613 proteins with hmmsearch.
2022-03-28 12:18:15 INFO     Completed running HMM profiles against /home/ec2-user/VOGs.hmms
2022-03-28 12:18:15 INFO     Making Testing Set...
2022-03-28 12:18:17 INFO     a total of zero total_at*total_gc
No bases were counted for orf {'start': 507191, 'stop': 508927, 'phmm': 0.18568636235841013, 'peg': 'peg', 'is_phage': 0} from 507191 to 508927
This error is usually thrown with an exceptionally short ORF that is only a  few bases. You should check this ORF and confirm it is real!

I can't find anything unusual in the ORF that throws the exception. The error makes the entire run fail, which is not what I would expect, given that other problematic ORFs are simply skipped with a warning (e.g. when multiple ORFs share the same ID, all but the first are discarded).
Would it be possible to get more details about what may be triggering the error, and possibly to convert it to a warning in future versions of PhiSpy? Unfortunately I cannot share the input file, and I haven't managed to reproduce the error on a small example, but running PhiSpy on many random genomes downloaded from NCBI should let you reproduce it. Let me know if I can help in any other way.
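
For anyone hitting the same message, here is a minimal sketch of how the ORF could be inspected by hand. It is not PhiSpy's code: the function names `orf_base_counts` and `check_orf` are hypothetical, and I am assuming that the reported `start`/`stop` are 1-based inclusive coordinates on the contig. The only thing taken from the log is that the product `total_at*total_gc` was zero, which can only happen if the region contains no A/T bases or no G/C bases (for example a run of `N`s). The second function shows the kind of warn-and-skip behaviour requested above.

```python
import warnings
from collections import Counter


def orf_base_counts(contig_seq, start, stop):
    """Count bases in an ORF region of a contig sequence.

    Assumption: start/stop are 1-based inclusive coordinates, in either order.
    """
    lo, hi = sorted((start, stop))
    region = contig_seq[lo - 1:hi].upper()
    counts = Counter(region)
    total_at = counts["A"] + counts["T"]
    total_gc = counts["G"] + counts["C"]
    return {
        "length": len(region),
        "total_at": total_at,
        "total_gc": total_gc,
        # Anything that is not A, C, G or T (N, IUPAC ambiguity codes, gaps)
        "other": len(region) - total_at - total_gc,
    }


def check_orf(contig_seq, start, stop):
    """Return True if the ORF has both A/T and G/C bases; warn and return False otherwise."""
    c = orf_base_counts(contig_seq, start, stop)
    if c["total_at"] * c["total_gc"] == 0:
        warnings.warn(
            f"ORF {start}..{stop} skipped: total_at={c['total_at']}, "
            f"total_gc={c['total_gc']}, other={c['other']}, length={c['length']}"
        )
        return False
    return True
```

Calling `orf_base_counts(seq, 507191, 508927)` on the contig sequence that holds the ORF would show whether the region consists of ambiguous bases or has a one-sided composition, which would explain the zero product even though the ORF is not short.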
Thanks for maintaining this great tool!

@simone-pignotti
Author

PS this has already been described in #54, but since the main topic of that issue was different I figured this would deserve its own. Feel free to merge them if not!
