Float point error in methratio.py #31

Open
tuhina3184 opened this issue Oct 21, 2020 · 4 comments

Comments

@tuhina3184

Hi,

I tried running methratio.py using the following command:
methratio.py -o CN_2_N_methratio.txt -d hg19.fa --pair -z -m 5 CN_2_N.bam

I got the following TypeError:
[methratio] @Wed Oct 21 16:13:32 2020 Using 90% of available memory (11870 MB) as limit
[methratio] @Wed Oct 21 16:13:32 2020 Presorting inputs
[methratio] @Wed Oct 21 16:13:32 2020 Processing 5 chromosomes at a time
Traceback (most recent call last):
  File "/usr/bin/methratio.py", line 565, in <module>
    main()
  File "/usr/bin/methratio.py", line 125, in main
    chromPool = ChromPool(maxChromProcs)
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anoconda/lib/python3.7/multiprocessing/pool.py", line 176, in __init__
    self._repopulate_pool()
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anoconda/lib/python3.7/multiprocessing/pool.py", line 231, in _repopulate_pool
    for i in range(self._processes - len(self._pool)):
TypeError: 'float' object cannot be interpreted as an integer

Could you please help me with this?
All dependencies, such as samtools and the required Python modules, are installed.

@zyndagj
Owner

zyndagj commented Oct 28, 2020

Python3 support is still on my wishlist, so I bet that's the cause of the issue you're experiencing. Can you try creating a new conda environment with python2 as follows:

conda create -n bsmapz_env -c bioconda -c conda-forge -c zyndagj "python=2.7" bsmapz

I am hoping to solve this issue with my next major update, but will update the documentation in the meantime.
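
For context, the TypeError in the first traceback is the kind Python 3 raises when a worker count computed with `/` reaches `multiprocessing.Pool`: in Python 3, `/` is true division and returns a float even for two ints, while Python 2 floors the result. The sketch below only illustrates that behavior change. How methratio.py actually computes `maxChromProcs` is not shown in this thread, so the function names and the per-chromosome memory figure are hypothetical.

```python
# Hypothetical illustration; not taken from methratio.py.

def procs_true_division(mem_limit_mb, mem_per_chrom_mb):
    # In Python 3, "/" always yields a float, even when both operands are ints.
    return mem_limit_mb / mem_per_chrom_mb


def procs_floor_division(mem_limit_mb, mem_per_chrom_mb):
    # "//" floors on both Python 2 and 3; int() guards against float inputs,
    # and max() keeps at least one worker.
    return max(1, int(mem_limit_mb // mem_per_chrom_mb))


def accepted_by_range(n):
    # The traceback shows Pool feeding a value derived from its process
    # count to range(), so range() is a cheap stand-in for the failure.
    try:
        range(n)
        return True
    except TypeError:
        return False
```

Under these assumptions, `procs_true_division(11870, 2000)` is rejected by `range()`, while `procs_floor_division(11870, 2000)` gives an int that a pool can accept.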

@tuhina3184
Author

Hi, thank you for replying. I created a bsmapz_env as you recommended, but I am still getting an error, this time during samtools processing.
Below is the log file with the error:

[methratio] @Mon Nov 2 13:52:33 2020 Using 90% of available memory (376 MB) as limit
[methratio] @Mon Nov 2 13:52:33 2020 Presorting inputs
[methratio] @Mon Nov 2 13:52:33 2020 Calling samtools sort on ../CN_2_N_100.bam and using 5 MB of memory
[bam_sort_core] merging from 70656 files and 64 in-memory blocks...
[E::hts_open_format] Failed to open file ../CN_2_N_100.tmpSrt.bam_tmp.1020.bam
samtools sort: fail to open "../CN_2_N_100.tmpSrt.bam_tmp.1020.bam": Too many open files
Traceback (most recent call last):
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 590, in <module>
    main()
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 116, in main
    sortedFiles = map(lambda x: sortFile(x, N=options.np, M=options.mem), options.infiles)
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 116, in <lambda>
    sortedFiles = map(lambda x: sortFile(x, N=options.np, M=options.mem), options.infiles)
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 187, in sortFile
    sp.check_call('samtools sort -m %iM -@ %i -O bam -o %s -T %s_tmp %s'%(samSortMem, N, sortedFile, sortedFile, infile), shell=True)
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/lib/python2.7/subprocess.py", line 190, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command 'samtools sort -m 5M -@ 64 -O bam -o ../CN_2_N_100.tmpSrt.bam -T ../CN_2_N_100.tmpSrt.bam_tmp ../CN_2_N_100.bam' returned non-zero exit status 1

NOTE: We are using the 1.1.3 release of bsmapz.

@zyndagj
Owner

zyndagj commented Nov 2, 2020

It looks like your system didn't have enough free memory this time. methratio gave samtools only 5 MB of memory, so the sort created more than 70,000 temporary files, which would probably overload a distributed filesystem.

Can you try re-running when you have more memory available?
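
The log above carries enough information to see the failure coming: `bam_sort_core` reports 70656 temporary files, and the open fails at `tmp.1020`, which would be consistent with a per-process soft limit of about 1024 open files (an assumption; the actual limit on this system is not shown). A small sketch that pulls the file count from such a log line and compares it with a limit follows. The helper names are hypothetical, it assumes the merge opens all temporary files together as the error suggests, and `resource` is only available on Unix-like systems.

```python
import re

# Matches lines such as:
# [bam_sort_core] merging from 70656 files and 64 in-memory blocks...
MERGE_RE = re.compile(r"merging from (\d+) files")


def temp_file_count(log_line):
    """Return the temp-file count from a bam_sort_core line, or None."""
    match = MERGE_RE.search(log_line)
    return int(match.group(1)) if match else None


def exceeds_open_file_limit(n_files, soft_limit):
    """True if opening n_files at once would hit the descriptor limit."""
    # ">=" because stdin, stdout, and stderr already use descriptors.
    return n_files >= soft_limit


def current_soft_limit():
    """Soft RLIMIT_NOFILE for this process (Unix only)."""
    import resource
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft
```

With the line from this run, `temp_file_count(...)` gives 70656, far above a limit of 1024. Comparing the count against `current_soft_limit()` before a long sort is one way to catch this early.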

@tuhina3184
Author

Yes. I re-ran it on a system with more memory and the samtools sort seemed to work. But I am still getting an OverflowError:

[methratio] @Mon Nov 2 18:49:59 2020 Presorting inputs
[methratio] @Mon Nov 2 18:49:59 2020 Calling samtools sort on CN_2_T_100.bam and using 468 MB of memory
[bam_sort_core] merging from 704 files and 64 in-memory blocks...
[methratio] @Mon Nov 2 19:51:34 2020 Processing 14 chromosomes at a time
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr21 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr20 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr19 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr16 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr18 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:36 2020 Reading chr17 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:37 2020 Reading chr15 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:37 2020 Reading chr10 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:37 2020 Reading chr14 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:37 2020 Reading chr13 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:38 2020 Reading chr12 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:38 2020 Reading chr11 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:40 2020 Reading chr2 from CN_2_T_100.tmpSrt.bam with samtools
[methratio] @Mon Nov 2 19:51:40 2020 Reading chr1 from CN_2_T_100.tmpSrt.bam with samtools
Traceback (most recent call last):
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 590, in <module>
    main()
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/bin/methratio.py", line 138, in main
    ret = chromPool.map(chromWorker, argList, chunksize=1)
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/lib/python2.7/multiprocessing/pool.py", line 253, in map
    return self.map_async(func, iterable, chunksize).get()
  File "/MGMSTAR1/SHARED/ANALYSIS/APPS/external/anaconda3_10/envs/bsmapz_env/lib/python2.7/multiprocessing/pool.py", line 572, in get
    raise self._value
OverflowError: unsigned short is greater than maximum

I was reading about this error, and I saw in a post of yours that it had been fixed in the 1.1.2 release, but I am using version 1.1.3. Kindly help me with this.
Thank you
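
For anyone hitting the same message: in Python, `OverflowError: unsigned short is greater than maximum` is what the standard `array` module raises when a value above 65535 is stored in an array of typecode `'H'` (16 bits on common platforms). Whether methratio.py stores its per-position counts this way is an assumption here, not something this thread confirms. The sketch below reproduces the error and shows one possible guard, a saturating increment; the function name is hypothetical.

```python
from array import array

USHORT_MAX = 65535  # largest value a 16-bit unsigned slot can hold


def saturating_increment(counts, index, amount=1):
    """Add to counts[index], clamping at USHORT_MAX instead of overflowing."""
    counts[index] = min(USHORT_MAX, counts[index] + amount)


counts = array("H", [0, 0, 0])

try:
    counts[0] = USHORT_MAX + 1  # one past the limit, so the store is rejected
    overflowed = False
except OverflowError:
    overflowed = True

# Clamps to USHORT_MAX rather than raising.
saturating_increment(counts, 1, 70000)
```

Clamping loses information at very deep positions, so a wider typecode such as `'I'` is the other obvious option, at the cost of more memory per position.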
