Have you checked the FAQ? https://github.com/google/deepvariant/blob/r1.8/docs/FAQ.md:
yes
Describe the issue:
(A clear and concise description of what the issue is.)
I was running deepvariant_pangenome_aware_deepvariant on vg Giraffe-mapped BAM files. However, some of the samples hit a "Process ForkProcess" issue: the run didn't throw an error, didn't terminate properly, and produced no output files.
Setup
Operating system: slurm
DeepVariant version: 1.8.0
Installation method (Docker, built from source, etc.): singularity pull
Type of data: (sequencing instrument, reference genome, anything special that is unlike the case studies?)
Illumina human 30x WGS, vg Giraffe-mapped HPRC
Steps to reproduce:
Command:
singularity exec -B /path/:/path/ /path/deepvariant_pangenome_aware_deepvariant-1.8.0.sif /opt/deepvariant/bin/run_pangenome_aware_deepvariant \
--model_type=WGS \
--ref=/path/HPRC.GRCh38.reordered.fa \
--reads=/path/$sample_name.surject.GRCh38.sorted.dedup.lefted.realigned.bam \
--num_shards=4 \
--sample_name_reads=$sample_name \
--output_vcf /path/$sample_name.deepvariant.vcf.gz \
--output_gvcf /path/$sample_name.deepvariant.gvcf.gz \
--pangenome /path/HPRC_graph.gbz \
--sample_name_pangenome HPRC \
--regions chr6:28000000-35000000 \
--disable_small_model \
--intermediate_results_dir /path/dpvariant
Error trace: (if applicable)
The logs indicate the program was running normally until encountering the following issues:
2025-01-18 22:43:10.537301: W tensorflow/compiler/xla/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcuda.so.1';
2025-01-18 22:43:10.537341: W tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:265] failed call to cuInit: UNKNOWN ERROR (303)
I0118 22:43:10.537735 47448200671232 call_variants.py:918] call_variants: env = {'BASH_FUNC_module()': '() { eval /usr/bin/modulecmd bash $*\n}', 'SH
I0118 22:43:10.659484 47448200671232 call_variants.py:785] Total 1 writing processes started.
I0118 22:43:10.661774 47448200671232 call_variants.py:796] Use saved model: True
I0118 22:43:10.665955 47448200671232 dv_utils.py:325] From /path/dpvariant/make_examples_pangenome_aware_dv.t
I0118 22:43:21.476414 47448200671232 dv_utils.py:325] From /opt/models/pangenome_aware_deepvariant/wgs/example_info.json: Shape of input examples: [200,
I0118 22:43:21.476675 47448200671232 call_variants.py:814] example_shape: [200, 221, 7]
Process ForkProcess-1:
Traceback (most recent call last):
File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
self._target(*self._args, **self._kwargs)
File "/tmp/Bazel.runfiles_yqt9b630/runfiles/com_google_deepvariant/deepvariant/call_variants.py", line 551, in post_processing
item = output_queue.get(timeout=180)
File "/usr/lib/python3.10/multiprocessing/queues.py", line 114, in get
raise Empty
_queue.Empty
I0118 22:46:46.215257 47448200671232 call_variants.py:891] Predicted 1024 examples in 1 batches [19.962 sec per 100].
I0118 23:42:47.613373 47448200671232 call_variants.py:967] Complete: call_variants.
Does the quick start test work on your system?
Yes, the quick start test works, and most of the samples finish normally.
Any additional context:
Initially, I thought the issue was caused by the small model, so I added the --disable_small_model parameter. While this allowed some samples to run successfully, the same issue persists for other samples.
@kishwarshafin Hi, I will try that; it's still running. In the meantime, I found that when I reran the same command (chr6:28000000-35000000), some of the previously failed samples ran successfully. This suggests that the same command can produce different results, which makes me question the reliability of the previously successful runs.
@EEEdyeah are you running on a system that pauses the processes? It seems like in your run, call_variants was paused and the queue did not receive anything for 180 seconds, which is why it got killed. Can you try setting num cpus to 0 from the command line and see if it still gets killed?
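(For readers hitting the same traceback: below is a minimal, self-contained sketch, not DeepVariant's actual code, of the failure mode described above. A consumer process reads from a multiprocessing queue with a fixed timeout; if the producer stalls, for example because the job is paused or starved of CPU, the get() call raises queue.Empty, i.e. _queue.Empty, and the fork process dies, just as ForkProcess-1 does in the log. All names and the shortened timeout are illustrative.)

import multiprocessing
import time

TIMEOUT_S = 5  # the real code uses timeout=180; shortened for the demo


def post_processing(output_queue):
    # Consumer loop: roughly mirrors a post_processing worker that drains a
    # results queue. If nothing arrives within TIMEOUT_S, get() raises
    # queue.Empty, which is unhandled here and kills the process with a
    # traceback like the one in the log above.
    while True:
        item = output_queue.get(timeout=TIMEOUT_S)
        if item is None:  # sentinel: producer is finished
            return
        print("processed", item)


if __name__ == "__main__":
    q = multiprocessing.Queue()
    worker = multiprocessing.Process(target=post_processing, args=(q,))
    worker.start()
    q.put("batch-0")
    # Simulate a stalled producer: nothing is enqueued for longer than the
    # timeout, so the worker raises queue.Empty and exits with a traceback.
    time.sleep(TIMEOUT_S + 2)
    q.put(None)
    worker.join()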
@kishwarshafin Sorry for the late reply. I’m not entirely sure what caused the issue, but I think I’ve found a solution. Running each job on a separate node seems to prevent the error from occurring.
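(A hedged sketch of that workaround under Slurm, assuming a standard sbatch setup; the resource values and the use of --exclusive are illustrative and not taken from the original post.)

#!/bin/bash
# Illustrative Slurm batch header: give each DeepVariant job a whole node so
# concurrent jobs cannot pause or starve call_variants. CPU, memory and time
# values are placeholders.
#SBATCH --job-name=dv_pangenome
#SBATCH --nodes=1
#SBATCH --exclusive           # one job per node, per the workaround above
#SBATCH --cpus-per-task=4     # matches --num_shards=4 in the command above
#SBATCH --mem=64G
#SBATCH --time=24:00:00

singularity exec -B /path/:/path/ \
  /path/deepvariant_pangenome_aware_deepvariant-1.8.0.sif \
  /opt/deepvariant/bin/run_pangenome_aware_deepvariant \
  --model_type=WGS \
  --num_shards=4 \
  ...   # remaining flags as in the command in the issue body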