setenv() is not thread safe #46002
Comments
assign core |
New categories assigned: core @Dr15Jones,@makortel,@smuzaffar you have been requested to review this Pull request/Issue and eventually sign? Thanks |
cms-bot internal usage |
A new Issue was created by @makortel. @Dr15Jones, @antoniovilela, @makortel, @mandrenguyen, @rappoccio, @sextonkennedy, @smuzaffar can you please review it and eventually sign/assign? Thanks. cms-bot commands are listed here |
Here are some examples of the present cmssw/PhysicsTools/TensorFlow/src/TensorFlow.cc Lines 90 to 98 in 6e31ebe
that is called from e.g.
|
I tested that modifications to the One option would be to implement the environment variable setting purely in the python side. I think the solution would have to
|
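The Python-side option mentioned above could look roughly like the following. This is a minimal sketch, not the actual CMSSW approach: the helper name `set_env_default` and the variable name `EXAMPLE_LOG_LEVEL` are hypothetical, and it assumes the configuration is evaluated in the same process before any threads are started.

```python
import os


def set_env_default(name, value, environ=os.environ):
    """Set name=value only if name is not defined yet.

    Returns the value that is in effect afterwards, so an existing
    setting (e.g. from the user's shell) is never overwritten.
    """
    return environ.setdefault(name, value)


# EXAMPLE_LOG_LEVEL is a hypothetical variable name used for illustration only.
effective = set_env_default("EXAMPLE_LOG_LEVEL", "3")
print("EXAMPLE_LOG_LEVEL =", effective)
```

Because the call only fills in a missing value, it mirrors `setenv(name, value, 0)` semantics while keeping the modification out of the parallel part of the job.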
Actually the next step should be a more detailed analysis of the existing use cases of setting the environment variables in CMSSW code. We should also find a way to get an understanding if any 3rd party code calls |
It seems to me the most likely cmssw/PhysicsTools/TensorFlow/src/TensorFlow.cc Lines 90 to 98 in 5587561
Unsafe calls to
Possibly unsafe call:
Acceptable calls (in single-threaded test code in
Problems with this pattern
Therefore, I would remove the |
cmssw/GeneratorInterface/RivetInterface/plugins/RivetAnalyzer.cc Lines 53 to 73 in 5587561
Would it be feasible to set |
cmssw/GeneratorInterface/SherpaInterface/src/SherpackUtilities.cc Lines 154 to 161 in 5587561
Seems that this code was added in #21682 to work around a "feature" of OpenMPI 2.0 and 2.1 (more information in #21419, https://hypernews.cern.ch/HyperNews/CMS/get/edmFramework/3807/1/1/1.html, https://www.open-mpi.org/faq/?category=osx#startup-errors-with-open-mpi-2.0.x). Our OpenMPI version is now 4.1.6, maybe this workaround is no longer needed? @cms-sw/generators-l2 For future reference, the |
cmssw/HeterogeneousCore/MPIServices/src/MPIService.cc Lines 46 to 51 in 5587561
As Services are constructed in the serial part of |
cmssw/OnlineDB/SiStripConfigDb/src/SiStripConfigDb.cc Lines 260 to 292 in 5587561
potentially modifying an existing value or a value from configuration. The
|
cmssw/OnlineDB/EcalCondDB/test/LaserSeqToDB.cpp Line 1015 in 5587561
This program looks serial up to that point (unless ROOT spawns threads), so the call is safe. |
cmssw/SimG4CMS/ShowerLibraryProducer/plugins/CastorShowerLibraryMaker.cc Lines 686 to 696 in 5587561
to set the CASTOR_SL_PG_MAXE environment variable. Since the code is commented out, it is "safe", but maybe this commented-out code could be removed? @cms-sw/simulation-l2
|
assign ml, generators, simulation |
I made a PR removing |
@smuzaffar Building on top of the
#!/bin/bash
cat << EOF | gdb --args "$@"
set pagination off
set breakpoint pending on
# breakpoint between Service construction and first parallel section
break ScheduleItems::initMisc
run
# we don't need that breakpoint anymore
clear ScheduleItems::initMisc
# add additional breakpoint
break setenv
command
where
continue
end
continue
quit
EOF
Would it be feasible to run e.g. all runTheMatrix workflows for the default IB (on the default architecture) e.g. once a week? |
sounds reasonable to me @makortel . Any objections @cms-sw/ml-l2 ? |
right, scram can set these variables properly and we should do it. I will open a cmsdist PR |
Sure, we can do that. So how will it work? Should we just use |
Maybe, I didn't really look into details. Where should we place the script? (in principle the script could also be extended to catch other unwanted functions, but maybe it's better to not overgeneralize) |
@makortel , in previous releases of Geant4 a campaign was carried out to substitute setenv by std::setenv and to dramatically reduce the number of calls to std::setenv. So, Geant4 should not bring any problem of this kind. Concerning the Castor shower library, I will make a PR soon. |
these we can set via scram global runtime hooks and that should fix the existing/old releases too. as code @makortel , should we do it globally or do you want to fix it in 14.2.X for now? |
I'm fine with either way. |
Just to clear up confusion, do you mean
Thanks! |
Assuming from #46065 (comment) that @cms-sw/ml-l2 is ok with the change, I opened a PR to cmsdist to set the environment variable cms-sw/cmsdist#9418 |
please stop tagging me, I'm the wrong guy |
Apologies! |
Tagging the right person this time @mseidel42 ... |
I have added cms-sw/cms-common@0471f57 which should do what cmssw/GeneratorInterface/RivetInterface/plugins/RivetAnalyzer.cc Lines 53 to 73 in 5587561
If RIVET_REF_PATH or RIVET_INFO_PATH are not set, then these are set to point to src/GeneratorInterface/RivetInterface/data under LOCALTOP and RELEASETOP.
When deployed then this script will run for all cmssw releases and set |
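The rule described in this comment can be restated as a small sketch. The actual hook lives in cms-common and is not reproduced here; the function name `rivet_defaults`, the choice to join the two directories with the path separator, and the ordering (LOCALTOP before RELEASETOP) are assumptions made for illustration.

```python
import os

RIVET_VARS = ("RIVET_REF_PATH", "RIVET_INFO_PATH")
DATA_SUBDIR = os.path.join("src", "GeneratorInterface", "RivetInterface", "data")


def rivet_defaults(localtop, releasetop, environ):
    """Return the Rivet variables that would need to be set.

    Only variables missing from `environ` are returned; variables the
    user already defined are left untouched.
    """
    # Assumption: both areas are searched, local area first.
    default = os.pathsep.join(
        os.path.join(top, DATA_SUBDIR) for top in (localtop, releasetop)
    )
    return {name: default for name in RIVET_VARS if name not in environ}


# Hypothetical directories, for illustration only.
for name, value in rivet_defaults("/local/area", "/release/area", {}).items():
    print(name, "=", value)
```

Computing the values in the runtime environment setup, before `cmsRun` starts, means the C++ code only ever has to read them.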
@makortel , sorry, of course, getenv. |
+simulation |
We also have |
I went through them, and all the ones needing attention seemed to be related to CondDB access. I opened a separate issue for them #47038. |
This issue spins off from #44659 after the discovery that our code is calling setenv() in the parallel part of cmsRun, and concurrently to std::getenv() calls (#44659 (comment)).
We need to first analyze the existing cases to call setenv(), and remove those for which there is a better way.
If there are any legitimate use cases left, we need to figure out a mechanism for setting environment variables during the serial part of cmsRun (Service constructors would satisfy the requirement, but we might want something else than moving all the setenv() code to new Services).
Then we need to migrate the existing code calling setenv().
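One possible shape for such a mechanism is a guard that accepts modifications only until the parallel part begins. The sketch below is purely illustrative and written in Python for brevity; the class name `SerialPhaseEnvironment` and its methods are hypothetical, and a real implementation would have to live in the C++ framework.

```python
import os
import threading


class SerialPhaseEnvironment:
    """Allow environment modifications only before freeze() is called."""

    def __init__(self, environ=os.environ):
        self._environ = environ
        self._lock = threading.Lock()
        self._frozen = False

    def set(self, name, value):
        # The lock keeps set() and freeze() from interleaving.
        with self._lock:
            if self._frozen:
                raise RuntimeError(
                    f"cannot set {name}: the parallel phase has already started"
                )
            self._environ[name] = value

    def freeze(self):
        """Mark the end of the serial phase; later set() calls raise."""
        with self._lock:
            self._frozen = True


# Demonstration on a plain dict so the real environment is not touched.
env = {}
guard = SerialPhaseEnvironment(env)
guard.set("EXAMPLE_VAR", "1")  # hypothetical variable, allowed in the serial phase
guard.freeze()
```

Failing loudly after the freeze point turns a silent data race into an error that shows up in testing.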