papi enters an infinite loop when logical and physical core ID disagree #241
Comments
Tagging @maartenarnst because he will be interested.
Also tagging @jrmadsen because this issue critically affects the default binaries/tools provided by the Omnitrace releases (on this particular CPU).
@adanalis: We should consider adding a configuration option to disable sysdetect, as it is enabled by default. At least, this would give users the option to proceed with PAPI if they encounter issues with the sysdetect component.
I added a flag to configure that disables the problematic component. The PR is #265. @romintomasetti, could you please test it in your environment?
@jrmadsen I wonder if
@romintomasetti, @jrmadsen, can you please check whether PR #269 fixes the problem and works properly on your system?
Closing as PR #269 should resolve this issue.
We are observing an infinite loop at papi/src/components/sysdetect/linux_cpu_utils.c, lines 769 to 770 in afeb059.
What happens is that papi wrongly identifies the CPU as having 13 cores per socket. After investigation, this is clearly due to the mismatch between logical and physical core IDs.
The part of the code that returns the socket/core/thread counts is not able to handle such a case. See papi/src/components/sysdetect/x86_cpu_utils.c, lines 504 to 510 in afeb059.
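To illustrate the kind of miscount that non-contiguous physical IDs can cause, here is a minimal sketch. It is not the PAPI code; the function names and the example ID list are hypothetical. It contrasts a count derived from the largest ID (which assumes IDs run 0..max without gaps) with a count of distinct IDs.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: estimates the core count as (largest ID + 1).
 * This is only correct when the IDs are contiguous, i.e. 0..max. */
int naive_core_count(const int *core_ids, size_t n)
{
    assert(core_ids != NULL || n == 0);
    int max_id = -1;
    for (size_t i = 0; i < n; i++) {
        if (core_ids[i] > max_id) {
            max_id = core_ids[i];
        }
    }
    return max_id + 1;
}

/* Hypothetical helper: counts distinct IDs, so gaps in the numbering
 * do not inflate the result. Quadratic, which is fine for small n. */
int distinct_core_count(const int *core_ids, size_t n)
{
    assert(core_ids != NULL || n == 0);
    int count = 0;
    for (size_t i = 0; i < n; i++) {
        int seen = 0;
        for (size_t j = 0; j < i; j++) {
            if (core_ids[j] == core_ids[i]) {
                seen = 1;
                break;
            }
        }
        if (!seen) {
            count++;
        }
    }
    return count;
}
```

For a made-up ID list such as {0, 1, 2, 4, 5, 6}, naive_core_count returns 7 while distinct_core_count returns 6. Any later loop that indexes by the inflated count can end up looking for an ID that does not exist.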
If I run papi_hardware_avail in gdb on that CPU and kill it after some time, it is stopped at the following location (note that it tries to query ID 48 although the system only has IDs 0 to 47):
This is the output of lstopo for this CPU (physical indices):