Segfault in Compiler::fgGetPredForBlock on MacOS x64 #111563

Closed
jkoritzinsky opened this issue Jan 18, 2025 · 7 comments · Fixed by #111719
Labels
area-CodeGen-coreclr: CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI
blocking-clean-ci: Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms'
in-pr: There is an active PR which will close this issue when it is merged
Known Build Error: Use this to report build issues in the .NET Helix tab

Comments

@jkoritzinsky
Member

jkoritzinsky commented Jan 18, 2025

Build Information

Build: https://dev.azure.com/dnceng-public/cbb18261-c48f-4abb-8651-8cdcb5474649/_build/results?buildId=921531
Build error leg or test failing: System.Net.Sockets.Tests.WorkItemExecution
Pull request: #111512

Error Message

Fill the error message using step by step known issues guidance.

{
  "ErrorMessage": "Segmentation fault: 11  \"$RUNTIME_PATH/dotnet\" exec --runtimeconfig System.Net.Sockets.Tests.runtimeconfig.json --depsfile System.Net.Sockets.Tests.deps.json xunit.console.dll System.Net.Sockets.Tests.dll",
  "ErrorPattern": "",
  "BuildRetry": false,
  "ExcludeConsoleLog": false
}

Known issue validation

Build: 🔎 https://dev.azure.com/dnceng-public/public/_build/results?buildId=921531
Error message validated: [Segmentation fault: 11 "$RUNTIME_PATH/dotnet" exec --runtimeconfig System.Net.Sockets.Tests.runtimeconfig.json --depsfile System.Net.Sockets.Tests.deps.json xunit.console.dll System.Net.Sockets.Tests.dll]
Result validation: ✅ Known issue matched with the provided build.
Validation performed at: 1/18/2025 1:11:29 AM UTC

Report

| Build | Definition | Test | Pull Request |
| --- | --- | --- | --- |
| 923111 | dotnet/runtime | System.Net.Sockets.Tests.WorkItemExecution | #111398 |
| 922978 | dotnet/runtime | System.Net.Sockets.Tests.WorkItemExecution | #110818 |
| 922813 | dotnet/runtime | System.Net.Sockets.Tests.WorkItemExecution | #111398 |
| 922728 | dotnet/runtime | System.Net.Sockets.Tests.WorkItemExecution | #111327 |
| 921531 | dotnet/runtime | System.Net.Sockets.Tests.WorkItemExecution | #111512 |

Summary

| 24-Hour Hit Count | 7-Day Hit Count | 1-Month Count |
| --- | --- | --- |
| 0 | 5 | 5 |
@jkoritzinsky jkoritzinsky added blocking-clean-ci Blocking PR or rolling runs of 'runtime' or 'runtime-extra-platforms' Known Build Error Use this to report build issues in the .NET Helix tab labels Jan 18, 2025
@dotnet-policy-service dotnet-policy-service bot added the untriaged New issue has not been triaged by the area owner label Jan 18, 2025
Contributor

Tagging subscribers to this area: @dotnet/area-infrastructure-libraries
See info in area-owners.md if you want to be subscribed.

Contributor

Tagging subscribers to this area: @dotnet/ncl
See info in area-owners.md if you want to be subscribed.

@jkotas
Member

jkotas commented Jan 18, 2025

Crash in the JIT on background compilation thread.

Stacktrace of the crash:

PROCCreateCrashDump(std::__1::vector<char const*, std::__1::allocator<char const*> >&, char*, int, bool)
PROCCreateCrashDumpIfEnabled
invoke_previous_action(sigaction*, int, __siginfo*, void*, bool)
_sigtramp
Compiler::fgGetPredForBlock(BasicBlock*, BasicBlock*)
Compiler::ThreeOptLayout::GetPartitionCostDelta(unsigned int, unsigned int, unsigned int, unsigned int, unsigned int)
Compiler::ThreeOptLayout::RunGreedyThreeOptPass(unsigned int, unsigned int)
Compiler::ThreeOptLayout::Run()
Compiler::fgSearchImprovedLayout()
ActionPhase<Compiler::compCompile(void**, unsigned int*, JitFlags*)::$_5>::DoPhase()
Phase::Run()
Compiler::compCompile(void**, unsigned int*, JitFlags*)
Compiler::compCompileHelper(CORINFO_MODULE_STRUCT_*, ICorJitInfo*, CORINFO_METHOD_INFO*, void**, unsigned int*, JitFlags*)
Compiler::compCompile(CORINFO_MODULE_STRUCT_*, void**, unsigned int*, JitFlags*)
jitNativeCode(CORINFO_METHOD_STRUCT_*, CORINFO_MODULE_STRUCT_*, ICorJitInfo*, CORINFO_METHOD_INFO*, void**, unsigned int*, JitFlags*, void*)
CILJit::compileMethod(ICorJitInfo*, CORINFO_METHOD_INFO*, unsigned int, unsigned char**, unsigned int*)
invokeCompileMethodHelper(EEJitManager*, CEEInfo*, CORINFO_METHOD_INFO*, CORJIT_FLAGS, unsigned char**, unsigned int*)
invokeCompileMethod(EEJitManager*, CEEInfo*, CORINFO_METHOD_INFO*, CORJIT_FLAGS, unsigned char**, unsigned int*)
UnsafeJitFunction(PrepareCodeConfig*, COR_ILMETHOD_DECODER*, CORJIT_FLAGS*, unsigned int*)
MethodDesc::JitCompileCodeLocked(PrepareCodeConfig*, COR_ILMETHOD_DECODER*, ListLockEntryBase<NativeCodeVersion>*, unsigned int*)
MethodDesc::JitCompileCodeLockedEventWrapper(PrepareCodeConfig*, ListLockEntryBase<NativeCodeVersion>*)
MethodDesc::JitCompileCode(PrepareCodeConfig*)
MethodDesc::PrepareILBasedCode(PrepareCodeConfig*)
TieredCompilationManager::CompileCodeVersion(NativeCodeVersion)
TieredCompilationManager::DoBackgroundWork(unsigned long long*, unsigned long long, unsigned long long)
TieredCompilationManager::BackgroundWorkerStart()
TieredCompilationManager::BackgroundWorkerBootstrapper1(void*)

Note that a crash dump was captured, which you should be able to use to find the root cause.

@jkotas jkotas added area-CodeGen-coreclr CLR JIT compiler in src/coreclr/src/jit and related components such as SuperPMI and removed area-System.Net.Sockets labels Jan 18, 2025
Contributor

Tagging subscribers to this area: @JulieLeeMSFT, @jakobbotsch
See info in area-owners.md if you want to be subscribed.

@jkotas jkotas changed the title Segfault in System.Net.Sockets.Tests.dll test suite on MacOS x64 Segfault in Compiler::fgGetPredForBlock on MacOS x64 Jan 18, 2025
@EgorBo
Member

EgorBo commented Jan 18, 2025

cc @amanasifkhalid ThreeOptLayout

@amanasifkhalid amanasifkhalid self-assigned this Jan 20, 2025
@amanasifkhalid amanasifkhalid added this to the 10.0.0 milestone Jan 20, 2025
@amanasifkhalid amanasifkhalid removed the untriaged New issue has not been triaged by the area owner label Jan 20, 2025
@jakobbotsch
Member

I've been seeing what I think is the same crash in various superpmi-diffs/superpmi-replay jobs, e.g. https://dev.azure.com/dnceng-public/public/_build/results?buildId=923770&view=results. I am able to reproduce the crash but only with the JIT from CI and that one does not have symbols, so the debugging is a bit difficult. I see a crash in what looks like Compiler::ThreeOptLayout::GetPartitionCostDelta.

One thing I noticed is that nothing is checking that ebdTryLast is reachable/has an index:

modified |= RunThreeOptPass(tryBeg, HBtab->ebdTryLast);
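To make the concern concrete, here is a rough sketch of the kind of guard being described. It is not the JIT's code: `LayoutBlock`, `hasValidIndex`, and `runPassIfIndexed` are hypothetical names, and the model assumes a pass works on positions in an ordering array, so an endpoint that was never given an index would turn into an out-of-range position.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a basic block; -1 means "no index was assigned".
struct LayoutBlock
{
    int index = -1;
};

// True only if the block has an index and that slot of the ordering really holds this block.
inline bool hasValidIndex(const std::vector<const LayoutBlock*>& order, const LayoutBlock* block)
{
    return block != nullptr && block->index >= 0 &&
           static_cast<std::size_t>(block->index) < order.size() &&
           order[static_cast<std::size_t>(block->index)] == block;
}

// Invokes pass(startPos, endPos) only when both endpoints are indexed and correctly ordered.
// Returns false without calling the pass otherwise.
template <typename Pass>
bool runPassIfIndexed(const std::vector<const LayoutBlock*>& order,
                      const LayoutBlock*                     first,
                      const LayoutBlock*                     last,
                      Pass                                   pass)
{
    if (!hasValidIndex(order, first) || !hasValidIndex(order, last) || first->index > last->index)
    {
        return false; // an unindexed endpoint would otherwise become a bogus position
    }
    return pass(static_cast<unsigned>(first->index), static_cast<unsigned>(last->index));
}
```

In this model, a call whose `last` block was never indexed is rejected up front instead of reaching code that indexes into the ordering.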

@amanasifkhalid
Member

amanasifkhalid commented Jan 22, 2025

> I've been seeing what I think is the same crash in various superpmi-diffs/superpmi-replay jobs, e.g. https://dev.azure.com/dnceng-public/public/_build/results?buildId=923770&view=results. I am able to reproduce the crash but only with the JIT from CI and that one does not have symbols, so the debugging is a bit difficult. I see a crash in what looks like Compiler::ThreeOptLayout::GetPartitionCostDelta.

Thanks for pointing this out. It seems odd that none of the Debug/Checked JITs are hitting asserts. Were you able to repro the failure with the Release win-x64 JIT from the build you linked? I'm unable to with SPMI replay...

> One thing I noticed is that nothing is checking that ebdTryLast is reachable/has an index:

In the above loop where we assign indices, if a block is part of a try region, we update the try region's ebdTryLast pointer to that block. So if the try region's entry block is reachable/has an index, then the last try block we found in the hot region (ebdTryLast) must also be reachable/have an index. I added an assert for this locally, and it didn't catch anything, so I suspect this isn't an issue.
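The argument above can be modeled with a small sketch. This is a simplified illustration, not the actual JIT loop: `Block`, `TryRegion`, `assignIndices`, and `tryLastIsIndexed` are hypothetical names, and the model assumes that only reachable blocks receive an index and that the try entry block belongs to its own region.

```cpp
#include <cassert>
#include <vector>

struct TryRegion;

// Hypothetical, simplified basic block.
struct Block
{
    bool       reachable = false;
    int        index     = -1;      // -1 means "no index was assigned"
    TryRegion* tryRegion = nullptr; // enclosing try region, if any
};

// Hypothetical, simplified try region descriptor.
struct TryRegion
{
    Block* tryBeg  = nullptr; // entry block of the try
    Block* tryLast = nullptr; // may be stale before indices are assigned
};

// Walks the layout in order and gives each reachable block the next index. Every indexed
// block inside a try region becomes that region's current "last" block.
inline int assignIndices(const std::vector<Block*>& layout)
{
    int next = 0;
    for (Block* block : layout)
    {
        if (!block->reachable)
        {
            continue; // unreachable blocks get no index and never update tryLast
        }
        block->index = next++;
        if (block->tryRegion != nullptr)
        {
            block->tryRegion->tryLast = block;
        }
    }
    return next; // number of indexed blocks
}

// The invariant from the comment: an indexed try entry implies an indexed tryLast.
inline bool tryLastIsIndexed(const TryRegion& region)
{
    if (region.tryBeg == nullptr || region.tryBeg->index < 0)
    {
        return true; // region would be skipped, nothing to check
    }
    return region.tryLast != nullptr && region.tryLast->index >= 0;
}
```

Under these assumptions, a stale `tryLast` that pointed at an unreachable block is overwritten as soon as the entry block is indexed, which matches the reasoning that a local assert for this would not fire.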

@dotnet-policy-service dotnet-policy-service bot added the in-pr There is an active PR which will close this issue when it is merged label Jan 22, 2025