Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Try reducing memory pressure during object writing #83181

Merged
merged 5 commits into from
Mar 9, 2023

Conversation

MichalStrehovsky
Copy link
Member

This is likely going to be a compile time regression, because we're close to finishing the compilation and the managed memory is going to be freed by the ultimate garbage collector: process termination.

But it looks like we're getting OOM-killed for larger apps around here. The native object writer is going to allocate more native memory without giving the managed GC see there's memory pressure. We might be able to tune this with GC.AddMemoryPressure once we have some sort of estimate of native memory used by the native object writer.

Hoping to unblock dotnet/performance#2919.

Cc @dotnet/ilc-contrib @markples

@ghost
Copy link

ghost commented Mar 9, 2023

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas
See info in area-owners.md if you want to be subscribed.

Issue Details

This is likely going to be a compile time regression, because we're close to finishing the compilation and the managed memory is going to be freed by the ultimate garbage collector: process termination.

But it looks like we're getting OOM-killed for larger apps around here. The native object writer is going to allocate more native memory without giving the managed GC see there's memory pressure. We might be able to tune this with GC.AddMemoryPressure once we have some sort of estimate of native memory used by the native object writer.

Hoping to unblock dotnet/performance#2919.

Cc @dotnet/ilc-contrib @markples

Author: MichalStrehovsky
Assignees: MichalStrehovsky
Labels:

area-NativeAOT-coreclr

Milestone: -

@MichalStrehovsky
Copy link
Member Author

@vitek-karas The "Build CoreCLR Tools Unit Tests linux x64 checked" leg is failing in this PR. Digging in the CI logs, the failure is:

xUnit.net Console Runner v2.4.2+f110e5bee5 (64-bit .NET 8.0.0-preview.1.23110.8)
�[37m  Discovering: Mono.Linker.Tests
�[m�[37m  Discovered:  Mono.Linker.Tests
�[m�[37m  Starting:    Mono.Linker.Tests
�[m�[31;1m    Mono.Linker.Tests.TestCases.All.RequiresCapability(t: "RequiresAccessedThrough") [FAIL]
�[m�[37m      Unexpected warning found: ILC: warning IL3002: Mono.Linker.Tests.Cases.RequiresCapability.RequiresAccessedThrough.AccessThroughNewConstraint.Test<T>(): Using member 'Mono.Linker.Tests.Cases.RequiresCapability.RequiresAccessedThrough.AccessThroughNewConstraint.NewConstraintTestType.NewConstraintTestType()' which has 'RequiresAssemblyFilesAttribute' can break functionality when embedded in a single-file app. Message for --NewConstraintTestType.ctor--.
�[m�[37m      Expected: False
�[m�[37m      Actual:   True
�[m�[30;1m      Stack Trace:
�[m�[37m        /_/src/coreclr/tools/aot/Mono.Linker.Tests/TestCasesRunner/ResultChecker.cs(388,0): at Mono.Linker.Tests.TestCasesRunner.ResultChecker.VerifyLoggedMessages(AssemblyDefinition original, TestLogWriter logger, Boolean checkRemainingErrors)
�[m�[37m        /_/src/coreclr/tools/aot/Mono.Linker.Tests/TestCasesRunner/ResultChecker.cs(165,0): at Mono.Linker.Tests.TestCasesRunner.ResultChecker.AdditionalChecking(ILCompilerTestCaseResult linkResult, AssemblyDefinition original)
�[m�[37m        /_/src/coreclr/tools/aot/Mono.Linker.Tests/TestCasesRunner/ResultChecker.cs(88,0): at Mono.Linker.Tests.TestCasesRunner.ResultChecker.Check(ILCompilerTestCaseResult testResult)
�[m�[37m        /_/src/coreclr/tools/aot/Mono.Linker.Tests/TestCases/TestSuites.cs(68,0): at Mono.Linker.Tests.TestCases.All.Run(String testName)
�[m�[37m        /_/src/coreclr/tools/aot/Mono.Linker.Tests/TestCases/TestSuites.cs(51,0): at Mono.Linker.Tests.TestCases.All.RequiresCapability(String t)
�[m�[37m           at InvokeStub_All.RequiresCapability(Object, Object, IntPtr*)
�[m�[37m           at System.Reflection.MethodInvoker.Invoke(Object obj, IntPtr* args, BindingFlags invokeAttr)
�[m�[37m  Finished:    Mono.Linker.Tests
�[m�[37m=== TEST EXECUTION SUMMARY ===
�[m�[37m   Mono.Linker.Tests  Total: 319, Errors: 0, Failed: 1, Skipped: 0, Time: 18.323s
�[m�[m=== COMMAND LINE ===
"/__w/1/s/.dotnet/dotnet" exec --depsfile "/__w/1/s/artifacts/bin/Mono.Linker.Tests/x64/Checked/Mono.Linker.Tests.deps.json" --runtimeconfig "/__w/1/s/artifacts/bin/Mono.Linker.Tests/x64/Checked/Mono.Linker.Tests.runtimeconfig.json"  "/__w/1/s/.packages/xunit.runner.console/2.4.2/tools/netcoreapp2.0/xunit.console.dll" "/__w/1/s/artifacts/bin/Mono.Linker.Tests/x64/Checked/Mono.Linker.Tests.dll" -noautoreporters -xml "/__w/1/s/artifacts/TestResults/Checked/Mono.Linker.Tests_net8.0_x64.xml" -html "/__w/1/s/artifacts/TestResults/Checked/Mono.Linker.Tests_net8.0_x64.html"  > "/__w/1/s/artifacts/log/Checked/Mono.Linker.Tests_net8.0_x64.log" 2>&1

The CI run in your recent #83085 appears to have run only 305 tests, so this looks like a bad interaction between two PRs.

@MichalStrehovsky
Copy link
Member Author

The latest run in the performance repo is still failing. The heuristic is not kicking in (patting myself on the back for adding logging ("Freeing up memory") when GC.Collect kicks in):

[2023/03/10 00:26:14][INFO]   ILC: Emitting debug information
[2023/03/10 00:26:14][INFO]   ILC: Finalizing output to '/home/helixbot/work/A1E40912/w/B4800A16/e/artifacts/obj/BenchmarkDotNet.Autogenerated/Release/net8.0/linux-x64/native/e877111e-668b-4c90-89ca-b3e1762d847f.o'...
[2023/03/10 00:26:14][INFO]   /home/helixbot/work/A1E40912/t/MSBuildTemphelixbot/tmpb3d3cd41abd8428e9021cdf0ff93eec0.exec.cmd: line 2: 13433 Killed                  "/home/helixbot/work/A1E40912/w/B4800A16/e/artifacts/packages/runtime.linux-x64.microsoft.dotnet.ilcompiler/8.0.0-preview.3.23159.3/tools/ilc" @"/home/helixbot/work/A1E40912/w/B4800A16/e/artifacts/obj/BenchmarkDotNet.Autogenerated/Release/net8.0/linux-x64/native/e877111e-668b-4c90-89ca-b3e1762d847f.ilc.rsp"

Since Done writing object file is missing, we're in FinishObjWriter when the process gets OOM killed.

Should we pull out the heuristic and go with what was there originally?

@jkotas
Copy link
Member

jkotas commented Mar 10, 2023

Should we pull out the heuristic and go with what was there originally?

We can give it a try. I think it would be still worth it to have some heuristic eventually to avoid the useless blocking full GC when there is a plenty of memory available. Maybe 50% is too high - it may need to be less, like 20%.

@VSadov
Copy link
Member

VSadov commented Mar 10, 2023

TotalCommittedBytes is sampled at the last GC. It is possible to see a value that is less than actual.
Perhaps if you ensure that some GC happened recently, like force a blocking gen0 before asking, it would give you more reliable info?

@VSadov
Copy link
Member

VSadov commented Mar 10, 2023

Or perhaps you should look at something like Process.PrivateMemorySize64 instead?

@jkotas
Copy link
Member

jkotas commented Mar 10, 2023

The typical compilation has many GCs. The number from last GC should be good-enough approximation. The check can be very conservative.

@ghost ghost locked as resolved and limited conversation to collaborators Apr 9, 2023
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants