Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve performance of certain physics queries when using Jolt Physics #101071

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

mihe
Copy link
Contributor

@mihe mihe commented Jan 3, 2025

(This addresses the to-do list item in #99895 titled "Get rid of heap allocations in shape queries when requested hits are less or equal to default" and brings the Jolt Physics module in line with the Godot Jolt extension, which also utilizes this optimization, albeit implemented differently.)

Since most physics queries in Godot have a default upper limit to how many hits they can return (e.g. 32 hits for intersect_point) there is a ripe opportunity for omitting the memory allocations associated with storing these hits in the cases where we don't in fact exceed this default limit, which is almost certainly the vast majority of cases.

This pull request adds a new container to the Jolt Physics module called JoltInlineVector, which acts as a hybrid between a stack-allocated buffer and a heap-allocated one, and switches from the former to the latter once a certain templatized capacity has been exceeded. These containers are sometimes referred to as a "small vector" in other codebases.

This new container is then used in the JoltQueryCollector*Multi classes, which themselves take a templatized default capacity appropriate for the callsite, allowing us to omit1 the memory allocations associated with storing the hits for the following physics queries:

The actual performance benefits of this will likely vary greatly depending on the platform, but in my measurements on Windows (both in Superluminal and measuring with QueryPerformanceCounter) I'm seeing roughly an average reduction of 10-15% CPU time, with certain larger spikes (presumably from occasional page faults) disappearing completely.

Here's a before-and-after profiling of PhysicsDirectSpaceState3D.cast_motion, where the motion vector spans most of the level in GDQuest's Robo Blast demo:

Before
After

I tried to keep the implementation of this new container as minimal as possible, but I'll admit it turned out to be a few lines longer than I hoped it would be. I do still think this optimization is worthwhile though, since physics queries tend to be plentiful in a lot of games.

Footnotes

  1. Note that even with this optimization there are still memory allocations happening when performing physics queries from scripts, as the script interface relies on things like TypedArray<Dictionary> for its results, which still allocate plenty of memory, so this by no means removes all allocations from the queries listed here.

@mihe mihe added this to the 4.x milestone Jan 3, 2025
@mihe mihe requested a review from a team as a code owner January 3, 2025 13:34
@mihe
Copy link
Contributor Author

mihe commented Jan 3, 2025

I'd be curious to hear your take on this @jrouwe, and whether you think it's worthwhile to begin with.

I'm personally so used to seeing this type of container be used in larger game-related codebases that I didn't even bother to do any profiling when making this same change to the extension, and I must admit the improvements were not as drastic as I had imagined. But again, that probably depends greatly on the platform/machine as well.

The fact that Godot fairly liberally allocates memory elsewhere (through the default system allocator) does make this change seem a bit "against the grain" though.

@mihe
Copy link
Contributor Author

mihe commented Jan 3, 2025

I suppose it's also worth discussing whether this should even be its own container or instead just a custom allocator. Ideally it would be a custom allocator for LocalVector, perhaps even added to core/templates/, but LocalVector doesn't support custom allocators (nor aligned allocations) at the moment, so it would have to be for JPH::Array instead.

However, I did find the custom std::allocator that I used in the extension to be quite messy with its various configurations and gotchas, and not much shorter than this container implementation, hence why I went with this more straight-forward approach. Obviously you lose out on a lot of the methods with this custom container though, so in a more general setting it would likely balloon in size quite quickly.

@jrouwe
Copy link
Contributor

jrouwe commented Jan 3, 2025

I think avoiding memory allocations is very useful and a good way to speed things up!

A custom allocator was indeed what I was thinking about too and I thought it would be fun to write one:

jrouwe/JoltPhysics#1435

Turns out it wasn't fun at all, mainly because Jolt can fall back to using std::vector when JPH_USE_STD_VECTOR is defined and the rebind logic is not fun with an allocator like this.

Edit: The PR has been merged now, feel free to use this allocator or to use what you have already written.

@mihe
Copy link
Contributor Author

mihe commented Jan 4, 2025

Turns out it wasn't fun at all, mainly because Jolt can fall back to using std::vector when JPH_USE_STD_VECTOR is defined and the rebind logic is not fun with an allocator like this.

Yeah, that looks about as fun as I remember it being. 😅 I'm glad you caught both the rebinding and non-stateless gotchas.

I appreciate the addition though. I'm not sure I can justify bumping Jolt just for this, but I suppose we could switch to your STLLocalAllocator once we do, whether that's before or after this is merged.

@jrouwe
Copy link
Contributor

jrouwe commented Jan 4, 2025

You could also only add STLLocalAllocator.h and not upgrade the rest as it doesn't depend on any other changes.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants