
Inference problem on the CPU #4032

Merged 3 commits into main from ss/fix-inference-issue-on-cpus on Jan 8, 2025

Conversation

@simone-silvestri (Collaborator) commented Jan 7, 2025

Working with @milankl I noticed that a stripped-down version of ClimaOcean (with only the HydrostaticFreeSurfaceModel) was terribly slow compared to a SpeedyWeather simulation with the same grid size.

After some investigation I found that the problem was the forcing passed as an argument to the momentum kernels after the splatting of the other args. I am not sure exactly what caused this loss of performance, but here is a benchmark of the code:

```julia
using ClimaOcean
using Oceananigans
using Oceananigans.Units
using BenchmarkTools

arch    = Oceananigans.CPU()
r_faces = ClimaOcean.exponential_z_faces(; Nz = 8, depth = 5000)

grid = LatitudeLongitudeGrid(arch,
                             size = (360, 152, 8),
                             latitude = (-75.5, 75.5),
                             longitude = (0, 360),
                             halo = (6, 6, 2),
                             z = r_faces)

ocean = ClimaOcean.ocean_simulation(grid;
                                    closure = nothing,
                                    momentum_advection = VectorInvariant(),
                                    tracer_advection = Centered(),
                                    free_surface = SplitExplicitFreeSurface(grid; substeps = 20))

# Two warm-up steps so compilation time is excluded from the benchmark
time_step!(ocean)
time_step!(ocean)

@benchmark time_step!(ocean)
```

on main:
[Screenshot: benchmark results on main, Jan 7 2025]
and on this branch with the correction:
[Screenshot: benchmark results on this branch, Jan 7 2025]

This should not impact GPU execution.
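The failure mode described above can be sketched in a few lines (hypothetical code, not the actual Oceananigans kernels): when a splatted call goes through an abstractly typed container, Julia cannot infer the argument types, so every invocation falls back to dynamic dispatch and heap allocation, while the same call with a concrete tuple specializes fully.

```julia
# Hypothetical illustration of the inference problem, not the real kernel code.
kernel(forcing, args...) = forcing + sum(Float64, args)

# The splat is only inferable if `a` has a concrete (tuple) type.
call_kernel(forcing, a) = kernel(forcing, a...)

args_any = Any[1.0, 2.0, 3.0]   # abstractly typed container: defeats inference
args_tup = (1.0, 2.0, 3.0)      # concrete tuple: the call fully specializes

call_kernel(0.5, args_any)      # warm up so compilation is not measured
call_kernel(0.5, args_tup)

# The non-inferable path allocates on every call; the inferable one does not.
println(@allocated call_kernel(0.5, args_any))   # > 0: dynamic dispatch + boxing
println(@allocated call_kernel(0.5, args_tup))
```

The analogous fix in a real kernel is to make sure the trailing argument is part of a fully concretely typed call, so the compiler can specialize on it.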

@glwagner (Member) commented Jan 7, 2025

Oh, I think this is basically #2996 coming back. This keeps happening, so I wonder if there is a way to test for it.

@simone-silvestri (Collaborator, Author) commented
We might need to add some performance tests, possibly very simple ones; for example, tests that check that no extra allocations arise from types not being inferred correctly.
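A very simple allocation check along those lines could be sketched as follows (hypothetical struct and function names, not Oceananigans types): a field with an abstract type is enough to break inference and trigger the allocations such a test would catch, while a parametric version stays allocation-free.

```julia
using Test

struct Unstable      # hypothetical: untyped field breaks inference
    x
end

struct Stable{T}     # hypothetical: parametric field type is inferable
    x::T
end

accumulate10(s) = (acc = 0.0; for _ in 1:10; acc += s.x; end; acc)

accumulate10(Stable(1.0))   # warm up so compilation is not measured

# The return type is inferable for Stable, and the hot loop does not allocate:
@test @inferred(accumulate10(Stable(1.0))) == 10.0
@test @allocated(accumulate10(Stable(1.0))) == 0

# For Unstable the same checks would fail: every `acc += s.x` boxes the result.
```

Checks like these, run against a small model setup in CI, would catch regressions of this kind before they reach main.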

@simone-silvestri simone-silvestri merged commit 19baa39 into main Jan 8, 2025
46 checks passed
@simone-silvestri simone-silvestri deleted the ss/fix-inference-issue-on-cpus branch January 8, 2025 00:02