-
Notifications
You must be signed in to change notification settings - Fork 203
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Immediate NaNs running a non-hydrostatic model with an immersed boundary and open boundary conditions and immersed pressure solver #3831
Comments
This isn't true --- our pressure solver can handle any boundary conditions in principle.
I believe this problem is ill-posed. I would first try periodic boundary conditions with a sponge layer on one side. You probably need a radiation condition on the right if you want to use open boundaries...
You can test this by omitting the preconditioner. It might be appropriate to convert this to a discussion, at least until we can prove there is a bug to be fixed. |
@ali-ramadhan interestingly, when I run your exact example things take a little longer to blow up. I wonder what gives: [ Info: ... simulation initialization complete (7.057 seconds)
[ Info: Executing initial time step...
[ Info: ... initial time step complete (36.179 seconds).
iteration: 1, time: 0.1000, U_max=(1.31e-02, 4.98e-03, 5.51e-03)
iteration: 2, time: 0.2000, U_max=(1.30e-02, 4.96e-03, 5.50e-03)
iteration: 3, time: 0.3000, U_max=(1.39e-02, 5.50e-03, 5.45e-03)
iteration: 4, time: 0.4000, U_max=(NaN, NaN, NaN)
iteration: 5, time: 0.5000, U_max=(NaN, NaN, NaN)
iteration: 6, time: 0.6000, U_max=(NaN, NaN, NaN) In any case, when I don't specify the pressure solver and let default to the FFT-based one, things seem to run fine. So imo it does seem to point to the conjugate gradient solver struggling to find a valid solution close to the immersed boundary. PS: There seems to be a small bug when specifying the conjugate solver without specifying the preconditioner: julia> ConjugateGradientPoissonSolver(grid)
ERROR: UndefVarError: `ImmersedBoundaryGrid` not defined
Stacktrace:
[1] ConjugateGradientPoissonSolver(grid::ImmersedBoundaryGrid{…}; preconditioner::Oceananigans.Solvers.DefaultPreconditioner, reltol::Float64, abstol::Float64, kw::@Kwargs{})
@ Oceananigans.Solvers ~/repos/Oceananigans.jl2/src/Solvers/conjugate_gradient_poisson_solver.jl:54
[2] ConjugateGradientPoissonSolver(grid::ImmersedBoundaryGrid{Float64, Bounded, Bounded, Bounded, RectilinearGrid{…}, GridFittedBottom{…}, Nothing, Nothing, CPU})
@ Oceananigans.Solvers ~/repos/Oceananigans.jl2/src/Solvers/conjugate_gradient_poisson_solver.jl:47
``` |
I tried running a non hydrostatic case with open boundaries and a coastal bathymetry the other week and it also NaNed very quickly with the immersed pressure solver but ran fine ish with FFT. I assumed it was my set up but perhaps there is an issue with how the pressure solver deals with boundary conditions? |
Can you produce some code or an MWE? I'm interested in working on this, but I have no cases that have an issue so there's little I can do to make progress. How is the problem forced? I believe @ali-ramadhan's case is ill-posed as I stated. We can try to test this by using a sponge layer (or perhaps proper open boundary conditions) rather than @inline u_inflow(y, z, t) = 0.01
u_bcs = FieldBoundaryConditions(
west = OpenBoundaryCondition(u_inflow),
east = OpenBoundaryCondition(u_inflow)
) which I don't think will work.
Both solvers treat boundary conditions the same way:
(I can explain more how the math works out if there is interest.) Can you set I also suggest playing with |
I also suggest trying
|
For the record, I did try this yesterday with |
Ok great. How about the same problem (including CG pressure solver) but no immersed boundaries? Something to keep an eye on is the number of iterations the CG solver does. It'd be good to report what is going on with those during / prior to breakup. You can call
This is important; the dependence on time step should be diagnosed. |
I'm drafting a reply to the rest but could the problem be that 0.92 doesn't have #3802 released? |
I don't think I understand how this is ill-posed? It is over specified so will not produce physical results but I thought that without a radiating condition this should still not NaN straight away there should just be a lot of reflections from the boundaries? Here is a simplified version of my code: https://github.com/jagoosw/OpenBoundaries.jl/blob/main/validation/bug_mwe.jl I am using a matching scheme that advects the perturbation component (defined as the boundary adjacent minus imposed velocity) out of the domain, and relaxes to the imposed velocity at different rates depending on if it is an inflow or outflow (for the v component the timescale is Inf for outflows to allow it to maximally transmit information out). I can run it with no matching scheme but it needs tiny timesteps because the noise at the boundaries becomes massive. When I use the default pressure solver it kind of works. There are some bugs, for example, there is this weird jet generation on the southern inflowing boundary. I think these would be solved with relaxing regions. mwe.mp4If I run with the CG solver it NaNs "immediately" and is doing ~800 iterations. If I restrict the iterations it doesn't NaN as fast, but generates very high velocities in a random place: mwe_cg.mp4I think there is also an issue with my model that its not respecting when a boundary adjacent cell is immersed so I'll fix that and get back to you. Perhaps the "immediate" NaNs are actually just from the timestep not being small enough as the reflections and bathymetry interactions make some very high velocities (in my case ~40x higher than the inflows) |
I tried it some more and yes it seems like sometimes it takes a few time steps but it always blows up within the first ~5 iterations.
I tried this too and it also blew up for me.
Ah I should have mentioned this in my original post but I did run from an up-to-date Will try some of @glwagner's suggestions and decrease the time step like @jagoosw did! |
Was curious to have a look but I think the link might be dead? |
I updated the original MWE slightly to use a using Printf
using Statistics
using Oceananigans
using Oceananigans.BoundaryConditions: FlatExtrapolationOpenBoundaryCondition
using Oceananigans.Solvers: ConjugateGradientPoissonSolver, fft_poisson_solver
using Oceananigans.Simulations: NaNChecker
L = 100
H = 100
underlying_grid = RectilinearGrid(
CPU(),
Float64,
topology = (Bounded, Bounded, Bounded),
size = (16, 16, 16),
x = (0, L),
y = (0, L),
z = (-H, 0)
)
h = H/2
w = L/5
mount(x, y) = h * exp(-x^2 / 2w^2) * exp(-y^2 / 2w^2)
bottom(x, y) = -H + mount(x, y)
grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))
@inline u_inflow(y, z, t) = 0.01
u_bcs = FieldBoundaryConditions(
west = OpenBoundaryCondition(u_inflow),
east = FlatExtrapolationOpenBoundaryCondition()
)
boundary_conditions = (; u=u_bcs)
model = NonhydrostaticModel(;
grid,
boundary_conditions,
timestepper = :RungeKutta3,
pressure_solver = ConjugateGradientPoissonSolver(
grid;
preconditioner = fft_poisson_solver(grid.underlying_grid)
)
)
simulation = Simulation(model; Δt=0.01, stop_time=60)
function progress(sim)
model = sim.model
@printf(
"iteration: %d, time: %.4f, U_max=(%.2e, %.2e, %.2e)\n",
iteration(sim),
time(sim),
maximum(abs, model.velocities.u),
maximum(abs, model.velocities.v),
maximum(abs, model.velocities.w)
)
@printf(
" reltol=%.2e, abstol=%.2e, solver iterations: %d, residual: (mean=%.2e, abs(max)=%.2e)\n",
model.pressure_solver.conjugate_gradient_solver.reltol,
model.pressure_solver.conjugate_gradient_solver.abstol,
iteration(model.pressure_solver),
mean(model.pressure_solver.conjugate_gradient_solver.residual),
maximum(abs, model.pressure_solver.conjugate_gradient_solver.residual)
)
end
simulation.callbacks[:progress] = Callback(progress, IterationInterval(1))
nan_checker = NaNChecker(fields=model.velocities, erroring=true)
simulation.callbacks[:nan_checker] = Callback(nan_checker, IterationInterval(1))
run!(simulation) Now I'm seeing that it's maxing out the solver iterations at 4096 and the residual (mean and max) is still like 5 orders of magnitude higher than tolerance.
I tried reducing the time step from 0.1 to 0.01 but now it always blows up at iteration 1. Increasing the time step kept it blowing up after ~5 iterations.
Running with
|
This seems to have worked! But I'm not totally sure why. Like with
I thought it was due to the solver diverging, but if you look at the residual at simulation iteration 1, the residual norm is still high for all 4096 solver iterations. |
Solution looks as expected at 16^3 🎉 3831_16.3.mp4Full script: using Printf
using Statistics
using Oceananigans
using Oceananigans.BoundaryConditions: FlatExtrapolationOpenBoundaryCondition
using Oceananigans.Solvers: ConjugateGradientPoissonSolver, fft_poisson_solver
using Oceananigans.Simulations: NaNChecker
using Oceananigans.Utils: prettytime
L = 100
H = 100
underlying_grid = RectilinearGrid(
CPU(),
Float64,
topology = (Bounded, Bounded, Bounded),
size = (32, 32, 32),
x = (-L/2, L/2),
y = (-L/2, L/2),
z = (-H, 0)
)
h = H/2
w = L/5
mount(x, y) = h * exp(-x^2 / 2w^2) * exp(-y^2 / 2w^2)
bottom(x, y) = -H + mount(x, y)
grid = ImmersedBoundaryGrid(underlying_grid, GridFittedBottom(bottom))
@inline u_inflow(y, z, t) = 0.01
u_bcs = FieldBoundaryConditions(
west = OpenBoundaryCondition(u_inflow),
east = FlatExtrapolationOpenBoundaryCondition()
)
boundary_conditions = (; u=u_bcs)
model = NonhydrostaticModel(;
grid,
boundary_conditions,
timestepper = :RungeKutta3,
pressure_solver = ConjugateGradientPoissonSolver(
grid;
preconditioner = fft_poisson_solver(grid.underlying_grid),
maxiter = 100
)
)
simulation = Simulation(model; Δt=0.1, stop_time=600)
function progress(sim)
model = sim.model
@printf(
"iteration: %d, time: %.4f, U_max=(%.2e, %.2e, %.2e)\n",
iteration(sim),
time(sim),
maximum(abs, model.velocities.u),
maximum(abs, model.velocities.v),
maximum(abs, model.velocities.w)
)
@printf(
" reltol=%.2e, abstol=%.2e, solver iterations: %d, residual: (mean=%.2e, abs(max)=%.2e)\n",
model.pressure_solver.conjugate_gradient_solver.reltol,
model.pressure_solver.conjugate_gradient_solver.abstol,
iteration(model.pressure_solver),
mean(model.pressure_solver.conjugate_gradient_solver.residual),
maximum(abs, model.pressure_solver.conjugate_gradient_solver.residual)
)
end
simulation.callbacks[:progress] = Callback(progress, IterationInterval(1))
nan_checker = NaNChecker(fields=model.velocities, erroring=true)
simulation.callbacks[:nan_checker] = Callback(nan_checker, IterationInterval(1))
simulation.output_writers[:fields] =
JLD2OutputWriter(model, model.velocities; filename="3831.jld2", schedule=TimeInterval(10), overwrite_existing=true)
run!(simulation)
using CairoMakie
ds = FieldDataset("3831.jld2")
times = ds["u"].times
xu, yu, zu = nodes(ds["u"][1])
n = Observable(1)
fig = Figure(size=(1000, 500))
title = @lift @sprintf("time = %s", prettytime(times[$n]))
u_surface = @lift interior(ds["u"][$n], :, :, 16)
u_slice = @lift interior(ds["u"][$n], :, 8, :)
ax1 = Axis(fig[1, 1]; title = "u (surface)", xlabel="x", ylabel="y")
hm1 = heatmap!(ax1, xu, yu, u_surface, colorrange=(-0.01, 0.01), colormap=:balance)
Colorbar(fig[1, 2], hm1, label="m/s")
ax2 = Axis(fig[1, 3]; title = "u (xz-slice)", xlabel="x", ylabel="z")
hm2 = heatmap!(ax2, xu, yu, u_slice, colorrange=(-0.01, 0.01), colormap=:balance)
Colorbar(fig[1, 4], hm2, label="m/s")
fig[0, :] = Label(fig, title)
record(fig, "3831.mp4", 1:length(times), framerate=10) do i
n[] = i
end |
If the FFT solver generates a stable solution but the CG solver diverges, then capping the iterations at a lowish number prevents the solution from diverging too far from the "decent" FFT solution and blowing up the simulation.
Can you rephrase this, I don't grasp what you're trying to say. Don't we expect the residual to be large or increasing if the solver is diverging? |
By "ill-posed" I was referring to the over-specification. But if you say that we should find a solution / the problem should not blow up then I agree, "ill-posed" is the wrong word... @ali-ramadhan this could be tested by running the problem without immersed boundaries but still in a scenario in which a non-trivial flow is generated. Also curious whether the successful simulation you produced above also works with |
I forgot to make the repo public, it should work now |
I also want to drop another optimization that can be used here --- using a using Oceananigans.Grids: with_number_type
reduced_precision_grid = with_number_type(Float32, underlying_grid)
pressure_solver = ConjugateGradientPoissonSolver(grid;
preconditioner = fft_poisson_solver(reduced_precision_grid),
maxiter = 10
) I also suggest plotting the divergence to get a handle on how the solver is working. A well-converged solution seems to have isotropically-distributed divergence errors. With looser tolerances, the divergence may be concentrated around the bathymetry (and of course with the naive FFT solver it has a thin layer adjacent to the bathymetry). I also think it would be nice if setting |
What if you restrict to 3-5 iterations? |
Ok I did a bit of investigation and found the following (script pasted below):
I'd like to analyze the divergence more, but it seems that it is substantially higher with From these results we see that Finally I would like to point out that there is probably little to gain, from a computational point of view, in using the FFT-based solver with 100+ CG iterations. If we are using 100+ iterations, we might as well use a faster preconditioner. My script adapted from @ali-ramadhan's: using Printf
using Statistics
using Oceananigans
using Oceananigans.Grids: with_number_type
using Oceananigans.BoundaryConditions: FlatExtrapolationOpenBoundaryCondition
using Oceananigans.Solvers: ConjugateGradientPoissonSolver, fft_poisson_solver
using Oceananigans.Utils: prettytime
N = 16
h, w = 50, 20
H, L = 100, 100
x = y = (-L/2, L/2)
z = (-H, 0)
grid = RectilinearGrid(size=(N, N, N); x, y, z, halo=(2, 2, 2), topology=(Bounded, Periodic, Bounded))
mount(x, y=0) = h * exp(-(x^2 + y^2) / 2w^2)
bottom(x, y=0) = -H + mount(x, y)
grid = ImmersedBoundaryGrid(grid, GridFittedBottom(bottom))
prescribed_flow = OpenBoundaryCondition(0.01)
extrapolation_bc = FlatExtrapolationOpenBoundaryCondition()
u_bcs = FieldBoundaryConditions(west = prescribed_flow,
east = extrapolation_bc)
#east = prescribed_flow)
boundary_conditions = (; u=u_bcs)
reduced_precision_grid = with_number_type(Float32, grid.underlying_grid)
preconditioner = fft_poisson_solver(reduced_precision_grid)
pressure_solver = ConjugateGradientPoissonSolver(grid; preconditioner, maxiter=20)
model = NonhydrostaticModel(; grid, boundary_conditions, pressure_solver)
simulation = Simulation(model; Δt=0.1, stop_iteration=1000)
conjure_time_step_wizard!(simulation, cfl=0.5)
u, v, w = model.velocities
δ = ∂x(u) + ∂y(v) + ∂z(w)
function progress(sim)
model = sim.model
u, v, w = model.velocities
@printf("Iter: %d, time: %.1f, Δt: %.2e, max|δ|: %.2e",
iteration(sim), time(sim), sim.Δt, maximum(abs, δ))
r = model.pressure_solver.conjugate_gradient_solver.residual
@printf(", solver iterations: %d, max|r|: %.2e\n",
iteration(model.pressure_solver), maximum(abs, r))
end
add_callback!(simulation, progress)
simulation.output_writers[:fields] =
JLD2OutputWriter(model, model.velocities; filename="3831.jld2", schedule=IterationInterval(10), overwrite_existing=true)
run!(simulation)
using GLMakie
ds = FieldDataset("3831.jld2")
fig = Figure(size=(1000, 500))
n = Observable(1)
times = ds["u"].times
title = @lift @sprintf("time = %s", prettytime(times[$n]))
Nx, Ny, Nz = size(grid)
j = round(Int, Ny/2)
k = round(Int, Nz/2)
u_surface = @lift view(ds["u"][$n], :, :, k)
u_slice = @lift view(ds["u"][$n], :, j, :)
ax1 = Axis(fig[1, 1]; title = "u (xy)", xlabel="x", ylabel="y")
hm1 = heatmap!(ax1, u_surface, colorrange=(-0.01, 0.01), colormap=:balance)
Colorbar(fig[1, 2], hm1, label="m/s")
ax2 = Axis(fig[1, 3]; title = "u (xz)", xlabel="x", ylabel="z")
hm2 = heatmap!(ax2, u_slice, colorrange=(-0.01, 0.01), colormap=:balance)
Colorbar(fig[1, 4], hm2, label="m/s")
fig[0, :] = Label(fig, title)
record(fig, "3831.mp4", 1:length(times), framerate=10) do i
n[] = i
end |
I encountered similar issues months ago when using the CG Poisson solver. I fixed my code by slightly modifying the algorithm (Yixiao-Zhang@c7983b8/\). The reason is that the CG method can be numerically unstably for positive (or negative) semi-definite matrices. These changes have not been merged into the main repository, and we are stilling discussing the math (#3802). |
Did you encounter the same issue whereby the simulation would immediately NaN (rather than intermittently)? I'd be curious to see your setup in order to have more than one working example to test with. I've made a little progress with regularizing the solver (vs the simpler technique of setting the mean directly). The issue is that while regularization does seem to be able to stabilize the solver, we still experience extremely slow convergence with It's unclear whether these issues are generic to the solver or specific to this boundary condition, so having another unstable case would be useful. |
In my original simulations, I encountered NaNs usually after several hours in wall time, so it was bad for debugging. Luckily, I found a reliable way to get NaNs immediately is to set both My theory is that the current CG iteration solver is numerically unstable. The residual usually decreases quickly for the first several iterations but may increase after that. That is why using a larger |
I think this makes sense. When the residual is reduced to near machine precision then I think this is when the present instability is exposed, which occurs when the search direction is essentially a constant. I wonder if its possible that the instability was observed in the original simulations when, for some random reason of the flow configuration, the CG solution converged especially fast (thereby exposing the instability). |
After some of the work on #3848 I have suspicions that Note that @jm-c made a comment that the MITgcm CG solver also has occasionally issues for "very smooth" problems. I think that issue is related to what @Yixiao-Zhang and @xkykai observed (possibly, because the FFT preconditioner is so effective, we observe this problem more often than MITgcm users do). On the other hand, the difficulty solving problems with |
Can I understand it like this: with a fluid at rest the shock of enabling open boundaries at iteration 0, which has effectively zero timescale, creates a difficult Poisson problem? Once the simulation is chugging along, it seems like the Poisson problem is not as difficult. I'll close this issue since the original MWE works now. It's just a matter of setting |
It does seem like the solution becomes more accurate in time (but unfortunately, the CG solver does not start to converge more quickly...) The issue is interestingly related to this comment in MITgcm (which may be wrong by the way, but nevertheless reflects some observation about the CG solver): For this MWE, the RHS is 0 with
I think this comment is true in a practical sense. Possibly we do need another issue about the non-convergence of the CG solver. I think there is scope for someone to put in some effort to document this (eg come up with tools to evaluate convergence) and present a few nice MWEs (I think in addition to the present one with The way that the Poisson solver and open boundary conditions are intertwined is also a good reason to implement new open boundary condition matching schemes directly in Oceananigans @jagoosw . Note that we are also thinking about how to develop an interface to |
Just for the record, I tried running a simulation with an IBG and I got almost-immediate NaNs for pretty much every value of |
Does it work with the plain FFT (eg no CG iteration)? |
Yes, but unfortunately I haven't been able to translate this behavior to a simple MWE yet. |
I tried to run the headland case with CG but never got it to not blow up |
I'm trying to set up a small test where a zonal velocity forced by open boundary conditions goes around an immersed sea mount, using the new immersed pressure solver.
Unfortunately I get immediate NaNs on iteration 1. Maybe there's a bug somewhere? I know I'm using a bunch of experimental features together so perhaps this is not surprising. My setup could also be bad though.
Curious if anyone has any insights on what's going wrong here.
Does the pressure solver need to be modified to account for non-zero velocities at the boundaries? I guess the FFT pressure solver assumes either periodic or no-penetration at the boundaries, but then shouldn't the conjugate gradient solver converge on the correct pressure with enough iterations? Or maybe not if the pre-conditioner is very wrong?
"MWE" setup:
Output:
cc @tomchor
The text was updated successfully, but these errors were encountered: