Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Multigrid MPI Bug Fix #966

Merged
merged 8 commits into from
Nov 7, 2023

Conversation

lroberts36
Copy link
Collaborator

@lroberts36 lroberts36 commented Oct 23, 2023

PR Summary

It turns out I never bothered to actually test MG with MPI. This PR makes changes to the MG PR for MG to work correctly with MPI. First, it makes some small changes so that empty MeshData objects don't cause errors and adds MPI communicators for GMG fields that are involved in restriction. Second, and more importantly, it corrects a bug in the task list that allowed certain parts of composite GMG levels to run ahead of coarser levels in the hierarchy. This resulted in boundary data not being up to date at coarse-fine boundaries when any rank only owned coarse blocks on a composite grid. The multi-grid task list now includes task list dependencies and regional dependencies to ensure that the coarser levels of the hierarchy have finished before the post prolongation steps take place on finer two-level composite grids. To ensure the correct ordering within task lists, AddMultiGridTasksLevel is now called recursively. I did not need to add extra buffers, which I had mentioned as a possible solution in the Parthenon sync last Thursday.

Once this is reviewed, I will merge it into the main multi-grid PR and then merge that PR into develop.

PR Checklist

  • Code passes cpplint
  • New features are documented.
  • CI has been triggered on Darwin for performance regression tests.

Copy link
Collaborator

@Yurlungur Yurlungur left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

example/poisson_gmg/poisson_package.hpp Show resolved Hide resolved
Copy link
Collaborator

@pgrete pgrete left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM (just two clarification comments/questions)

Apart from the bugfix, I like that we don't rely on implicit block[0] info for MeshData any more (this ties MeshData closer to the Mesh than to the MeshBlock.

example/poisson_gmg/poisson_driver.cpp Outdated Show resolved Hide resolved
Comment on lines +307 to 314
int mg_reg_dep_id = 0;
TaskRegion &region = tc.AddRegion(num_partitions);
for (int i = 0; i < num_partitions; ++i) {
TaskList &tl = region[i];
AddMultiGridTasksLevel(region, tl, none, i, mg_reg_dep_id, max_level, min_level,
max_level);
}

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just double checking: mg_reg_dep_id is passed in as reference and then incremented in that function so that for multiple partition, the same logical function are being put in different regions, correct?
And is that also the indented behavior (rather than having each all blocks in each partition being logically in the same region so that all blocks on the mesh progress region by region)?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had just copied what was done in some other code for this where the reg_dep_id was initialized outside of the loop over partitions and then incremented for each partition. It is not clear to me how the regional dependence ids work under the hood, so maybe it is not correct as is. The desired behavior is to have "all blocks in each partition being logically in the same region so that all blocks on the mesh progress region by region".

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In that case, I suggest to defer the question to @jdolence who IIRC wrote the regional dependencies so that we don't run into unexpected issues with pack_size != 1 or != -1

@lroberts36 lroberts36 mentioned this pull request Nov 7, 2023
8 tasks
@lroberts36 lroberts36 merged commit 45a6b27 into lroberts36/add-multi-grid Nov 7, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants