group: refactor MPIR_Group #7235
base: main
This test requires access to MPICH internals, so it will not be used with the current design.
We no longer use this file.
Hide the internal fields of MPIR_Group from unnecessary access. Code outside group_util.c and group_impl.c only needs to assume the MPIR_Lpid integer type, the creation routines based on an lpid map or an lpid stride description, and the access routine that looks up the lpid from a group rank.
For most external usages, we only need MPIR_Group_rank_to_lpid.
5d4843d to 576d5c7
Avoid accessing group internal fields.
Group similar functions together to facilitate refactoring. There are no changes in this commit other than moving functions around. The 4 incl/excl functions are very similar. The 3 difference/intersection/union functions are very similar.
Use MPIR_Group_{rank_to_lpid,lpid_to_rank} to avoid directly accessing MPIR_Group internal fields. For most group creation routines, just populate an lpid lookup map and call MPIR_Group_create_map to create the group.
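As a rough illustration of the accessor-based design described in this commit message, here is a minimal sketch of a map-backed group. Every name in it (toy_lpid_t, toy_group_t, toy_group_create_map and so on) is hypothetical and only stands in for the MPICH types and routines; the real signatures and fields are not shown on this page and may differ.

```c
#include <assert.h>
#include <stdlib.h>

typedef long toy_lpid_t;        /* assumed stand-in for the MPIR_Lpid integer type */

typedef struct {
    int size;                   /* number of processes in the group */
    int rank;                   /* rank of the calling process in the group */
    toy_lpid_t *map;            /* map[r] is the lpid of group rank r */
} toy_group_t;

/* Create a group from an lpid lookup map. The group takes ownership of map. */
toy_group_t *toy_group_create_map(int size, int rank, toy_lpid_t *map)
{
    toy_group_t *g = malloc(sizeof(*g));
    if (g == NULL)
        return NULL;
    g->size = size;
    g->rank = rank;
    g->map = map;
    return g;
}

/* Callers use this accessor instead of reading g->map directly. */
toy_lpid_t toy_group_rank_to_lpid(const toy_group_t *g, int r)
{
    assert(r >= 0 && r < g->size);
    return g->map[r];
}

/* Reverse lookup by linear search; returns -1 if lpid is not in the group. */
int toy_group_lpid_to_rank(const toy_group_t *g, toy_lpid_t lpid)
{
    for (int r = 0; r < g->size; r++) {
        if (g->map[r] == lpid)
            return r;
    }
    return -1;
}

/* Free the group together with the map it owns. */
void toy_group_free(toy_group_t *g)
{
    free(g->map);
    free(g);
}
```

A caller would fill an array of lpids, pass it to toy_group_create_map, and from then on go through the two lookup functions only.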
* add option to use stride to describe group composition
* remove the linked list design
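To make the stride description concrete, the sketch below shows one way such a lookup could work. It assumes a layout of blocks of `blocksize` consecutive lpids whose starting points are `stride` apart, beginning at `offset`, with blocksize <= stride so blocks do not overlap. The struct and function names are hypothetical, not the MPICH ones.

```c
#include <assert.h>

typedef long sketch_lpid_t;     /* assumed stand-in for MPIR_Lpid */

/* Hypothetical strided description of a group's lpids. */
typedef struct {
    sketch_lpid_t offset;       /* lpid of group rank 0 */
    sketch_lpid_t stride;       /* distance between consecutive block starts */
    sketch_lpid_t blocksize;    /* consecutive lpids per block */
    int size;                   /* number of ranks in the group */
} sketch_stride_t;

/* rank -> lpid: pick the block, then the position inside the block. */
sketch_lpid_t sketch_stride_rank_to_lpid(const sketch_stride_t *s, int rank)
{
    assert(rank >= 0 && rank < s->size);
    sketch_lpid_t i_blk = rank / s->blocksize;
    sketch_lpid_t r_blk = rank % s->blocksize;
    return s->offset + i_blk * s->stride + r_blk;
}

/* lpid -> rank: invert the formula; returns -1 if lpid is not in the group. */
int sketch_stride_lpid_to_rank(const sketch_stride_t *s, sketch_lpid_t lpid)
{
    if (lpid < s->offset)
        return -1;
    sketch_lpid_t d = lpid - s->offset;
    sketch_lpid_t i_blk = d / s->stride;
    sketch_lpid_t r_blk = d % s->stride;
    if (r_blk >= s->blocksize)
        return -1;              /* lpid falls in the gap between two blocks */
    sketch_lpid_t rank = i_blk * s->blocksize + r_blk;
    return rank < s->size ? (int) rank : -1;
}
```

With offset 2, stride 4, blocksize 2 and size 6, this describes the lpids 2, 3, 6, 7, 10, 11 in constant space, where a map would need one entry per rank.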
test:mpich/ch3/most
Looks good overall. The reuse of map across groups needs some attention.
@@ -103,25 +94,25 @@ int MPIR_Group_create_map(int size, int rank, MPIR_Session * session_ptr, MPIR_L
/* See 5.3.2, Group Constructors. For many group routines,
* the standard explicitly says to return MPI_GROUP_EMPTY;
* for others it is implied */
MPL_free(map);
*new_group_ptr = MPIR_Group_empty;
goto fn_exit;
This goto can be removed since the creation logic is in the else branch now.
* the standard explicitly says to return MPI_GROUP_EMPTY;
* for others it is implied */
*new_group_ptr = MPIR_Group_empty;
goto fn_exit;
goto can be removed.
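The two goto remarks above come down to the same control-flow point: once the creation logic sits in an else branch, the empty-group branch falls through to the exit label on its own. A minimal sketch of that shape follows; the types, names and error code are hypothetical, and the actual MPICH function body is only partly visible in the hunks.

```c
#include <stdlib.h>

typedef struct { int size; } demo_group_t;   /* hypothetical stand-in for MPIR_Group */

demo_group_t demo_group_empty = { 0 };       /* hypothetical stand-in for MPIR_Group_empty */

int demo_group_create(int size, demo_group_t **new_group_ptr)
{
    int err = 0;

    if (size == 0) {
        /* Empty case: return the shared empty group. No goto is needed here,
         * because all creation work lives in the else branch. */
        *new_group_ptr = &demo_group_empty;
    } else {
        demo_group_t *g = malloc(sizeof(*g));
        if (g == NULL) {
            err = 1;            /* hypothetical error code */
            goto fn_fail;
        }
        g->size = size;
        *new_group_ptr = g;
    }

  fn_exit:
    return err;
  fn_fail:
    *new_group_ptr = NULL;
    goto fn_exit;
}
```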
newgrp->rank = rank;
MPIR_Group_set_session_ptr(newgrp, session_ptr);
newgrp->pmap.use_map = true;
newgrp->pmap.u.map = map;
Is this shallow copy of map safe? Can MPICH code free a map before all groups using it are freed? We should probably refcount the map to safely reuse it across multiple MPIR_Group objects.
The map is owned by the group. The map is never shared. On the other hand, the MPIR_Group can be shared and it is reference counted.
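The ownership model in this reply can be sketched as follows: the map pointer is handed to exactly one group at creation, sharing happens only at the group level through a reference count, and the map is freed when the last group reference is released. All names here are hypothetical and the sketch ignores thread safety; it is not the MPICH implementation.

```c
#include <stdlib.h>

typedef long own_lpid_t;        /* assumed stand-in for MPIR_Lpid */

typedef struct {
    int ref_count;              /* the group is shared and reference counted */
    int size;
    own_lpid_t *map;            /* owned by this group only, never shared */
} own_group_t;

/* Takes ownership of map: the caller must not free or reuse it afterwards. */
own_group_t *own_group_create(int size, own_lpid_t *map)
{
    own_group_t *g = malloc(sizeof(*g));
    if (g == NULL)
        return NULL;
    g->ref_count = 1;
    g->size = size;
    g->map = map;
    return g;
}

/* Share the group (not the map) with another user. */
void own_group_add_ref(own_group_t *g)
{
    g->ref_count++;
}

/* Drop one reference. Returns 1 if the group and its map were freed, else 0. */
int own_group_release(own_group_t *g)
{
    if (--g->ref_count > 0)
        return 0;
    free(g->map);
    free(g);
    return 1;
}
```

Under this scheme the shallow copy of the pointer is safe as long as no caller keeps using the map after passing it in.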
newgrp->rank = rank;
MPIR_Group_set_session_ptr(newgrp, session_ptr);
newgrp->pmap.use_map = false;
It is probably a good idea to add MPIR_Assert(stride > 0 && blocksize > 0); as a defense against incorrect usage of the stride, since there are a few places where these values are hard-coded.
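A small sketch of the suggested guard, using the standard assert in place of MPIR_Assert and hypothetical names throughout. The motivation assumed here is that a rank lookup divides by blocksize, so a zero blocksize would divide by zero, and a non-positive stride would make blocks overlap or run backwards.

```c
#include <assert.h>
#include <stdbool.h>

typedef long chk_lpid_t;        /* assumed stand-in for MPIR_Lpid */

typedef struct {
    chk_lpid_t offset;
    chk_lpid_t stride;
    chk_lpid_t blocksize;
} chk_stride_t;

/* The condition proposed in the review comment, as a reusable predicate. */
bool chk_stride_args_ok(chk_lpid_t stride, chk_lpid_t blocksize)
{
    return stride > 0 && blocksize > 0;
}

/* Initialize a stride description, rejecting invalid parameters early. */
void chk_stride_init(chk_stride_t *s, chk_lpid_t offset,
                     chk_lpid_t stride, chk_lpid_t blocksize)
{
    /* Catch bad hard-coded values at creation time rather than at lookup time. */
    assert(chk_stride_args_ok(stride, blocksize));
    s->offset = offset;
    s->stride = stride;
    s->blocksize = blocksize;
}
```

For example, a contiguous range starting at lpid 0 could be described as chk_stride_init(&s, 0, 1, 1), which passes the check.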
for (int i = 0; i < size; i++) {
newgrp->lrank_to_lpid[i].lpid = map[i];
/* TODO: build hash to accelerate MPIR_Group_lpid_to_rank */
We should add a TODO for an optimization that checks whether a map is actually a stride and converts it to a stride. Since all MPIR_Group calculations internally create a map, we will probably end up with most groups using a map.
I had this done already. It may be in the later PR. I'll find it and move it here.
I just saw that in #7240 :)
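For readers following this thread, here is one possible way to test whether a map is really a strided pattern. It is an illustrative sketch only and is not the code from #7240; the function name, the lpid typedef and the block layout (blocks of consecutive lpids, block starts a fixed stride apart) are assumptions.

```c
#include <stdbool.h>

typedef long det_lpid_t;        /* assumed stand-in for MPIR_Lpid */

/* Returns true and fills offset/stride/blocksize if, for every i,
 * map[i] == offset + (i / blocksize) * stride + (i % blocksize). */
bool det_map_is_stride(const det_lpid_t *map, int size, det_lpid_t *offset,
                       det_lpid_t *stride, det_lpid_t *blocksize)
{
    if (size <= 0)
        return false;

    /* blocksize: length of the leading run of consecutive lpids */
    int bs = 1;
    while (bs < size && map[bs] == map[bs - 1] + 1)
        bs++;

    /* stride: distance between the first two block starts; if the whole
     * map is a single block, any stride >= bs works, so use bs */
    det_lpid_t st = (bs < size) ? map[bs] - map[0] : bs;
    if (st < bs)
        return false;           /* decreasing or overlapping blocks */

    /* verify every entry against the candidate description */
    for (int i = 0; i < size; i++) {
        if (map[i] != map[0] + (det_lpid_t) (i / bs) * st + (i % bs))
            return false;
    }

    *offset = map[0];
    *stride = st;
    *blocksize = bs;
    return true;
}
```

The check is a single linear pass, so running it once at group creation costs no more than copying the map.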
Pull Request Description
Hide the internal fields of MPIR_Group from unnecessary access.
Code outside group_util.c and group_impl.c only needs to assume the MPIR_Lpid
integer type, the creation routines based on an lpid map or an lpid stride
description, and the access routine that looks up the lpid from a group rank.
Add feature to use stride to describe group composition
Remove the linked list design in MPIR_Group_pmap_t
[skip warnings]
Plan
MPIR_Group so it can be memory-efficient (strided rank map) -- this PR
MPIR_Group rather than the other way around
lpid to be device independent, and device layer perform address exchange upon communicator creation.
MPI_COMM_WORLD.
MPIR_Group to represent MPIR_Pset
Author Checklist
Particularly focus on why, not what. Reference background, issues, test failures, xfail entries, etc.
Commits are self-contained and do not do two things at once.
Commit message is of the form:
module: short description
Commit message explains what's in the commit.
Whitespace checker. Warnings test. Additional tests via comments.
For non-Argonne authors, check contribution agreement.
If necessary, request an explicit comment from your company's PR approval manager.