-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
rdma: support NCCL multi-recv interface #348
Conversation
7c89b7f
to
349a396
Compare
@@ -59,7 +59,8 @@ extern "C" { | |||
#define MIN_TAG_BITS_FOR_RING_ID (32 + 1) | |||
|
|||
/* Maximum number of grouped receives */ | |||
#define NCCL_OFI_MAX_RECVS 1 | |||
#define NCCL_OFI_MAX_RECVS 8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if we're going to break out the two protocols (which makes sense), then we should move these into their corresponding header files. And we really shouldn't leave something named as NCCL_OFI_MAX_RECVS
when it's only sometimes right.
destination buffer */ | ||
typedef struct nccl_net_ofi_rdma_ctrl_msg { | ||
typedef struct nccl_net_ofi_rdma_ctrl_msg_entry { | ||
int multi_recv_tag; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multirecv tag should go at the end for padding reasons.
@@ -25,7 +26,7 @@ nccl_ofi_msgbuff_t *nccl_ofi_msgbuff_init(uint16_t buffer_size) | |||
goto error; | |||
} | |||
msgbuff->buff_size = buffer_size; | |||
if (!(msgbuff->buff = malloc(sizeof(nccl_ofi_msgbuff_elem_t)*buffer_size))) { | |||
if (!(msgbuff->buff = calloc((4*buffer_size), sizeof(nccl_ofi_msgbuff_elem_t)))) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what is this magic constant?
NCCL_OFI_WARN("Error locking mutex"); | ||
return NCCL_OFI_MSGBUFF_ERROR; | ||
} | ||
NCCL_OFI_WARN("Error locking mutex"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
own commit.
nccl_ofi_msgbuff_result_t ret = NCCL_OFI_MSGBUFF_ERROR; | ||
|
||
*msg_idx_status = nccl_ofi_msgbuff_get_idx_status(msgbuff, msg_index); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this seems unrelated to multirecv; more details on this change would be good.
In the current implementation of connect/accept, it is possible for `accept` to complete (i.e., return a non-NULL communicator) after the corresponding `connect` returned a NULL communicator (while waiting for a completion for the connection message). This is a strange semantic, and evidently causes NCCL to be unhappy, particularly in the multi-recv case (which is being added in a future commit). So, after sending the connect message, defer waiting for completion; block when closing the send comm if necessary. Signed-off-by: Eric Raut <[email protected]>
These tests do not support multi-recv. Signed-off-by: Eric Raut <[email protected]>
349a396
to
318940b
Compare
Addressed the major points of Brian's review: splitting into more commits and removing abstraction-breaking changes from msgbuff. Some stylistic things are still WIP. Also rebased on master. |
The previous implementation of the ring unit test posts sends before receives, which is incompatible with the current multi-recv behavior of rejecting sends until ctrl information is avaiable. Signed-off-by: Eric Raut <[email protected]>
msgbuff functions accept tag and multi-recv information. RDMA protocol code is updated to pass dummy values for these fields. Multi-recv support for RDMA protocol will be an upcoming commit. Signed-off-by: Eric Raut <[email protected]>
The multi-recv interface allows aggregating up to 8 receive requests in a single request. This commit does not yet advertise support for multi-recv to NCCL. * Temporarily disables eager; it will be re-enabled in a future commit. Signed-off-by: Eric Raut <[email protected]>
RDMA protocol will now support up to 8 multi-recv buffers at a time. Signed-off-by: Eric Raut <[email protected]>
318940b
to
5fddb2e
Compare
"Fixed" hang in ring unit test: 9207e74 |
Please merge master. |
Going to close this PR for now. This branch will need to be rebased, and we need to address the perf issues here before adding this feature. |
The multi-recv interface allows aggregating up to 8 receive requests in a single request. The changes include msgbuff changes to be tag-aware.
connect()
blocking; we need to better understand why this is necessary.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.