prov/cxi: Update main for CXI provider #9919
In addition to the comments below, there is no need to include fi_cxi.7, since it will be automatically updated from fi_cxi.7.md.
Looking through the commits, some are work in progress, some fix issues introduced by a previous commit, and some address internal reviews of previous ones. I would expect such commits to be squashed so that each commit can stand alone. Also, the commit titles starting with 'SSHOTPLAT' need to be changed; those acronyms are unknown to this project.
See previous comments.
Resolved the SSHOTPLAT issue. I didn't catch that.
Understood. I'm trying to cherry-pick from the internal branch to migrate these to ofiwg. I'm still working with the team on a process for moving the development directly to ofiwg as it makes sense. I think that this is a rather natural set of commits, and one that we would observe from most providers if they were analyzed over a period of time such as captured by this PR (2 months of work). The fact that the PR is this large is my fault. I've been swamped with other things and I haven't had time to update it weekly as planned.
dae9a68 to 966c2ca
Signed-off-by: James Swaro <[email protected]>
EP objects will be able to support different EP protocols. Currently only the existing Portals SAS implementation is supported: FI_PROTO_CXI. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit b991fd4)
This refactors EP object ctrl elements related to side-band messaging and MR into its own structure. While this information is exclusively accessed for standard EP, it will be owned by the SEP (where MR are bound to the SEP) and shared among TX/RX contexts. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 28c0faa)
No functional changes; refactors code to have ep_obj reference the txc and rxc via a pointer. This will allow an ep_obj to support multiple context specializations that implement different endpoint protocols. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 87c50c8)
NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 80f01f7)
NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 4f1dcf9)
This commit does not alter functionality; it refactors the existing default RXC context into a common base and a protocol-specific derived object. The default protocol is FI_PROTO_CXI, which is implemented by the rxc_hpc derived object. It implements an HPC-capable SAS protocol with unexpected messages buffered at the target, and requires a Portals flow control implementation. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 1552c80)
This commit does not alter functionality; it refactors the existing default TXC context into a common base and a protocol-specific derived object. The default protocol is FI_PROTO_CXI, which is implemented by the txc_hpc derived object. It implements an HPC-capable SAS protocol with unexpected messages buffered at the target and includes rendezvous messaging. It requires a Portals flow control implementation. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit fb15795)
Refactor so that context allocation is not entangled with EP object initialization. This will allow for contexts to do specialized initialization of structure at calloc. No functional difference. NETCASSINI-5662 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit f368168)
Allocation of a TXC/RXC will allocate and initialize the appropriate derived context object. Context initialization is no longer entangled with EP object initialization. Introduces the concept of TXC/RXC ops functions that execute derived-object-specific code. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 5a5661e)
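For readers following the refactor, the base/derived split with an ops table described in the commits above can be sketched roughly as follows. This is only an illustration: every name here (`sketch_txc`, `sketch_txc_ops`, `sketch_txc_hpc`, the `rdzv_pending` field) is hypothetical and not taken from the provider source, and the real structures carry far more state.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

struct sketch_txc;

/* Per-protocol operations supplied by the derived object (hypothetical). */
struct sketch_txc_ops {
	int (*progress)(struct sketch_txc *txc);
	int (*cancel)(struct sketch_txc *txc);	/* NULL if not supported */
};

/* Common base shared by every protocol specialization. */
struct sketch_txc {
	const struct sketch_txc_ops *ops;
	int progressed;
};

/* Derived object. The base is the first member, so a pointer to the
 * base can be converted back to a pointer to the derived type. */
struct sketch_txc_hpc {
	struct sketch_txc base;
	int rdzv_pending;	/* stand-in for HPC-only state */
};

static int sketch_hpc_progress(struct sketch_txc *txc)
{
	struct sketch_txc_hpc *hpc = (struct sketch_txc_hpc *)txc;

	hpc->rdzv_pending = 0;	/* protocol-specific work would go here */
	txc->progressed++;
	return 0;
}

static const struct sketch_txc_ops sketch_hpc_ops = {
	.progress = sketch_hpc_progress,
	.cancel = NULL,
};

/* The derived object initializes only what it needs, then installs its ops. */
static void sketch_txc_hpc_init(struct sketch_txc_hpc *hpc)
{
	hpc->base.ops = &sketch_hpc_ops;
	hpc->base.progressed = 0;
	hpc->rdzv_pending = 1;
}

/* Common code dispatches through the ops table without knowing the protocol. */
static int sketch_txc_progress(struct sketch_txc *txc)
{
	assert(txc && txc->ops && txc->ops->progress);
	return txc->ops->progress(txc);
}

static int sketch_txc_cancel(struct sketch_txc *txc)
{
	if (!txc->ops->cancel)
		return -ENOSYS;	/* operation not implemented by this protocol */
	return txc->ops->cancel(txc);
}
```

A second protocol would add its own derived structure and ops table, and the common `sketch_txc_progress()` / `sketch_txc_cancel()` entry points would not need to change.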
Refactor context initialization to make derived object initialize only what it needs. For example overflow and request buffers are only required for HPC derived object. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit f52389e)
Refactor context disable to call into derived object for cleanup if operation is supported. No new functionality is added; HPC messaging specific cleanup is moved to helper operation. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 5482af1)
Refactors code to allow a derived context to implement protocol-specific progress. This will allow future protocols with different progress demands to avoid impacting existing protocols. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 24ea37c)
Allow RXC/TXC specific cancel functions. This will allow the client/server object to support TX cancel when implemented. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 581a68f)
Add RXC op to implement a control messaging callback which can override processing of control messaging events. This allows a context protocol to implement a specific side-band messaging implementation. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 95aa467)
Refactor code to allow derived RXC/TXC to have unique respective recv_common and send_common functionality. Future protocol will integrate seamlessly into API flow. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 7cd4f60)
Move HPC specific protocol code to new file cxip_msg_hpc.c while leaving common protocol code in cxip_msg.c. This only refactors the code. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit d292b5b)
Adds the file cxi/src/cxip_msg_cs.c NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit fe0d962)
Return fi_info for the new protocol; the protocol must be explicitly requested if hints are passed. Note that if FI_CXI_COMPAT=2, only old constants are used and the new protocol is not present. Update/add unit tests to validate fi_info and selection. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 550516f)
Add initial FI_PROTO_CXI_CS derived rxc/txc structure initialization and man page update. NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 1d10c13)
NETCASSINI-5652 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit c6516da)
VNI is provided by the WLM, and must be provided in the multicast creation command. Replace json_fmt static const string with inline string. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit db3e832)
FM REST api now uses a Bearer token, not x-xenon-auth-token. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit d09cfb6)
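As a rough illustration of the header change, a bearer token is normally carried in the standard HTTP `Authorization` header. The helper below is a hypothetical sketch (the name `sketch_bearer_header` is not from the provider) that only formats the header line; how the provider actually builds and attaches it is not shown in this PR text.

```c
#include <stddef.h>
#include <stdio.h>

/* Format "Authorization: Bearer <token>" into buf.
 * Returns the header length on success, or -1 if the token is missing
 * or the buffer is too small to hold the whole header. */
static int sketch_bearer_header(char *buf, size_t len, const char *token)
{
	int n;

	if (!token || token[0] == '\0')
		return -1;

	n = snprintf(buf, len, "Authorization: Bearer %s", token);
	if (n < 0 || (size_t)n >= len)
		return -1;	/* truncated or encoding error */

	return n;
}
```

With libcurl, a string like this would typically be added to a header list with `curl_slist_append()` and attached to the request via `CURLOPT_HTTPHEADER`.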
CURL errors should be logged to stderr. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 9abaeef)
In production, we want to optionally support peer verification. In testing, we generally do not. This can now be specified by setting the environment variable CURLOPT_SSL_VERIFYPEER to 0 (do not verify) or 1 (verify). The default is 0. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit ef30f59)
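A minimal sketch of how such an environment switch could be read is shown below. The function names are hypothetical, and treating every value other than "1" as "do not verify" is an assumption of this sketch; the commit message only specifies 0, 1, and the default of 0.

```c
#include <stdlib.h>
#include <string.h>

/* Map the raw variable value to the long that libcurl expects.
 * Assumption: anything other than exactly "1" (including unset)
 * falls back to the default of 0, i.e. no peer verification. */
static long sketch_parse_verifypeer(const char *val)
{
	if (!val)
		return 0L;
	return strcmp(val, "1") == 0 ? 1L : 0L;
}

/* Read the setting from the environment variable named in the commit. */
static long sketch_verifypeer_from_env(void)
{
	return sketch_parse_verifypeer(getenv("CURLOPT_SSL_VERIFYPEER"));
}
```

The result would then be handed to libcurl with `curl_easy_setopt(handle, CURLOPT_SSL_VERIFYPEER, value)`, which takes a `long` for this option.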
Evaluate the simulation mode once, and set mc_obj->is_multicast appropriately. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 1131419)
Allow retries to be disabled for test cases. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 9e05ee0)
Add COMM_KEY_NONE to _gen_tx_dfa() function. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 5c0bd49)
Allow CURL operations to be traced independently of JOIN. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 9457dd0)
Cleanup. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit d383b51)
FM now generates a full 6-octet NIC address. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 75d0879)
Add flag to suppress repeated logging during CURL polling. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit c2ea999)
Remove unused pid_idx value in cxip_join_state structure. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 0ab258f)
Change minimum test size to 2 (endpoints), from 4. Add "/op" to performance output to clearly indicate that the performance value is per-operation, not a total runtime. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit f215798)
Added SLURM and FI_CXI environment variable capture. Changed error output to stderr (not stdout). Removed placeholder defaults for environment variables. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 80e756f)
Change cxip_trace_filename to cxip_trace_pathname and allow tracing to occur in alternate directories, which is useful when the current path is not writable by the user. Initialization now fails early if no masks are selected, preventing creation of empty files. The early model of initializing only once at test login was flawed; tracing can now be initialized, disabled, and re-initialized. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 7a75110)
Checkpoint commit. This code is in development. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit e4354f2)
Comment is incorrect and misleading for PID_IDX value for mcast address. Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 5c533ce)
Signed-off-by: Kalyan Kodamagula <[email protected]> (cherry picked from commit 9d1afe8)
Note: all CXIP_TRACE* references changed to CXIP_COLL_TRACE*
Note: all cxip_trace* references changed to cxip_coll_trace*

The TRACE() macros produce debugging traces to files that can be on a shared file system, or local to a physical node (and could be memory storage), for debugging collectives, which perform coordinated actions across multiple nodes. This not only prevents implicit synchronization of operations through shared file system waits, but also prevents mangling of the output when using normal character buffering from multiple sources, which is usually faster than line buffering.

This was originally put together for use with bench tests that are part of the libfabric suite, and required initialization through function calls within the bench tests, which made this feature unavailable to external applications. This commit refactors the TRACE() system so that it can be configured entirely through environment variables and used with production applications.

If the ENABLE_DEBUG flag is zero, all of the TRACE features are removed entirely: embedded TRACE() calls are a syntactically robust no-op that does not emit code during compilation. Otherwise, individual trace features must be activated through environment variables, allowing different areas of code to be traced selectively. If no trace features are selected, the trace files are not created.

The original design also used function pointer indirection to allow all of the trace functions to be entirely replaced. This was confusing to maintain and offered no real benefit.

The former cxip_coll_trace_enable() function was overloaded with multiple purposes. This has been simplified into cxip_coll_trace_init() and cxip_coll_trace_close(), which are automatically called during coll module initialization, and a global cxip_coll_trace_muted flag that can be used to temporarily mute tracing. This allows repeated reductions (for instance) to be traced during setup, but then muted during a fast loop.
Signed-off-by: Joe Nemeth <[email protected]> (cherry picked from commit 3cd63c5)
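The compile-time removal and the mute flag described in the commit message above can be sketched along these lines. This is an illustration under assumptions, not the provider's macro: `SKETCH_TRACE`, `sketch_trace_muted`, and `sketch_trace_count` are hypothetical names, output goes to stderr rather than to per-node trace files, and the environment-variable feature masks are omitted.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

#ifndef ENABLE_DEBUG
#define ENABLE_DEBUG 1	/* assumed default for this sketch only */
#endif

/* Stand-in for the global mute flag mentioned in the commit message. */
static bool sketch_trace_muted;
/* Counts emitted traces so the behavior is easy to observe. */
static int sketch_trace_count;

#if ENABLE_DEBUG
static void sketch_trace_emit(const char *fmt, ...)
{
	va_list ap;

	if (sketch_trace_muted)
		return;		/* temporarily silenced, e.g. in a fast loop */

	va_start(ap, fmt);
	vfprintf(stderr, fmt, ap);
	va_end(ap);
	sketch_trace_count++;
}
#define SKETCH_TRACE(...) sketch_trace_emit(__VA_ARGS__)
#else
/* Expands to an empty statement that still requires a trailing
 * semicolon, so it is safe inside unbraced if/else bodies. The
 * arguments are discarded and never evaluated. */
#define SKETCH_TRACE(...) do { } while (0)
#endif
```

In use, a caller could trace the setup phase, set `sketch_trace_muted = true` before a tight reduction loop, and clear it again afterwards.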
Use ofi_hmem_* instead of ze_* specific calls NETCASSINI-4994 Signed-off-by: Chuck Fossen <[email protected]> (cherry picked from commit 782cf95)
NETCASSINI-4994 Signed-off-by: Chuck Fossen <[email protected]> (cherry picked from commit 25f5ddc)
Libfabric semantics indicate that fi_cntr_wait() returns if an error count increment occurs before the threshold is reached. NETCASSINI-5909 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit 9180ae2)
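The wait semantic in question can be modeled in a few lines. The sketch below is a single-threaded toy model, not libfabric code: it does not call the real `fi_cntr_wait()`, all names are hypothetical, and the return codes are placeholders, since the commit message does not state which error code the provider returns.

```c
#include <stddef.h>
#include <stdint.h>

/* Placeholder return codes for the model only. */
#define MODEL_OK	0
#define MODEL_ERR	(-1)	/* error count changed while waiting */
#define MODEL_TIMEOUT	(-2)	/* ran out of events before the threshold */

enum model_event { MODEL_EV_SUCCESS, MODEL_EV_ERROR };

struct model_cntr {
	uint64_t success;
	uint64_t error;
};

/* Consume events one at a time, as a waiter would observe completions.
 * The wait ends as soon as the error count differs from its value at
 * entry, even if the success threshold has not been reached. */
static int model_cntr_wait(struct model_cntr *c, uint64_t threshold,
			   const enum model_event *ev, size_t nev)
{
	uint64_t err_at_entry = c->error;
	size_t i = 0;

	for (;;) {
		if (c->error != err_at_entry)
			return MODEL_ERR;
		if (c->success >= threshold)
			return MODEL_OK;
		if (i == nev)
			return MODEL_TIMEOUT;

		if (ev[i++] == MODEL_EV_SUCCESS)
			c->success++;
		else
			c->error++;
	}
}
```

For example, waiting for a threshold of 3 over the sequence success, error, success, success stops after the second event with one success counted, rather than blocking until the third success.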
Adds unit tests for verification of fi_cntr_wait() semantic operation with error count increment. NETCASSINI-5909 Signed-off-by: Steve Welch <[email protected]> (cherry picked from commit fa5d0db)
Signed-off-by: James Swaro <[email protected]>
Signed-off-by: James Swaro <[email protected]>
Update of the CXI provider to match current internal state.
Tested using netsim and the libfabric tests. All tests pass.