[v1.21.x] Cherry-picked commits for 1.21.0rc2 #9926

Merged
merged 88 commits into from
Mar 21, 2024

Conversation

j-xiong
Contributor

@j-xiong j-xiong commented Mar 21, 2024

No description provided.

darrylabbate and others added 9 commits March 21, 2024 10:13
Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit 96172b7)
fi_cq_readerr is no longer called on uninitialized
err_data and err_data_size in fi_setup.7.md.

Signed-off-by: Rémi Dehenne <[email protected]>
(cherry picked from commit 358422a)
Signed-off-by: Shi Jin <[email protected]>
(cherry picked from commit e24f1c8)
Updates:
- Full support for Intel oneAPI DPC++/C++ compiler
- Improved default tuning for Intel GPUs

Signed-off-by: Scott Breyer <[email protected]>
(cherry picked from commit acde37d)
Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit 4e4deae)
Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit 6e8765f)
Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit bd891fc)
Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit 87a1006)
This is a best-effort attempt at propagating core Libfabric error codes
upwards wherever possible.

Signed-off-by: Darryl Abbate <[email protected]>
(cherry picked from commit b266f14)
jswaro and others added 20 commits March 21, 2024 12:07
Signed-off-by: James Swaro <[email protected]>
(cherry picked from commit c494d00)
EP objects will be able to support different EP protocols.
Currently only the existing Portals SAS implementation is
supported: FI_PROTO_CXI.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit b991fd4)
(cherry picked from commit 7d9a79f)
This refactors EP object ctrl elements related to side-band
messaging and MR into its own structure. While this information
is exclusively accessed for standard EP, it will be owned by
the SEP (where MR are bound to the SEP) and shared among TX/RX
contexts.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 28c0faa)
(cherry picked from commit 734741a)
No functional changes; refactors code to have ep_obj reference
the txc and rxc via a pointer. This will allow an ep_obj to
support multiple context specializations that implement different
endpoint protocols.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 87c50c8)
(cherry picked from commit cd7f818)
NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 80f01f7)
(cherry picked from commit 6cbc037)
NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 4f1dcf9)
(cherry picked from commit 17e03da)
This commit does not alter functionality; it refactors
the existing default RXC context into a common base and a
protocol-specific derived object. The default protocol is
FI_PROTO_CXI, which is implemented by the rxc_hpc derived
object. It implements an HPC-capable SAS protocol with
unexpected messages buffered at the target, and requires a
Portals flow control implementation.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 1552c80)
(cherry picked from commit a8cb4ac)
This commit does not alter functionality; it refactors
the existing default TXC context into a common base and a
protocol-specific derived object. The default protocol is
FI_PROTO_CXI, which is implemented by the txc_hpc derived
object. It implements an HPC-capable SAS protocol with
unexpected messages buffered at the target and includes
rendezvous messaging. It requires a Portals flow control
implementation.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit fb15795)
(cherry picked from commit 0ab28b2)
Refactor so that context allocation is not entangled with
EP object initialization. This will allow for contexts
to do specialized initialization of structure at calloc.

No functional difference.

NETCASSINI-5662

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit f368168)
(cherry picked from commit 6314a5b)
Allocation of a TXC/RXC will allocate and initialize the
appropriate derived context object. Context initialization
is no longer entangled with EP object initialization.
Introduces the concept of TXC/RXC ops functions that execute
derived-object-specific code.
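
The base/derived split with an ops table described in these commits can be sketched roughly as follows. This is only an illustration of the general C pattern: `ctx_base`, `ctx_ops`, `ctx_hpc`, and every function name here are hypothetical and are not the CXI provider's actual types or signatures.

```c
#include <stddef.h>
#include <stdlib.h>

/* All names below are hypothetical; they only illustrate the pattern. */
struct ctx_base;

/* Operations a derived context may implement; NULL means "not supported". */
struct ctx_ops {
    int (*progress)(struct ctx_base *base);
    void (*cleanup)(struct ctx_base *base);
};

/* Common state shared by every protocol-specific context. */
struct ctx_base {
    const struct ctx_ops *ops;
    int protocol;
};

/* Derived context for an HPC-style protocol; the base is the first member. */
struct ctx_hpc {
    struct ctx_base base;
    int overflow_bufs;   /* state only this protocol needs */
    int progress_calls;
};

static int hpc_progress(struct ctx_base *base)
{
    /* Valid cast because base is the first member of struct ctx_hpc. */
    struct ctx_hpc *hpc = (struct ctx_hpc *)base;

    return ++hpc->progress_calls;
}

static void hpc_cleanup(struct ctx_base *base)
{
    struct ctx_hpc *hpc = (struct ctx_hpc *)base;

    hpc->overflow_bufs = 0;
}

static const struct ctx_ops hpc_ops = {
    .progress = hpc_progress,
    .cleanup = hpc_cleanup,
};

/* Allocation creates and initializes the derived object in one step. */
struct ctx_base *ctx_alloc_hpc(void)
{
    struct ctx_hpc *hpc = calloc(1, sizeof(*hpc));

    if (!hpc)
        return NULL;
    hpc->base.ops = &hpc_ops;
    hpc->base.protocol = 1;      /* placeholder protocol id */
    hpc->overflow_bufs = 4;      /* placeholder value */
    return &hpc->base;
}

/* Common code dispatches through the ops table when the op is supported. */
int ctx_progress(struct ctx_base *base)
{
    return base->ops->progress ? base->ops->progress(base) : 0;
}

void ctx_free(struct ctx_base *base)
{
    if (!base)
        return;
    if (base->ops->cleanup)
        base->ops->cleanup(base);
    free(base);
}
```

Callers hold only a `struct ctx_base *`, so a second protocol could supply its own ops table without touching the common code paths.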

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 5a5661e)
(cherry picked from commit 76546e9)
Refactor context initialization so that each derived object
initializes only what it needs. For example, overflow
and request buffers are only required for the HPC derived
object.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit f52389e)
(cherry picked from commit ccee0ec)
Refactor context disable to call into derived object
for cleanup if operation is supported. No new
functionality is added; HPC messaging specific cleanup
is moved to helper operation.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 5482af1)
(cherry picked from commit 193d0f8)
Refactors code to allow a derived context to implement
protocol-specific progress. This will allow future protocols
with different progress demands to avoid impacting existing
protocols.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 24ea37c)
(cherry picked from commit 712fce3)
Allow RXC/TXC specific cancel functions. This will allow the
client/server object to support TX cancel when implemented.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 581a68f)
(cherry picked from commit e1061df)
Add RXC op to implement a control messaging callback which
can override processing of control messaging events. This
allows a context protocol to implement a specific side-band
messaging implementation.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 95aa467)
(cherry picked from commit 36054b1)
Refactor code to allow derived RXC/TXC to have unique
respective recv_common and send_common functionality.
Future protocols will integrate seamlessly into the API flow.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 7cd4f60)
(cherry picked from commit 396c49c)
Move HPC specific protocol code to new file cxip_msg_hpc.c
while leaving common protocol code in cxip_msg.c. This
only refactors the code.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit d292b5b)
(cherry picked from commit 232d280)
Adds the file cxi/src/cxip_msg_cs.c

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit fe0d962)
(cherry picked from commit 901e053)
Return fi_info for the new protocol; the protocol must be explicitly
requested if hints are passed. Note that if FI_CXI_COMPAT=2,
only old constants are used and the new protocol is not present.

Update/add unit tests to validate fi_info and selection.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 550516f)
(cherry picked from commit 1afb008)
Add initial FI_PROTO_CXI_CS derived rxc/txc structure
initialization and man page update.

NETCASSINI-5652

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 1d10c13)
(cherry picked from commit 1e1642c)
JosephNemeth and others added 26 commits March 21, 2024 12:07
The FM URL provided by the WLM is now the full path to the multicast
creation target endpoint, not just the base of the FM REST API.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit d20e3ae)
(cherry picked from commit 447f741)
VNI is provided by the WLM, and must be provided in the multicast
creation command.

Replace json_fmt static const string with inline string.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit db3e832)
(cherry picked from commit dfa4722)
FM REST API now uses a Bearer token, not x-xenon-auth-token.
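
As a rough illustration of the header change, a Bearer token is conventionally sent in an `Authorization: Bearer <token>` request header. The helper below is a hypothetical sketch (`build_bearer_header` is not a function from the provider) that only formats that header line.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical helper: format "Authorization: Bearer <token>" into buf.
 * Returns 0 on success, -1 if an argument is invalid or buf is too small.
 */
int build_bearer_header(char *buf, size_t len, const char *token)
{
    int n;

    if (!buf || !token || !*token)
        return -1;

    n = snprintf(buf, len, "Authorization: Bearer %s", token);

    /* snprintf reports the length it wanted; n >= len means truncation. */
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

With libcurl, a string like this would typically be appended to a header list with `curl_slist_append()` and attached to the request through `CURLOPT_HTTPHEADER`.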

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit d09cfb6)
(cherry picked from commit 451ee08)
CURL errors should be logged to stderr.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 9abaeef)
(cherry picked from commit ac3d632)
In production, we want to optionally support peer verification. In
testing, we generally do not.

This can now be specified by setting the environment variable
CURLOPT_SSL_VERIFYPEER to 0 (do not verify) or 1 (verify). The
default is 0.
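
A minimal sketch of how such an environment variable might be read, assuming that only the exact string "1" enables verification and that anything else (including an unset variable) falls back to the default of 0. The function names are hypothetical, and the handling of unrecognized values is an assumption rather than something the commit message specifies.

```c
#include <stdlib.h>
#include <string.h>

/*
 * Hypothetical parser: map the variable's text to a verify-peer setting.
 * Assumption: NULL (unset) and any value other than "1" mean "do not verify".
 */
long parse_verifypeer(const char *val)
{
    if (val && strcmp(val, "1") == 0)
        return 1L;
    return 0L;
}

/* Read the setting from the environment variable named in the commit. */
long verifypeer_from_env(void)
{
    return parse_verifypeer(getenv("CURLOPT_SSL_VERIFYPEER"));
}
```

The resulting value has the `long` type that libcurl's `curl_easy_setopt()` expects for its `CURLOPT_SSL_VERIFYPEER` option.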

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit ef30f59)
(cherry picked from commit c774a81)
Evaluate the simulation mode once, and set mc_obj->is_multicast
appropriately.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 1131419)
(cherry picked from commit 336d178)
Allow retries to be disabled for test cases.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 9e05ee0)
(cherry picked from commit bae1e73)
Add COMM_KEY_NONE to _gen_tx_dfa() function.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 5c0bd49)
(cherry picked from commit 7a76318)
Allow CURL operations to be traced independently of JOIN.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 9457dd0)
(cherry picked from commit a48975b)
Cleanup.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit d383b51)
(cherry picked from commit 2727ece)
FM now generates a full 6-octet NIC address.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 75d0879)
(cherry picked from commit c3cf64f)
Add flag to suppress repeated logging during CURL polling.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit c2ea999)
(cherry picked from commit c1e5b1a)
Remove unused pid_idx value in cxip_join_state structure.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 0ab258f)
(cherry picked from commit 7046f4e)
Change minimum test size to 2 (endpoints), from 4.

Add "/op" to performance output to clearly indicate that the performance
value is per-operation, not a total runtime.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit f215798)
(cherry picked from commit e9bbeec)
Added SLURM and FI_CXI environment variable capture.

Changed error output to stderr (not stdout).

Removed placeholder defaults for environment variables.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 80e756f)
(cherry picked from commit 31d9579)
Change cxip_trace_filename to cxip_trace_pathname and allow tracing to
occur in alternate directories, which is useful when the current path is
not writable by the user.

Initialization now returns early, without setting anything up, if no
masks are selected, which prevents creation of empty files.

The earlier model of initializing only once at test login was flawed.
Tracing can now be initialized, disabled, and re-initialized.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 7a75110)
(cherry picked from commit 427ff4b)
Checkpoint commit. This code is in development.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit e4354f2)
(cherry picked from commit e455af9)
Comment is incorrect and misleading for PID_IDX value for mcast address.

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 5c533ce)
(cherry picked from commit 136bafc)
Signed-off-by: Kalyan Kodamagula <[email protected]>
(cherry picked from commit 9d1afe8)
(cherry picked from commit 47fcdf8)
Note: all CXIP_TRACE* references changed to CXIP_COLL_TRACE*
Note: all cxip_trace* references changed to cxip_coll_trace*

The TRACE() macros produce debugging traces to files that can be on a
shared file system, or local to a physical node (and could be memory
storage) for debugging collectives, which perform coordinated actions
across multiple nodes. This not only prevents implicit synchronization
of operations through shared file system waits, but also prevents
mangling of the output when using normal character buffering from
multiple sources, which is usually faster than line buffering.

This was originally put together for use with bench tests that are part
of the libfabric suite, and required initialization through function
calls within the bench tests, which makes this feature unavailable
to external applications.

This commit refactors the TRACE() system to allow it to be entirely
configured through environment variables, and can be used with
production applications.

If the ENABLE_DEBUG flag is zero, all of the TRACE features are removed
entirely: embedded TRACE() calls are a syntactically robust NOOP that
does not emit code during compilation.
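
One common way to get a "syntactically robust NOOP" in C is the `do { } while (0)` idiom, sketched below. The macro body here is an assumption for illustration and is not the definition used in the provider; `clamp_nonneg` is a hypothetical caller.

```c
#include <stdio.h>

#ifndef ENABLE_DEBUG
#define ENABLE_DEBUG 0
#endif

#if ENABLE_DEBUG
/* Debug build: forward the format string and arguments to stderr. */
#define TRACE(...) do { fprintf(stderr, __VA_ARGS__); } while (0)
#else
/*
 * Non-debug build: an empty statement that still needs its trailing
 * semicolon, so it is safe in an un-braced if/else. The arguments are
 * discarded by the preprocessor and never evaluated.
 */
#define TRACE(...) do { } while (0)
#endif

/* Hypothetical caller showing TRACE() used in un-braced branches. */
int clamp_nonneg(int v)
{
    if (v < 0)
        TRACE("clamping %d\n", v);
    else
        TRACE("keeping %d\n", v);

    return v < 0 ? 0 : v;
}
```

Because the arguments vanish at preprocessing time when the flag is zero, trace calls should not carry side effects that the program depends on.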

Otherwise, individual trace features must be activated through
environment variables, allowing different areas of code to be traced
selectively. If no trace features are selected, the trace files are not
created.

The original design also used function pointer indirection to allow all
of the trace functions to be entirely replaced. This was confusing to
maintain, and offers no real benefit.

The former cxip_coll_trace_enable() function was overloaded with multiple
purposes. This has been simplified into cxip_coll_trace_init() and
cxip_coll_trace_close(), which are automatically called during coll module
initialization, and a global cxip_coll_trace_muted flag that can be used to
temporarily mute tracing. This allows repeated reductions (for instance)
to be traced during set up, but then muted during a fast loop.
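
The init/close/mute lifecycle described above might look roughly like the sketch below. The names (`trace_init`, `trace_close`, `trace_emit`, `trace_muted`) and signatures are hypothetical stand-ins, not the provider's functions. As a simplification, the sketch takes an already-open `FILE *` instead of opening trace files from environment variables.

```c
#include <stdarg.h>
#include <stdbool.h>
#include <stdio.h>

static FILE *trace_fp;            /* NULL while tracing is not initialized */
static unsigned trace_mask;       /* bit set of selected trace features */
static unsigned long trace_lines; /* lines written since the last init */
bool trace_muted;                 /* global flag: temporarily silence tracing */

/*
 * Start tracing to sink for the features in mask.
 * Returns -1 without setting anything up if no feature is selected.
 */
int trace_init(unsigned mask, FILE *sink)
{
    if (mask == 0 || !sink)
        return -1;
    trace_fp = sink;
    trace_mask = mask;
    trace_muted = false;
    trace_lines = 0;
    return 0;
}

/* Stop tracing; trace_init() may be called again afterwards. */
void trace_close(void)
{
    trace_fp = NULL;
    trace_mask = 0;
}

/* Emit one trace line if initialized, not muted, and the feature is selected. */
void trace_emit(unsigned feature, const char *fmt, ...)
{
    va_list ap;

    if (!trace_fp || trace_muted || !(trace_mask & feature))
        return;
    va_start(ap, fmt);
    vfprintf(trace_fp, fmt, ap);
    va_end(ap);
    trace_lines++;
}

unsigned long trace_line_count(void)
{
    return trace_lines;
}
```

Setting `trace_muted = true` before a fast loop and clearing it afterwards keeps the setup traces while skipping the per-iteration output.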

Signed-off-by: Joe Nemeth <[email protected]>
(cherry picked from commit 3cd63c5)
(cherry picked from commit c983a78)
Use ofi_hmem_* instead of ze_* specific calls

NETCASSINI-4994

Signed-off-by: Chuck Fossen <[email protected]>
(cherry picked from commit 782cf95)
(cherry picked from commit 6da6530)
NETCASSINI-4994

Signed-off-by: Chuck Fossen <[email protected]>
(cherry picked from commit 25f5ddc)
(cherry picked from commit e543faf)
Libfabric semantics indicate that fi_cntr_wait() returns if an error
count increment occurs before the threshold is reached.
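
The wake-up condition can be modeled with a small predicate, sketched here under stated assumptions: the struct, enum, and function are hypothetical and do not reproduce the provider's code, and checking the error count before the threshold is a choice made for this sketch.

```c
#include <stdint.h>

/* Hypothetical model of a completion counter. */
struct counter {
    uint64_t success;  /* successful completions */
    uint64_t errors;   /* error completions */
};

enum wait_result {
    WAIT_PENDING = 0,    /* keep waiting */
    WAIT_THRESHOLD = 1,  /* success count reached the threshold */
    WAIT_ERROR = 2       /* error count changed since the wait began */
};

/*
 * Decide whether a waiter should return.
 * errors_at_entry is the error count sampled when the wait started.
 * Assumption: an error increment takes priority over the threshold check.
 */
enum wait_result wait_check(const struct counter *c, uint64_t threshold,
                            uint64_t errors_at_entry)
{
    if (c->errors != errors_at_entry)
        return WAIT_ERROR;
    if (c->success >= threshold)
        return WAIT_THRESHOLD;
    return WAIT_PENDING;
}
```

A blocking wait would re-evaluate a predicate like this each time the counter changes, returning as soon as the result is no longer `WAIT_PENDING`.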

NETCASSINI-5909

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit 9180ae2)
(cherry picked from commit 2a0a1a6)
Adds unit tests for verification of fi_cntr_wait() semantic
operation with error count increment.

NETCASSINI-5909

Signed-off-by: Steve Welch <[email protected]>
(cherry picked from commit fa5d0db)
(cherry picked from commit 87fd2f6)
Signed-off-by: James Swaro <[email protected]>
(cherry picked from commit 135e31a)
Signed-off-by: James Swaro <[email protected]>
(cherry picked from commit 459edef)
@j-xiong
Contributor Author

j-xiong commented Mar 21, 2024

@jswaro I cherry-picked the cxi changes here so you don't need to do that.

@j-xiong j-xiong merged commit 1cde918 into ofiwg:v1.21.x Mar 21, 2024
13 checks passed
9 participants