-
Notifications
You must be signed in to change notification settings - Fork 59
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat(tracing): add nvtx provider #363
Closed
Closed
Conversation
This file contains bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
This doesn't belong here and is likely a copy/paste mistake. It's unclear why this didn't break anything: this could potentially cause very bizarre header inclusions under neuron when tracing is enabled.
Resolve nvToolsExt and hook into configuration. Implementation follows in future commits.
Separate each tracepoint/profiling provider (currently, just the one) into its own header.
rauteric
reviewed
Mar 19, 2024
Add an api `NCCL_OFI_TRACE_POP()' for closing a tracing scope. WIP: needs to accept a scope name to close.
Hook nvtx_push()/nvtx_pop() on existing lttng macros. WIP: Currently, this builds correctly but is untested. Will remain WIP until we figure out how to structure this in a way that aligns the required usages of NCCL_OFI_TRACE_POP() for nvtx with cases like NCCL_OFI_TRACE_SEND_WRITE_SEG COMPLETE/START. Probably, we should just redo all the lttng macros so that they all wrap a workload, rather than today where the majority just signal that an event took place. I would also like to support a separate course-grained type of probe definition within this module. lttng and nvtx are best-suited for fine-grained/range-based eventing around program behavior (not quite what we have today, but where we want to get: things like wrapping an entire event and/or supplying rich metadata around that event.) For this, we need to support: 1. NVTX because of the ecosystem this plugin exists in. 2. Something that's cheaper than userspace uprobe (see bpftime below) and in-process or nearly so. Some candidates: perfetto, redoing the existing lttng support, etc. Separate from this, we should also support builds with course entry/exit USDT probes for basically all nontrivial functions. This can be a lot more useful for development and for building debug tools. Some tooling that this would enable: + very generic and allows for cross-dependency analysis + https://github.com/eunomia-bpf/bpftime + bpftrace or bcc makes this cheap + certain `linux perf` calls can benefit from this. + potential to profile kernel via kprobes in the same script. + offcpu analysis These are just nop sleds and have zero runtime overhead; so they can be enabled on default/release builds. (See: [1] for how others use this) It's surprisingly difficult to do this in a way that does not require code changes. Can potentially do this with a small out-of-tree llvm pass (and/or a gcc equivilent, see "gcc python plugin" on github) that piggy-backs on -finstrument-functions's __cyg_profile_func_exit and __cyg_profile_func_entry calls. Putting the USDT probe in the __cyg_profile_func_exit impl itself is not viable. Need to dig more. [1]: https://www.brendangregg.com/Slides/reInvent2019_BPF_Performance_Analysis/
AmedeoSapio
reviewed
Mar 19, 2024
AmedeoSapio
reviewed
Mar 19, 2024
Closing this one in favor of #377. |
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description of changes:
Please see notes on
!wip
commits.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.