-
Notifications
You must be signed in to change notification settings - Fork 50
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EDGEML-8777: Support firmware log buffer #328
base: main
Are you sure you want to change the base?
Conversation
Can one of the admins verify this patch? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"__packed" is missing for the request and response structs.
8e8ff65
to
2b8b148
Compare
5ac87bf
to
add9c42
Compare
a64f459
to
bf466db
Compare
bf466db
to
b1e6bfd
Compare
ok to test |
=== 3 tests failed on Linux_npu_tests run. |
@VinitAmd , please fix the pipeline failure. The driver is not able to load. Please try your driver on a NPU1 machine. Please contact @vengutta18 for the machine if you don't have one. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not expect any changes for amdxdna_mailbox* files for this feature.
Please cleanup the amdxdna_mailbox* files.
After this is fix, I will review other code.
Please follow https://github.com/amd/xdna-driver?tab=readme-ov-file#checkpatch to setup your workspace.
The coding style needs to pass checkpatch.pl test.
Thanks, I was looking for this info |
7d9850d
to
2d0d0b4
Compare
Code formatting done with checkpatch.pl |
e77d462
to
8d6b7a1
Compare
retest this please |
8d6b7a1
to
8b6da01
Compare
8b6da01
to
6be54a1
Compare
6be54a1
to
2234571
Compare
trigger jenkins |
2234571
to
615feef
Compare
ok to test |
retest this please |
ok to test |
retest this please |
ok to test |
615feef
to
7efb043
Compare
retest this please |
src/driver/amdxdna/aie2_debugfs.c
Outdated
@@ -263,6 +263,45 @@ static int aie2_dpm_level_get(struct seq_file *m, void *unused) | |||
|
|||
AIE2_DBGFS_FOPS(dpm_level, aie2_dpm_level_get, aie2_dpm_level_set); | |||
|
|||
//write debufs for event trace |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Kernel style comments please.
src/driver/amdxdna/aie2_pci.h
Outdated
@@ -250,11 +250,13 @@ struct amdxdna_dev_hdl { | |||
u32 npuclk_freq; | |||
u32 hclk_freq; | |||
bool force_preempt_enabled; | |||
uint32_t event_trace_enabled; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can it be bool?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There may be a option to enable event trace but disable log printing for debug purpose , so kept it uint32.
if we want all the time encode log too added to dmesg, it can be bool
#include "amdxdna_trace.h" | ||
#include "amdxdna_mailbox.h" | ||
|
||
uint8_t g_fwLogBuf[TRACE_EVENT_BUFFER_SIZE]; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please allocate this using kzalloc and clean up when it's no longer needed. Also variable name to fw_log_buf
|
||
static void clear_event_trace_msix(struct amdxdna_dev_hdl *ndev) | ||
{ | ||
// Clear the log buffer interrupt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
comments ditto, kernel style.
src/driver/amdxdna/aie2_pci.h
Outdated
int aie2_start_event_trace(struct amdxdna_dev_hdl *ndev, dma_addr_t addr, u32 size); | ||
void aie2_set_trace_timestamp(struct amdxdna_dev_hdl *ndev, uint32_t *resp); | ||
void aie2_assign_event_trace_state(struct amdxdna_dev_hdl *ndev, uint32_t state); | ||
int aie2_is_event_trace_supported_on_dev(struct amdxdna_dev_hdl *ndev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need all the functions here, only keep the functions that are used by other files,rest of them make it static inside the aie2_event_trace.c
|
||
if (ret) { | ||
XDNA_ERR(ndev->xdna, "Send start event trace failed, ret %d", ret); | ||
// Currently this feature is supported on limited HW's, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-line comment style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is there any guide available for kernel comment style ? checkPatch.pl doesn't fix the problem , it only shows the problem.
if no warning/error while checkPatch.pl run hard to fix problem without guide.
return; | ||
} | ||
|
||
ndev->event_trace_enabled = state; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could you assign the state once the trace is enabled successfully, It could fail at line no. 55!
src/driver/amdxdna/aie2_message.c
Outdated
req.event_trace_categories = 0xFFFFFFFF; | ||
req.event_trace_timestamp = EVENT_TRACE_TIMESTAMP_FW_CHRONO; | ||
|
||
XDNA_INFO(ndev->xdna, "send start event trace msg"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
make it XDNA_DBG
src/driver/amdxdna/aie2_message.c
Outdated
{ | ||
DECLARE_AIE2_MSG(stop_event_trace, MSG_OP_STOP_EVENT_TRACE); | ||
|
||
XDNA_INFO(ndev->xdna, "send stop event trace msg"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
|
||
aie2_hw_stop(xdna); | ||
aie2_error_async_events_free(ndev); | ||
if (is_event_trace_supported) | ||
aie2_event_trace_free(ndev); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Couple of things here, are trace_alloc and trace_free are in sync?
what happens during the driver unload or suspend/resume?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- are trace_alloc and trace_free are in sync?---> yes both are in sync
- what happens during the driver unload ---> when driver unload , aie2_fini is called and which triggers stop event trace and free even trace resource.
- suspend/resume -> Currently there is FW issue when PG applied FW loosing event trace buffer information, causing crash in resume (FWDEV-103771
[Why] Log buffer support is required to enhance debugging support on NPU stack [How] 1. Allocate log buffer and share with firmware via 'start_event_trace' message Config start_event_trace request param, send via mgmt channel. Handle the send buffer resp. 2. Receive interrupt about log buffer half fullness from firmware. Handle interrupt, process further buffer data and update buffer head_offset for FW. 3. Add stop_event_trace_send api and it's handle to stop logging when aie2 shutdown. 4. Add event_trace debugfs for dynamic control of logging. Signed-off-by: vinit shukla <[email protected]>
7efb043
to
be8f46f
Compare
[Why]
Log buffer support is required to enhance debugging support on NPU stack
[How]
Signed-off-by: Vinit [email protected]