[AMDGPU] Use local memory in multi_mma ukernel #16929
| Job | Run time |
|---|---|
| | 11s |
| | 7m 54s |
| | 4m 29s |
| | 10m 25s |
| | 2m 41s |
| | 24m 36s |
| | 9m 18s |
| | 9m 3s |
| | 1m 57s |
| | 6m 49s |
| | 4m 23s |
| | 3m 11s |
| | 9m 29s |
| | 9m 5s |
| | 12m 17s |
| | 1m 49s |
| | 31s |
| | 37s |
| | 1m 0s |
| | 4s |
| Total | 1h 59m 49s |