-
Notifications
You must be signed in to change notification settings - Fork 5
/
Copy pathNEWS
68 lines (51 loc) · 2.25 KB
/
NEWS
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
==============================================================
HPL-AI Mixed-Precision Benchmark v2.3d -- April 23, 2021
==============================================================
History
- 03/14/2021 HPL-AI Public release of Version 2.3a
(deperated when n > 65536)
Junkang Huang added the parallel GMRES and matrix construction
part (<https://github.com/schuangs/hpl-ai-with-IR>) for HPL-AI.
Note that the GMRES did not check the backward-error, which may
cause the result to be invalid.
Kan Wu reimplemented HPL-AI with some bug fixes and some minor
optimizations. The C souce code of hpl-2.3 remained the same,
while took advantages of modern C++ features such as templates,
overloading, and GPU backends via blaspp:
-- [A] include/hplai*.hh
-- [A] src/*/HPLAI_*.cc
-- [A] testing/*/HPLAI_*.cc
-- [M] configure.ac
-- [M] src/Makefile.am
-- [M] testing/Makefile.am
- 03/24/2021 HPL-AI Public release of Version 2.3b
(deperated when n > 65536)
Kan Wu optimized the device blaspp routines through the
strategy of buffering memory: the program will check whether
the buffer size is sufficient at runtime, free and reallocate
the buffer when necessary. Therefore, it is recommended to
run the benchmark twice with the same parameters to obtain
higher computation rate:
-- [M] src/blas/HPLAI_blas.cc
-- [M] testing/ptest/HPLAI_pdinfo.cc
- 04/18/2021 HPL-AI Public release of Version 2.3c
Kan Wu fixed the bug when n > 65536.
-- [M] src/pgesv/HPLAI_pdgesv.cc
Kan Wu added an option to use the half-precision computeType
in cublasGemmEx (cuda@11: is required):
-- [M] src/blas/HPLAI_blas.cc
Kan Wu renamed the executable "xhplai" to "xhpl_ai", to be
synced to HPL-AI-NVIDIA v1.0.0 in nvidia:hpc-benchmarks:
-- [M] src/Makefile.am
-- [M] testing/Makefile.am
Kan Wu released the software at
<https://github.com/SYSU-SCC/sysu-scc-spack-repo>:
-- [M] testing/ptest/HPLAI_pdinfo.cc
- 04/23/2021 HPL-AI Public release of Version 2.3d:
-- [M] testing/ptest/HPLAI_pdinfo.cc
Kan Wu added support for AscendCL, and added a new strategy
to enable GEMM to use CPU and GPU/NPU at the same time:
-- [M] src/pgesv/HPLAI_blas.cc
Kan Wu added added a compilation option BLAS_ILP64 to better
support ILP64/32 BLAS:
-- [M] src/pgesv/HPLAI_pdgesv.cc