From 4d5212fe7a0602ee2a69a05325a4260e687c918a Mon Sep 17 00:00:00 2001
From: Yanfei Guo
Date: Tue, 24 Sep 2024 15:51:22 -0500
Subject: [PATCH] ch4/posix: making topology aware SHM default to enabled

Fix the performance degradation on Intel Sapphire Rapids that appeared
after topo-aware SHM was introduced. The problem only occurs when
building with the Intel compiler.

The cause was that topo-aware SHM defaulted to disabled. With it
disabled, regular memcpy is used for inter-NUMA messages, which differs
from v4.2.2 (which uses non-temporal copy).

It was disabled by default because non-temporal copy results in higher
latency for small messages. After more testing on different CPUs
(Broadwell, Skylake, Cascade Lake, Ice Lake, Milan), it seems that only
Skylake, Cascade Lake, and Ice Lake have this issue with small messages.
It is probably OK to make topo-aware SHM enabled by default.
---
 src/mpid/ch4/shm/posix/posix_init.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/src/mpid/ch4/shm/posix/posix_init.c b/src/mpid/ch4/shm/posix/posix_init.c
index ecb64c7d2e6..a6b93984933 100644
--- a/src/mpid/ch4/shm/posix/posix_init.c
+++ b/src/mpid/ch4/shm/posix/posix_init.c
@@ -40,7 +40,7 @@
     - name : MPIR_CVAR_CH4_SHM_POSIX_TOPO_ENABLE
       category : CH4
       type : boolean
-      default : false
+      default : true
       class : none
       verbosity : MPI_T_VERBOSITY_USER_BASIC
       scope : MPI_T_SCOPE_ALL_EQ