BENCH: profile memleaks asv #821

Open · tylerjereddy wants to merge 2 commits into main from treddy_peakmem_bench_leaks
Conversation

tylerjereddy (Collaborator) commented:

  • add some `asv` peak memory benchmarks as regression guards against the memory leak in PyDarshan (#811); the only thing missing here (IMO) to completely close #817 (TST, BENCH: memory profiling/testing) is a version of the log file with `HEATMAP` in it (see the diff comments for why I don't have this for the `asv` suite at the moment)

  • although this won't automatically run in CI because it takes too long and too much memory, it still seems to do an excellent job of being sensitive to the patched leaks over the recent history of our project:

  • `asv continuous -e -b "peakmem.*" b6c8599ee8 main`
```
       before           after         ratio
     [b6c8599e]       [3fb2c368]
     <treddy_memleak_cffi_only_pt2~1>       <main>
-            239M              76M     0.32  cffi_backend.RetrieveLogData.peakmem_log_get_record('LUSTRE')
-            241M              76M     0.32  cffi_backend.RetrieveLogData.peakmem_log_get_record('H5F')
-            257M            75.8M     0.30  cffi_backend.RetrieveLogData.peakmem_log_get_record('STDIO')
-            289M              76M     0.26  cffi_backend.RetrieveLogData.peakmem_log_get_record('MPI-IO')
-            303M              76M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('POSIX')
-            304M            75.4M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('DXT_POSIX')
-            304M            75.3M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('DXT_MPIIO')
-            324M            75.8M     0.23  cffi_backend.RetrieveLogData.peakmem_log_get_record('H5D')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('DXT_POSIX')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('H5F')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('H5D')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('STDIO')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('MPI-IO')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('DXT_MPIIO')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('LUSTRE')
-           23.7G            75.4M     0.00  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('POSIX')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
```

```python
def peakmem_log_get_record(self, module):
    for i in range(100000):
        log = cffi_backend.log_open(self.ior_log_path)
        rec = cffi_backend.log_get_record(log, module)
```
tylerjereddy (Collaborator, Author) commented on the snippet above:

Since log_get_record() automatically dispatches to _log_get_heatmap_record() when the HEATMAP module is requested, I think we'll automatically cover our bases for those memory leaks once a log file containing HEATMAP records is used here.
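For reference, once a HEATMAP-bearing log is added to the benchmark data, a guard could look roughly like the sketch below; the `heatmap_log_path` attribute and the benchmark name are assumptions for illustration and are not part of this PR.

```python
# Hypothetical sketch only: intended as a method on RetrieveLogData, using
# cffi_backend as imported in the existing benchmark module. It relies on a
# HEATMAP-bearing log being added in the future; heatmap_log_path is an
# assumed attribute that setup() would populate.
def peakmem_log_get_record_heatmap(self):
    for i in range(100000):
        log = cffi_backend.log_open(self.heatmap_log_path)
        # requesting the HEATMAP module dispatches internally to
        # _log_get_heatmap_record()
        rec = cffi_backend.log_get_record(log, "HEATMAP")
        cffi_backend.log_close(log)
```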



```python
def peakmem_log_get_mounts(self, module):
    for i in range(100000):
```
tylerjereddy (Collaborator, Author) commented on the snippet above:

The magnitude of the mount leak is so massive that we could probably cut the iteration count by 10x so we don't hit peak memory values above the capacity of an average laptop. The current ratios display as 0.00 because the improvement spans several orders of magnitude.

@tylerjereddy (Collaborator, Author):

I suppose we might as well augment this PR with the reproducer from gh-824 to enforce the additional leak fix in gh-825.

@tylerjereddy (Collaborator, Author):

I was working on the diff below the fold when I ran into gh-829. One other thing worth noting is that, because the Python build system doesn't drive the building of the C code, the benchmarks are stuck running on a single version of the compiled C code. I think I've brought this up before, but that's a bigger problem for another discussion.

```diff
diff --git a/darshan-util/pydarshan/benchmarks/benchmarks/cffi_backend.py b/darshan-util/pydarshan/benchmarks/benchmarks/cffi_backend.py
index e4c074b2..539e87b5 100644
--- a/darshan-util/pydarshan/benchmarks/benchmarks/cffi_backend.py
+++ b/darshan-util/pydarshan/benchmarks/benchmarks/cffi_backend.py
@@ -1,3 +1,4 @@
+import darshan
 from darshan.backend import cffi_backend
 from darshan.log_utils import get_log_path
 
@@ -27,7 +28,15 @@ class RetrieveLogData:
 
 
     def peakmem_log_get_mounts(self, module):
-        for i in range(100000):
+        for i in range(10000):
             log = cffi_backend.log_open(self.ior_log_path)
             rec = cffi_backend.log_get_mounts(log)
             cffi_backend.log_close(log)
+
+
+    def peakmem_log_get_name_recs(self, module):
+        for i in range(10000):
+            with darshan.DarshanReport(self.ior_log_path,
+                                       read_all=True,
+                                       lookup_name_records=True) as report:
+                pass
```
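
For orientation, here is a rough sketch of how the surrounding asv benchmark class is presumably laid out; the `params` list mirrors the parameter values in the asv output above, but the log filename and `setup()` details are assumptions rather than a copy of the actual file.

```python
# Assumed shape of the RetrieveLogData benchmark class; asv calls setup()
# before each peakmem_* method and reports the process's peak resident
# memory for that call.
from darshan.backend import cffi_backend
from darshan.log_utils import get_log_path


class RetrieveLogData:
    # parameter values mirror those shown in the asv output above
    params = ["LUSTRE", "H5F", "STDIO", "MPI-IO", "POSIX",
              "DXT_POSIX", "DXT_MPIIO", "H5D"]
    param_names = ["module"]

    def setup(self, module):
        # "ior_hdf5_example.darshan" is a placeholder name; the real suite
        # resolves its log via get_log_path()
        self.ior_log_path = get_log_path("ior_hdf5_example.darshan")

    def peakmem_log_get_record(self, module):
        for i in range(100000):
            log = cffi_backend.log_open(self.ior_log_path)
            rec = cffi_backend.log_get_record(log, module)
            cffi_backend.log_close(log)
```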

@tylerjereddy force-pushed the treddy_peakmem_bench_leaks branch from 4c0f192 to efd6123 on October 19, 2022, 15:28. The follow-up commits:

* more reasonable memory footprint for `peakmem_log_get_mounts()`
* add regression guard for darshan-hpcgh-824
@tylerjereddy (Collaborator, Author):

With the latest revisions pushed, I think I've dealt with my own concerns at least; the extra regression case is nicely covered, and we now stay within roughly a laptop's memory footprint:

```
       before           after         ratio
     [b6c8599e]       [fb6bb275]
     <treddy_memleak_cffi_only_pt2~1>       <main>    
-            240M            76.2M     0.32  cffi_backend.RetrieveLogData.peakmem_log_get_record('LUSTRE')
-            241M            76.3M     0.32  cffi_backend.RetrieveLogData.peakmem_log_get_record('H5F')
-            257M            76.2M     0.30  cffi_backend.RetrieveLogData.peakmem_log_get_record('STDIO')
-            289M            76.1M     0.26  cffi_backend.RetrieveLogData.peakmem_log_get_record('MPI-IO')
-            304M            76.2M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('POSIX')
-            304M            75.5M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('DXT_MPIIO')
-            304M            75.3M     0.25  cffi_backend.RetrieveLogData.peakmem_log_get_record('DXT_POSIX')
-            324M            76.2M     0.24  cffi_backend.RetrieveLogData.peakmem_log_get_record('H5D')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('DXT_MPIIO')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('DXT_POSIX')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('H5D')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('H5F')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('POSIX')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('STDIO')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('MPI-IO')
-           2.57G             102M     0.04  cffi_backend.RetrieveLogData.peakmem_log_get_name_recs('LUSTRE')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('DXT_POSIX')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('MPI-IO')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('H5F')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('LUSTRE')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('DXT_MPIIO')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('STDIO')
-           2.44G            75.6M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('H5D')
-           2.44G            75.5M     0.03  cffi_backend.RetrieveLogData.peakmem_log_get_mounts('POSIX')

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.
PERFORMANCE INCREASED.
```

@tylerjereddy (Collaborator, Author):

We may want to open an issue to remind us to add a log with runtime HEATMAP available for this benchmarking in the future, if/when it becomes easier to consume the logs repo into the (memory) benchmark suite.

@shanedsnyder (Contributor):

Looks great, thanks @tylerjereddy.

Should we add a test case for darshan_log_get_modules(), too? I think that's the only one of the 4 originally reported leaks not yet accounted for.

> One other thing worth noting is that, because the Python build system doesn't drive the building of the C code, the benchmarks are stuck running on a single version of the compiled C code. I think I've brought this up before, but that's a bigger problem for another discussion.

So, I guess that means the results you shared only consider leaks that can be resolved at the Python level, and that we don't have an easy way to do before/after comparisons with C library changes? I agree that's not ideal, as these leaks were addressed with both C- and Python-level changes. Feel free to defer that to another issue, particularly if you think there are some mechanisms available to help with that.

@tylerjereddy (Collaborator, Author):

> Should we add a test case for darshan_log_get_modules()?

I think I have that covered, no? log_get_record() calls log_get_generic_record(), which calls log_get_modules(), which calls libdutil.darshan_log_get_modules().
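
(For reference, if we ever did want a dedicated guard anyway, a minimal sketch could look like the following; it assumes `cffi_backend.log_get_modules()` takes an open log handle, as the call chain above implies, and it is not part of this PR.)

```python
# Hypothetical dedicated benchmark, not included in this PR: intended as a
# method on RetrieveLogData, with cffi_backend imported as in the existing
# benchmark module. log_get_modules is already exercised indirectly through
# peakmem_log_get_record().
def peakmem_log_get_modules(self, module):
    for i in range(10000):
        log = cffi_backend.log_open(self.ior_log_path)
        mods = cffi_backend.log_get_modules(log)
        cffi_backend.log_close(log)
```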

> Feel free to defer that to another issue, particularly if you think there are some mechanisms available to help with that.

The asv community is collecting feature requests; we could add one at airspeed-velocity/asv#1219. The current assumption there is that you're dealing with a Python project, so you'd need something like pydarshan as a separate project that, e.g., vendors or submodules the darshan-util C code.

Assuming that would be super annoying to do, we could instead have a pip-friendly build system that builds darshan-util and links it in, so that pydarshan becomes more Python-ecosystem friendly. But I've been hesitant to do that so far, since we'd end up with a second, nested build system, which feels like it would get confusing quickly.

@shanedsnyder (Contributor):

> Should we add a test case for darshan_log_get_modules()?

> I think I have that covered, no? log_get_record() calls log_get_generic_record(), which calls log_get_modules(), which calls libdutil.darshan_log_get_modules().

Oh yeah, that's right. I forgot that call was already embedded in the path for other things, so it should be covered.

> Feel free to defer that to another issue, particularly if you think there are some mechanisms available to help with that.

> The asv community is collecting feature requests; we could add one at airspeed-velocity/asv#1219. The current assumption there is that you're dealing with a Python project, so you'd need something like pydarshan as a separate project that, e.g., vendors or submodules the darshan-util C code.

> Assuming that would be super annoying to do, we could instead have a pip-friendly build system that builds darshan-util and links it in, so that pydarshan becomes more Python-ecosystem friendly. But I've been hesitant to do that so far, since we'd end up with a second, nested build system, which feels like it would get confusing quickly.

I see, yeah, that does sound tricky (at least to me). I don't think it's particularly high priority, either, so I'm fine with just punting for now.

Successfully merging this pull request may close: TST, BENCH: memory profiling/testing (#817)