New and Improved MapFusion #1629

Open
wants to merge 125 commits into
base: main
Changes from 121 commits
aa433fe
Started with a first version of the map fusion stuff.
philip-paul-mueller Aug 22, 2024
71a88a1
Made some stylistic modifications to the code.
philip-paul-mueller Aug 23, 2024
bc87ddb
Added a function for estimating if something is pointwise.
philip-paul-mueller Aug 23, 2024
497a2d6
Now there is an error in the actual rewiring stuff.
philip-paul-mueller Aug 23, 2024
9e36447
Fixed a bug in the map fusion.
philip-paul-mueller Aug 23, 2024
7a48e0d
Made some formatting changes.
philip-paul-mueller Aug 23, 2024
d609045
Updated the tests of the map fusion.
philip-paul-mueller Aug 23, 2024
52c4542
WIP: Started with a renamer function.
philip-paul-mueller Aug 23, 2024
3b758bf
Continued with the parallel fusion stuff.
philip-paul-mueller Aug 28, 2024
377b428
The fusion transformation now also checks if there is a write conflic…
philip-paul-mueller Aug 28, 2024
db4864b
Updated some tests.
philip-paul-mueller Aug 28, 2024
f395acd
Fixed an error. I should refactor that damn loop.
philip-paul-mueller Aug 28, 2024
b1ab95e
Some improvements to the tests.
philip-paul-mueller Aug 28, 2024
945ca8f
Removed some debugging stuff.
philip-paul-mueller Aug 29, 2024
940b9b6
Fixed some typing stuff.
philip-paul-mueller Aug 29, 2024
ecae361
Started with a better implementation for the data dependency test.
philip-paul-mueller Aug 29, 2024
64d07fd
First version of the pointwise checker in the map fusion.
philip-paul-mueller Aug 29, 2024
33a0edf
Updated some test cases.
philip-paul-mueller Aug 29, 2024
dbb989e
Changed how the `_find_subsets()` works in the dependency tests.
philip-paul-mueller Aug 29, 2024
e142881
Updated the map fusion's partitioning function.
philip-paul-mueller Aug 30, 2024
ff018f4
The shared data cache can not be dumped.
philip-paul-mueller Aug 30, 2024
ec6339a
Reworked how the serial fusion adjusts Memlets.
philip-paul-mueller Aug 30, 2024
9267ea9
Buffer tiling now finally works.
philip-paul-mueller Aug 30, 2024
fc2db8a
The Mapreduce now also works.
philip-paul-mueller Aug 30, 2024
4d9f11d
Added a test to the map fusion stuff that ensures that the shared blo…
philip-paul-mueller Aug 30, 2024
2b91465
Added a test for the indirect accesses case.
philip-paul-mueller Aug 30, 2024
73f4415
Updated the heat 3d test. It now ensures that the fusion is now done.
philip-paul-mueller Aug 30, 2024
94ecd19
Fixed an error in the parallel map fusion.
philip-paul-mueller Aug 30, 2024
3f3f8a3
Added a test for the parallel map fusion transformations.
philip-paul-mueller Aug 30, 2024
c23ed39
Fixed improper cycle detection.
philip-paul-mueller Aug 30, 2024
dad61cb
Modified how the pre exit Memlet (the Memlet that writes in the new i…
philip-paul-mueller Sep 2, 2024
a57ebb3
Modified how the partition function works.
philip-paul-mueller Sep 2, 2024
439d6f3
Modified the map fusion tests.
philip-paul-mueller Sep 2, 2024
13c80ec
More towards bug compatibility.
philip-paul-mueller Sep 2, 2024
12a5cf7
The `ssa_sdfg` parameter was not named properly.
philip-paul-mueller Sep 2, 2024
995ef4d
Auto optimize now uses map fusion with a strict dataflow.
philip-paul-mueller Sep 2, 2024
dc4ed31
Fixed the classification function for shared transient.
philip-paul-mueller Sep 2, 2024
eb48391
Ensure that we have one state in the fusion test.
philip-paul-mueller Sep 2, 2024
d53302d
Fixed a problem in the classification of the shared transients.
philip-paul-mueller Sep 2, 2024
3f54e1f
Updated the rw conflict code a little bit.
philip-paul-mueller Sep 2, 2024
0f46c1c
Updated the `SDFGState._read_and_write_set()` function.
philip-paul-mueller Sep 4, 2024
0585112
Extend the range of `strict_dataflow`.
philip-paul-mueller Sep 4, 2024
61160e4
Fixed a bug in the shared mode handling.
philip-paul-mueller Sep 4, 2024
584635a
Misc changes to the serial fusion.
philip-paul-mueller Sep 4, 2024
b2dea1d
Fixed a wrong test.
philip-paul-mueller Sep 4, 2024
fba6682
Removed the strange nested SDFG check.
philip-paul-mueller Sep 4, 2024
62b0288
Made some modifications to the map fusion stuff.
philip-paul-mueller Sep 4, 2024
8858021
Refined my fix in the `_read_and_write_set()` function.
philip-paul-mueller Sep 5, 2024
09deb95
Made some fixes to the _read_and_write_sets function.
philip-paul-mueller Sep 5, 2024
1e14f26
Updated the _read_and_write_sets() function.
philip-paul-mueller Sep 5, 2024
92c7097
Added a new test for the SDFG fusion.
philip-paul-mueller Sep 5, 2024
041b3da
Fixed how the interstate stuff works.
philip-paul-mueller Sep 5, 2024
5556bd3
Fixed an error.
philip-paul-mueller Sep 5, 2024
e3461c3
How did that thing even run.
philip-paul-mueller Sep 5, 2024
6f5edc5
Changed the default value of the `strict_dataflow` flag (the compabil…
philip-paul-mueller Sep 5, 2024
35a5426
Fixed the _read_and_write_sets().
philip-paul-mueller Sep 5, 2024
20b728e
Updated `_read_and_write_sets()`.
philip-paul-mueller Sep 5, 2024
5dd9c6b
Refined the test in the memlet extension of _read_and_write_sets.
philip-paul-mueller Sep 5, 2024
617fb8f
Now the `transformations/move_loop_into_map_test.py::MoveLoopIntoMapT…
philip-paul-mueller Sep 5, 2024
04cadbe
If there are no in edges, then we should not add it.
philip-paul-mueller Sep 5, 2024
d8b3547
Removed some stray view calls.
philip-paul-mueller Sep 5, 2024
8fed4fe
Updated some checks in the _read_and_write_sets function.
philip-paul-mueller Sep 6, 2024
567f459
Fixed the bug in `_read_and_write_sets()` that made `tests/numpy/ufun…
philip-paul-mueller Sep 6, 2024
9cad08d
Added a test to back my claims I did in `567f459e`.
philip-paul-mueller Sep 6, 2024
866d815
Added a new test for map fusion that tests indirect accesses.
philip-paul-mueller Sep 6, 2024
a5846b5
Reworked the serial map fusion and related files.
philip-paul-mueller Sep 6, 2024
7ccdd9c
Made a rename.
philip-paul-mueller Sep 6, 2024
d4041b7
Applied the formatting.
philip-paul-mueller Sep 6, 2024
a444992
Merge branch 'master' into new-map-fusion
philip-paul-mueller Sep 9, 2024
a023f7c
Updated the comment about the wrong filter check in `SDFGState._read_…
philip-paul-mueller Sep 9, 2024
5c49eee
Removed the wrong check in `SDFGState._read_and_write_sets()`, see al…
philip-paul-mueller Sep 9, 2024
63e78c9
Had to reenable the check in `SDFGState._read_and_write_sets()` is di…
philip-paul-mueller Sep 9, 2024
896ac68
Modified the `shared_data` attribute of the `MapFusionHelper`.
philip-paul-mueller Sep 11, 2024
33f9fdd
Merge remote-tracking branch 'spcl/master' into new-map-fusion
philip-paul-mueller Sep 11, 2024
fcffb22
This compute offset function seems to solve all my problems.
philip-paul-mueller Sep 12, 2024
0ddb3c2
Added a test for the special case.
philip-paul-mueller Sep 12, 2024
05ffee4
Did some cleanup.
philip-paul-mueller Sep 12, 2024
6de85c7
Merge remote-tracking branch 'spcl/master' into new-map-fusion
philip-paul-mueller Sep 12, 2024
8c86662
Specified how the corrector function of the offsets works.
philip-paul-mueller Sep 13, 2024
dfc92e7
Merge branch 'master' into new-map-fusion
philip-paul-mueller Sep 23, 2024
11a3167
Updated some comments.
philip-paul-mueller Sep 26, 2024
44cf6ad
Added more comments.
philip-paul-mueller Sep 26, 2024
914d67b
Merge branch 'main' into new-map-fusion
philip-paul-mueller Oct 31, 2024
5e25816
Removed the parallel map fusion transformation.
philip-paul-mueller Nov 1, 2024
259d17c
Added a new test.
philip-paul-mueller Nov 1, 2024
db26320
Fixed a missing include.
philip-paul-mueller Nov 1, 2024
fa67492
Revert "Added a new test."
philip-paul-mueller Nov 1, 2024
3453c6c
It seems that I have removed a test.
philip-paul-mueller Nov 1, 2024
90731af
Realized that I can not use `SDFG.shared_transient()` for detection i…
philip-paul-mueller Nov 1, 2024
f659cd8
Removed the specification of the intermediate.
philip-paul-mueller Nov 15, 2024
244e3ea
Merge remote-tracking branch 'spcl/main' into new-map-fusion
philip-paul-mueller Dec 3, 2024
11a509e
Updated the strict dataflow mode.
philip-paul-mueller Dec 3, 2024
724d1f5
No longer explicitly specify strict dataflow mode.
philip-paul-mueller Dec 3, 2024
10ae1e2
Fixed a typo that was introduced in commit e1daf32fc8.
philip-paul-mueller Dec 4, 2024
d909379
Forgot to make strict dataflow the default.
philip-paul-mueller Dec 4, 2024
cc7324b
Added a new check to the map fusion in case of shared intermediates.
philip-paul-mueller Dec 4, 2024
6429d91
Changed the test.
philip-paul-mueller Dec 4, 2024
3d1cd9e
Realized that there is no problem with a data race.
philip-paul-mueller Dec 4, 2024
a412394
Added more tests to the map fusion.
philip-paul-mueller Dec 4, 2024
fa21bd3
Added a new test for the map fusion.
philip-paul-mueller Dec 5, 2024
6e13941
Merge remote-tracking branch 'spcl/main' into new-map-fusion
philip-paul-mueller Dec 5, 2024
d8da3c6
WIP: Started with implementing Phil's suggestions.
philip-paul-mueller Dec 13, 2024
e2285f0
Made some modification, time to save.
philip-paul-mueller Dec 16, 2024
4832c3c
Updated the map fusion test a little bit.
philip-paul-mueller Dec 16, 2024
fd3b48a
Added a test for the next generation of MapFusion.
philip-paul-mueller Dec 16, 2024
8fa7cb2
Added a new test.
philip-paul-mueller Dec 16, 2024
9139149
The test `test_fusion_with_nested_sdfg_0` is now explicitly construct…
philip-paul-mueller Dec 16, 2024
0a5aeaf
Allowed that consumer edge in MapFusion are dynamic.
philip-paul-mueller Dec 16, 2024
e2bc10d
Changed the doc string to the Sphinx one.
philip-paul-mueller Dec 16, 2024
aa3619f
Fixed some missing test.
philip-paul-mueller Dec 16, 2024
a740d16
Updated the description of the transformation.
philip-paul-mueller Dec 16, 2024
d07e2c5
Added a new test.
philip-paul-mueller Dec 16, 2024
fbc8469
Merge remote-tracking branch 'spcl/main' into new-map-fusion
philip-paul-mueller Dec 17, 2024
5c354c6
Fixed an iteration bug.
philip-paul-mueller Dec 17, 2024
abf739c
Fixed the problem in the heat test.
philip-paul-mueller Dec 17, 2024
2b17111
Added a new test.
philip-paul-mueller Dec 17, 2024
1e8f66f
Merge remote-tracking branch 'spcl/main' into new-map-fusion
philip-paul-mueller Dec 17, 2024
3ab46d8
Added a flag to MapFusion that allows to consider everything as shared.
philip-paul-mueller Dec 17, 2024
b1fc9d1
Updated how the memlet adjustment works, this should be a bit more li…
philip-paul-mueller Dec 17, 2024
fdc6424
Added a new test to check the memlet update.
philip-paul-mueller Dec 17, 2024
e2c41b5
Merge branch 'main' into new-map-fusion
phschaad Jan 13, 2025
bcaed23
Centralized the map fusion call in the testing.
philip-paul-mueller Jan 17, 2025
243611d
Added a test that ensures that no cycles would be created.
philip-paul-mueller Jan 17, 2025
c53f939
Found a case that was not handled.
philip-paul-mueller Jan 17, 2025
a3842f9
Added more tests to the map fusion and refined some others.
philip-paul-mueller Jan 17, 2025
4 changes: 2 additions & 2 deletions dace/subsets.py
@@ -80,7 +80,7 @@ def covers(self, other):
         # Subsets of different dimensionality can never cover each other.
         if self.dims() != other.dims():
             return ValueError(
-                f"A subset of dimensionality {self.dim()} cannot test covering a subset of dimensionality {other.dims()}"
+                f"A subset of dimensionality {self.dims()} cannot test covering a subset of dimensionality {other.dims()}"
             )

         if not Config.get('optimizer', 'symbolic_positive'):
@@ -106,7 +106,7 @@ def covers_precise(self, other):
         # Subsets of different dimensionality can never cover each other.
         if self.dims() != other.dims():
             return ValueError(
-                f"A subset of dimensionality {self.dim()} cannot test covering a subset of dimensionality {other.dims()}"
+                f"A subset of dimensionality {self.dims()} cannot test covering a subset of dimensionality {other.dims()}"
             )

         # If self does not cover other with a bounding box union, return false.
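
The one-character change above only touches the error message, but the message is an f-string that is evaluated as soon as the dimensionality check fails. A minimal sketch of the effect, assuming that `dim` is not defined on the subset class (which the fix suggests but this page does not confirm). `Box` is a hypothetical stand-in, not DaCe's subset type, and it mirrors the `return ValueError(...)` seen in the context lines.

```python
class Box:
    """Hypothetical stand-in for a subset type; not DaCe's actual class."""

    def __init__(self, *extents):
        self.extents = extents

    def dims(self):
        return len(self.extents)

    def covers_before(self, other):
        if self.dims() != other.dims():
            # Typo: `dim` does not exist on Box, so building the message fails first.
            return ValueError(f"A subset of dimensionality {self.dim()} cannot test "
                              f"covering a subset of dimensionality {other.dims()}")
        return all(a >= b for a, b in zip(self.extents, other.extents))

    def covers_after(self, other):
        if self.dims() != other.dims():
            # Mirrors the diff: the exception object is returned, not raised.
            return ValueError(f"A subset of dimensionality {self.dims()} cannot test "
                              f"covering a subset of dimensionality {other.dims()}")
        return all(a >= b for a, b in zip(self.extents, other.extents))


def describe(call):
    """Report the type name of the result, or the AttributeError that was hit."""
    try:
        return type(call()).__name__
    except AttributeError as exc:
        return f"AttributeError: {exc}"


a, b = Box(4, 4), Box(2)
before = describe(lambda: a.covers_before(b))
after = describe(lambda: a.covers_after(b))
print(before)
print(after)
```

Under that assumption, the old spelling surfaces as an unrelated `AttributeError`, while the corrected spelling produces the intended `ValueError` object.
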
10 changes: 8 additions & 2 deletions dace/transformation/auto/auto_optimize.py
@@ -58,7 +58,10 @@ def greedy_fuse(graph_or_subgraph: GraphViewType,
         # If we have an SDFG, recurse into graphs
         graph_or_subgraph.simplify(validate_all=validate_all)
         # MapFusion for trivial cases
-        graph_or_subgraph.apply_transformations_repeated(MapFusion, validate_all=validate_all)
+        graph_or_subgraph.apply_transformations_repeated(
+            MapFusion(strict_dataflow=True),
+            validate_all=validate_all,
+        )

         # recurse into graphs
         for graph in graph_or_subgraph.nodes():
@@ -76,7 +79,10 @@ def greedy_fuse(graph_or_subgraph: GraphViewType,
         sdfg, graph, subgraph = None, None, None
         if isinstance(graph_or_subgraph, SDFGState):
             sdfg = graph_or_subgraph.parent
-            sdfg.apply_transformations_repeated(MapFusion, validate_all=validate_all)
+            sdfg.apply_transformations_repeated(
+                MapFusion(strict_dataflow=True),
+                validate_all=validate_all,
+            )
             graph = graph_or_subgraph
             subgraph = SubgraphView(graph, graph.nodes())
         else:
8 changes: 7 additions & 1 deletion dace/transformation/dataflow/buffer_tiling.py
@@ -98,7 +98,13 @@ def apply(self, graph, sdfg):

         # Fuse maps
         some_buffer = next(iter(buffers)) # some dummy to pass to MapFusion.apply_to()
-        MapFusion.apply_to(sdfg, first_map_exit=tile_map1_exit, array=some_buffer, second_map_entry=tile_map2_entry)
+        MapFusion.apply_to(
+            sdfg,
+            first_map_exit=tile_map1_exit,
+            array=some_buffer,
+            second_map_entry=tile_map2_entry,
+            verify=True,
+        )

         # Optimize the simple cases
         map1_entry.range.ranges = [
2,175 changes: 1,682 additions & 493 deletions dace/transformation/dataflow/map_fusion.py

Large diffs are not rendered by default.

1 change: 1 addition & 0 deletions tests/buffer_tiling_test.py
@@ -78,6 +78,7 @@ def _semantic_eq(tile_sizes, program):

     count = sdfg.apply_transformations(BufferTiling, options={'tile_sizes': tile_sizes})
     assert count > 0
+    sdfg.validate()
     sdfg(w3=w3, w5=w5, A=A, B=B2, I=A.shape[0], J=A.shape[1])

     assert np.allclose(B1, B2)
3 changes: 2 additions & 1 deletion tests/npbench/polybench/correlation_test.py
@@ -83,7 +83,8 @@ def run_correlation(device_type: dace.dtypes.DeviceType):
     # Compute ground truth and validate result

     corr_ref = ground_truth(M, float_n_ref, data_ref)
-    assert np.allclose(corr, corr_ref)
+    diff = corr_ref - corr
+    assert np.abs(diff).max() <= 10e-10
     return sdfg
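
The new assertion is a purely absolute bound, whereas `np.allclose` with its default arguments accepts `|a - b| <= atol + rtol * |b|` with `rtol=1e-05` and `atol=1e-08`. The following is a dependency-free sketch of the two criteria on plain Python floats; the helper names and the sample values are illustrative only and do not come from the test. Note that the literal `10e-10` is the same number as `1e-9`.

```python
RTOL, ATOL = 1e-05, 1e-08  # documented defaults of numpy.allclose


def allclose_like(values, reference, rtol=RTOL, atol=ATOL):
    """Element-wise |v - r| <= atol + rtol * |r|, as in np.allclose."""
    return all(abs(v - r) <= atol + rtol * abs(r) for v, r in zip(values, reference))


def max_abs_diff_ok(values, reference, bound=10e-10):
    """The check used in the updated test: largest absolute error under a bound."""
    return max(abs(r - v) for v, r in zip(values, reference)) <= bound


reference = [1.0, 0.5, -0.25]
perturbed = [1.0 + 5e-7, 0.5, -0.25]  # hypothetical result with a small error

loose = allclose_like(perturbed, reference)    # passes: 5e-7 is within rtol
strict = max_abs_diff_ok(perturbed, reference)  # fails: 5e-7 exceeds 1e-9
exact = max_abs_diff_ok(reference, reference)   # passes: no error at all
print(loose, strict, exact)
```

For values of order one, the absolute bound is therefore roughly four orders of magnitude tighter than the previous default tolerance.
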


12 changes: 12 additions & 0 deletions tests/npbench/polybench/heat_3d_test.py
@@ -64,10 +64,22 @@ def run_heat_3d(device_type: dace.dtypes.DeviceType):
     A_ref = np.copy(A)
     B_ref = np.copy(B)

+    def count_maps(sdfg: dc.SDFG) -> int:
+        nb_maps = 0
+        for _, state in sdfg.all_nodes_recursive():
+            node: dc.SDFGState
+            for node in state.nodes():
+                if isinstance(node, dc.sdfg.nodes.MapEntry):
+                    nb_maps += 1
+        return nb_maps
+
     if device_type in {dace.dtypes.DeviceType.CPU, dace.dtypes.DeviceType.GPU}:
         # Parse the SDFG and apply auto-opt
         sdfg = heat_3d_kernel.to_sdfg()
+        initial_maps = count_maps(sdfg)
         sdfg = auto_optimize(sdfg, device_type)
+        after_maps = count_maps(sdfg)
+        assert after_maps < initial_maps, f"Expected less maps, initially {initial_maps} many maps, but after optimization {after_maps}"
         sdfg(TSTEPS, A, B, N=N)
     elif device_type == dace.dtypes.DeviceType.FPGA:
         # Parse SDFG and apply FPGA friendly optimization
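
The test pattern added here is "count the maps, optimize, count again, and require a strict decrease". A toy sketch of that pattern without DaCe follows; `MapEntry`, `State` and `fuse_adjacent_maps` are hypothetical stand-ins invented for this illustration and say nothing about how `auto_optimize` or MapFusion actually work.

```python
from dataclasses import dataclass, field


@dataclass
class MapEntry:
    """Hypothetical stand-in for a map entry node."""
    label: str


@dataclass
class State:
    """Hypothetical stand-in for a container of nodes; states may nest."""
    nodes: list = field(default_factory=list)


def count_maps(state):
    """Count map entries in a state and in every nested state."""
    total = 0
    for node in state.nodes:
        if isinstance(node, MapEntry):
            total += 1
        elif isinstance(node, State):
            total += count_maps(node)
    return total


def fuse_adjacent_maps(state):
    """Toy 'optimization': merge runs of consecutive map entries into one."""
    fused = []
    for node in state.nodes:
        if isinstance(node, State):
            fused.append(fuse_adjacent_maps(node))
        elif fused and isinstance(node, MapEntry) and isinstance(fused[-1], MapEntry):
            fused[-1] = MapEntry(fused[-1].label + "+" + node.label)
        else:
            fused.append(node)
    return State(fused)


program = State([MapEntry("a"), MapEntry("b"), State([MapEntry("c"), MapEntry("d")]), "tasklet"])
initial_maps = count_maps(program)
optimized = fuse_adjacent_maps(program)
after_maps = count_maps(optimized)
assert after_maps < initial_maps, f"expected fewer maps: {initial_maps} -> {after_maps}"
```

Asserting on the count rather than only on numerical results catches the case where the optimization silently stops applying while the output stays correct.
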
1 change: 1 addition & 0 deletions tests/npbench/polybench/jacobi_2d_test.py
@@ -47,6 +47,7 @@ def run_jacobi_2d(device_type: dace.dtypes.DeviceType):
         # Parse the SDFG and apply autopot
         sdfg = kernel.to_sdfg()
         sdfg = auto_optimize(sdfg, device_type)
+
         sdfg(A=A, B=B, TSTEPS=TSTEPS, N=N)

     elif device_type == dace.dtypes.DeviceType.FPGA:
2 changes: 1 addition & 1 deletion tests/python_frontend/fields_and_global_arrays_test.py
@@ -585,7 +585,7 @@ def caller():

     # Ensure only three globals are created
     sdfg = caller.to_sdfg()
-    assert len([k for k in sdfg.arrays if '__g' in k]) == 3
+    assert len([k for k in sdfg.arrays if k.startswith('__g')]) == 3


 def test_two_inner_methods():
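
The switch from a substring test to a prefix test matters whenever some other array name happens to contain `__g` in the middle or at the end. A small illustration with made-up names (none of them are taken from the actual SDFG):

```python
# Hypothetical array names: three globals plus two names that merely contain "__g".
names = ["__g_A", "__g_B", "__g_C", "__tmp__g0", "result__g"]

by_substring = [k for k in names if "__g" in k]        # old check
by_prefix = [k for k in names if k.startswith("__g")]  # new check

print(len(by_substring), len(by_prefix))
```

With these names the substring filter counts five entries, while the prefix filter counts only the three that start with `__g`.
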
9 changes: 7 additions & 2 deletions tests/transformations/apply_to_test.py
@@ -15,6 +15,7 @@ def dbladd(A: dace.float64[100, 100], B: dace.float64[100, 100]):
     dbl = B
     return A + dbl * B

+
 @dace.program
 def unfusable(A: dace.float64[100], B: dace.float64[100, 100]):
     """Test function of two maps that can not be fused."""
@@ -57,8 +58,12 @@ def test_applyto_pattern():
     transient = next(aname for aname, desc in sdfg.arrays.items() if desc.transient)
     access_node = next(n for n in state.nodes() if isinstance(n, dace.nodes.AccessNode) and n.data == transient)

-    assert MapFusion.can_be_applied_to(sdfg, first_map_exit=mult_exit, array=access_node, second_map_entry=add_entry)
-
+    assert MapFusion.can_be_applied_to(
+        sdfg,
+        first_map_exit=mult_exit,
+        array=access_node,
+        second_map_entry=add_entry
+    )
     MapFusion.apply_to(sdfg, first_map_exit=mult_exit, array=access_node, second_map_entry=add_entry)

     assert len([node for node in state.nodes() if isinstance(node, dace.nodes.MapEntry)]) == 1
31 changes: 29 additions & 2 deletions tests/transformations/mapfusion_data_races_test.py
@@ -41,6 +41,13 @@ def rw_data_race_3(A: dace.float64[20], B: dace.float64[20]):
     A[:10] += 3.0 * offset(A[:11])


+@dace.program
+def rw_data_race_4(A: dace.float64[20], B: dace.float64[20]):
+    # This is potentially fusable
+    A += B
+    A *= 2.0
+
+
 def test_rw_data_race():
     sdfg = rw_data_race.to_sdfg(simplify=True)
     sdfg.apply_transformations_repeated(MapFusion)
@@ -50,8 +57,9 @@ def test_rw_data_race():

 def test_rw_data_race_2_mf():
     sdfg = rw_data_race_2.to_sdfg(simplify=True)
-    sdfg.apply_transformations_repeated(MapFusion)
+    nb_applied = sdfg.apply_transformations_repeated(MapFusion)
     map_entry_nodes = [n for n, _ in sdfg.all_nodes_recursive() if isinstance(n, nodes.MapEntry)]
+    assert nb_applied > 0
     assert (len(map_entry_nodes) > 1)


@@ -69,8 +77,27 @@ def test_rw_data_race_3_sgf():
assert (len(map_entry_nodes) > 1)


def test_rw_data_race_3_mf():
sdfg = rw_data_race_3.to_sdfg(simplify=True)
nb_applied = sdfg.apply_transformations_repeated(MapFusion)
map_entry_nodes = [n for n, _ in sdfg.all_nodes_recursive() if isinstance(n, nodes.MapEntry)]
assert (len(map_entry_nodes) > 1)
assert nb_applied > 0


def test_rw_data_race_4_mf():
# It is technically possible to fuse it, because there is only a point wise dependency.
# However, it is very hard to detect and handle correct.
sdfg = rw_data_race_4.to_sdfg(simplify=True)
sdfg.apply_transformations_repeated(MapFusion)
map_entry_nodes = [n for n, _ in sdfg.all_nodes_recursive() if isinstance(n, nodes.MapEntry)]
assert (len(map_entry_nodes) >= 1)


if __name__ == "__main__":
test_rw_data_race()
test_rw_data_race_2_mf()
test_rw_data_race_2_sgf()
test_rw_data_race_2_mf()
test_rw_data_race_3_sgf()
test_rw_data_race_3_mf()
test_rw_data_race_4_mf()
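
The comment in `test_rw_data_race_4_mf` says fusion is technically possible because the dependency is only pointwise. A plain-Python sketch of what that means, using ordinary loops as stand-ins for maps: when iteration `i` of the second loop reads only what iteration `i` of the first loop wrote, fusing preserves the result; when it reads a neighbouring element, the fused loop sees a stale value. The shifted case is only loosely modelled on the offset reads in the tests, and all function names are invented for this illustration.

```python
def two_maps_pointwise(a, b):
    a = list(a)
    for i in range(len(a)):  # first map: A += B
        a[i] += b[i]
    for i in range(len(a)):  # second map: A *= 2.0
        a[i] *= 2.0
    return a


def fused_pointwise(a, b):
    a = list(a)
    for i in range(len(a)):  # fused map: both updates touch only a[i]
        a[i] += b[i]
        a[i] *= 2.0
    return a


def two_maps_shifted(a):
    a = list(a)
    for i in range(len(a)):  # first map writes every element
        a[i] += 1.0
    # second map reads the neighbour a[i + 1]
    return [3.0 * a[i + 1] for i in range(len(a) - 1)]


def fused_shifted(a):
    a = list(a)
    out = []
    for i in range(len(a) - 1):
        a[i] += 1.0
        out.append(3.0 * a[i + 1])  # a[i + 1] has not been updated yet
    return out


A = [1.0, 2.0, 3.0, 4.0]
B = [0.5, 0.5, 0.5, 0.5]
print(two_maps_pointwise(A, B), fused_pointwise(A, B))
print(two_maps_shifted(A), fused_shifted(A))
```

The pointwise pair agrees, while the shifted pair does not, which is why a fusion pass has to prove the access pattern is pointwise before merging such maps.
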