
[TTIR][TTNN] MLIR compiler locations #1745

Open
sdjordjevicTT opened this issue Jan 10, 2025 · 7 comments

@sdjordjevicTT
Contributor

Create a hierarchy of MLIR locations during the passes that decompose and convert MLIR operations, enabling tracing back through the graphs for easier debugging.

@tapspatel
Contributor

Design Considerations

  • Every child op needs a way to know who its parents are
  • Every parent op needs a way to know who its children are
  • We need to know the lowest-level ttnn operation that produces the result of any higher-level dialect operation
    For example, if dialect.op1 decomposes into [ttnn.subop1, ttnn.subop2, ttnn.deallocate, ttnn.subop3, ttnn.deallocate], we need to know which child op corresponds to the final computation result

What can passes do?

  • Passes can decompose an existing parent op into one or more child ops
  • Passes can fuse two or more existing parent ops into one or more child ops
  • Passes can add new ops (a child with no parent)
  • Passes can remove existing parent ops

Here is a potential location-naming design I was thinking about:
Operations

  • If a pass translates an op into one or more child ops, each of those child ops inherits the loc of the parent
  • If a pass fuses two or more existing parent ops into one or more child ops
    • If the parent ops have the same loc, the child ops inherit that same loc
    • If the parent ops have different locs, the child ops inherit a combination of all the parent locs
  • If a pass adds new ops, those ops will set their loc to some representation of the pass that added them (see the sketch after this list)
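
To make these rules concrete, here is a minimal C++ sketch of how a pass could set locations in each case, using the standard MLIR location attributes (NameLoc, FusedLoc, UnknownLoc). This is only an illustration; the helper names are made up and are not existing tt-mlir code.

#include "mlir/IR/Builders.h"
#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Location.h"
#include "mlir/IR/Operation.h"
#include "llvm/ADT/STLExtras.h"

using namespace mlir;

// Case 1: decomposition — every child op produced from a single parent
// reuses the parent's location.
void tagDecomposedChildren(Operation *parent, ArrayRef<Operation *> children) {
  for (Operation *child : children)
    child->setLoc(parent->getLoc());
}

// Case 2: fusion — children inherit the shared parent loc, or a FusedLoc
// combining all parent locs when they differ.
Location fusedParentLoc(MLIRContext *ctx, ArrayRef<Operation *> parents) {
  SmallVector<Location> locs;
  for (Operation *parent : parents)
    locs.push_back(parent->getLoc());
  if (llvm::all_equal(locs))
    return locs.front();
  return Builder(ctx).getFusedLoc(locs);
}

// Case 3: ops added from scratch by a pass carry a NameLoc with the pass name.
Location passAddedLoc(MLIRContext *ctx, StringRef passName) {
  return NameLoc::get(StringAttr::get(ctx, passName), UnknownLoc::get(ctx));
}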

Inputs

  • If an op is inserted for an input op, its loc will be set to the name of the input op

Outputs

  • If an op is inserted for an output op, its loc will be set to the name of the input op

Special Operations
The following special operations will always set their loc to the pass that adds them

  • ToDevice
  • FromDevice
  • Deallocate

As an example, suppose we currently had the following sigmoid.mlir ttir module in the compiler. Each op was generated in ttir and had a location set on it:

module {
  func.func @test_sigmoid(%arg0: tensor<128x128xf32>) -> tensor<128x128xf32> {
    %0 = tensor.empty() : tensor<128x128xf32>
    %1 = "ttir.sigmoid"(%arg0, %0) <{operandSegmentSizes = array<i32: 1, 1>}> : (tensor<128x128xf32>, tensor<128x128xf32>) -> tensor<128x128xf32>
    return %1 : tensor<128x128xf32>
  }
}

Its decomposition (with loc data) looks like the following:

[Image: decomposition of the sigmoid module with the current loc data]

The design proposal would change it to the following

[Image: decomposition with the proposed loc hierarchy]

@tapspatel
Contributor

I added some useful code here: tpatel/issue-1745

In golden_generator.py, you can run test_relu, which will generate an MLIR file from the Python infra (other ops are also pybinded). def print_module(module) will print the module, including the location data set within the module for each op. All passes have also been pybinded (see test_relu_decomp).

@tapspatel
Contributor

You can see all of the passes that are run on an MLIR file via:

ttmlir-opt --ttir-to-ttnn-backend-pipeline="system-desc-path=/code/tt-mlir/ttrt-artifacts/system_desc.ttsys" test_relu_ttir.mlir --dump-pass-pipeline     
Pass Manager with 12 passes:
builtin.module(ttir-to-ttir-decomposition,inline{default-pipeline=canonicalize inlining-threshold=4294967295 max-iterations=4 },ttir-load-system-desc{path=/code/tt-mlir/ttrt-artifacts/system_desc.ttsys},ttir-implicit-device{},ttir-broadcast-fold,ttnn-layout,convert-ttir-to-ttnn,remove-dead-values,ttnn-workaround{ttnn-enable-decomposition-workaround-pass=true ttnn-enable-layout-workaround-pass=true},canonicalize{  max-iterations=10 max-num-rewrites=-1 region-simplify=normal test-convergence=false top-down=true},ttnn-decompose-layouts,ttnn-deallocate)

All of these are pybinded into golden_generator.py to help see what the output of each pass is and what the op loc data looks like.

@odjuricicTT
Contributor

I'll add some requirements from the optimizer and tt-explorer side:

  1. Op locations should be unique.
    Optimizer overrides use op location strings to identify a specific op. This does not work if multiple ops have the same location.

  2. Frontends should be able to pass multiple levels of location. E.g., a llama op comes from "layer_1", "attention_module", "matmul_1". This info exists on forge-fe and is needed in order to be able to visualize the graph in tt-explorer (one possible encoding is sketched below).
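
As a rough illustration of the second requirement, the hierarchy could be encoded as a chain of nested NameLocs. This is a sketch only; the strings and helper name are made up, and this is not what forge-fe emits today.

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Location.h"

using namespace mlir;

// Sketch: encoding the frontend hierarchy "layer_1" / "attention_module" /
// "matmul_1" as nested NameLocs, outermost level first.
Location makeHierarchicalLoc(MLIRContext *ctx) {
  Location leaf = NameLoc::get(StringAttr::get(ctx, "matmul_1"));
  Location attn = NameLoc::get(StringAttr::get(ctx, "attention_module"), leaf);
  return NameLoc::get(StringAttr::get(ctx, "layer_1"), attn);
  // A consumer (e.g. tt-explorer) can walk getChildLoc() to recover each level.
}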

Though we might be stretching the usage of Locations for this, I'm open to using something different in the long run if it makes sense.

@sdjordjevicTT @tapspatel @azecevicTT

@sdjordjevicTT
Contributor Author

[Image: Zoom whiteboard snapshot]

Pasting the image from our zoom whiteboard.

@azecevicTT
Contributor

@tapspatel I've synced offline with @odjuricicTT, we agreed that IDs in location might be a bit of a stretch and that attributes that are added in some pass might be a better place for them.

Regarding your proposal, I will just go through it again with some implementation details, so we don't miss anything before implementation.

If a pass translates an op into one or more child ops, each of those child ops inherits the loc of the parent

This is a 'default' case that we have right now in most (if not all) places. In the case of decomposition, the result of the last op in the chain of new ops should always be the result of a decomposed op, so I believe there isn't ambiguity in this case.

If a pass fuses two or more existing parent ops into 1 or more children ops
-If the parent ops have the same loc, children ops inherit the same loc
-If the parent ops have different locs, children ops inherit a combination of all the parent locs

For the second point, we can use the built-in FusedLoc (https://mlir.llvm.org/docs/Dialects/Builtin/#fusedloc). The question that remains is whether the order in which the locations appear in the FusedLoc matters, i.e. for linear(a, b, c) = add(matmul(a, b), c) we would have loc(linear) = FusedLoc([loc(add), loc(matmul)]). If you want to trace it back, it seems that order is important, but it will be the same (or the reverse, depending on how you look at it) as the order of the original ops in the IR; a small sketch follows below.
For the first point, is this something that's functionally important for your use case, or can you still trace back with FusedLoc even when the locations of the parents are the same?
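
A minimal sketch of the linear example, assuming matmulOp and addOp are the two ops being fused; the relative order of the locations passed to the builder is preserved, so the pass controls whether it mirrors the original IR order. The helper name is illustrative.

#include "mlir/IR/Builders.h"
#include "mlir/IR/Location.h"
#include "mlir/IR/Operation.h"

using namespace mlir;

// Sketch: loc(linear) for linear(a, b, c) = add(matmul(a, b), c).
// Listing {loc(matmul), loc(add)} mirrors the order of the original ops in the IR.
Location makeLinearLoc(Operation *matmulOp, Operation *addOp) {
  Builder builder(matmulOp->getContext());
  return builder.getFusedLoc({matmulOp->getLoc(), addOp->getLoc()});
}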

If a pass adds new ops, ops will set their loc to some representation of the pass that added it

There is the built-in NameLoc (https://mlir.llvm.org/docs/Dialects/Builtin/#nameloc), which seems suitable for this case. We already use it in some other places as well, so it can become ambiguous; my proposal is to extend this class with something like PassOpsLoc (naming proposals are welcome), where we would set the name to the name of the pass that added the op, and childLoc to the loc that we are currently using. This way we can use RTTI to query information about the pass that added an op (sketched below). Can you confirm that this would be possible to do with the Python bindings?
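
For reference, here is a C++ sketch of the idea using plain NameLoc and the current MLIR casting API; whether the equivalent query is exposed through the Python bindings is exactly the open question above. The helper and pass names are illustrative, not existing tt-mlir code.

#include "mlir/IR/BuiltinAttributes.h"
#include "mlir/IR/Location.h"
#include "mlir/IR/Operation.h"
#include "llvm/Support/raw_ostream.h"

using namespace mlir;

// Sketch: wrap an op's current loc in a NameLoc whose name is the pass that
// added (or annotated) the op; "ttnn-deallocate" would be an example name.
Location wrapWithPassLoc(Operation *op, StringRef passName) {
  return NameLoc::get(StringAttr::get(op->getContext(), passName),
                      op->getLoc());
}

// Later, RTTI can recover which pass an op came from; getChildLoc() still
// carries whatever location the op had before the pass touched it.
void reportAddingPass(Operation *op) {
  if (auto nameLoc = dyn_cast<NameLoc>(op->getLoc()))
    llvm::outs() << "added by pass: " << nameLoc.getName().getValue() << "\n";
}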

Inputs
If an op is inserted for an input op, its loc will be set to the name of the input op
Outputs
If an op is inserted for an output op, its loc will be set to the name of the input op

Can you elaborate more on this? I'm not sure I'm getting the point here. Ultimately, the frontend that lowers to TTIR sets the 'starting' location.

Special Operations
The following special operations will always set their loc to the pass that adds it
ToDevice
FromDevice
Deallocate

Do we still have to consider them special if we add the aforementioned PassOpsLoc?

@tapspatel
Contributor

Discussed the questions with @azecevicTT offline.
