[tensor-layout] implements LayoutAttr::getStride() #185
Conversation
Initial implementation of stride calculation from the logical shape and the `LayoutAttr`. The tensor strides are stored in the flatbuffer and are used by the frontend (FE) to allocate the tensors. Needed for the E2E scenario (issues: #19 #68).
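For context, a minimal sketch of how a frontend might size an allocation from the stored shape and strides. This is illustrative only, not the actual FE code; it assumes strides are in elements and non-increasing from outermost to innermost dimension:

#include <cstdint>
#include <vector>

// Sketch: bytes needed for a tensor whose strides (in elements) are
// non-increasing from outer to inner. The outermost extent times its
// stride covers the whole (possibly padded) buffer.
std::int64_t allocationBytes(const std::vector<std::int64_t> &shape,
                             const std::vector<std::int64_t> &strides,
                             std::int64_t elemSizeBytes) {
  if (shape.empty())
    return elemSizeBytes; // scalar tensor
  return shape.front() * strides.front() * elemSizeBytes;
}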
@nsmithtt should we also modify
// Substitute the logical shape exprs for the dims, fold the expression
// to a constant, and add 1 to get the extent of physical dimension i.
AffineExpr constantExpr = expr.replaceDims(logicalShapeExprs);
std::int64_t constant =
    llvm::cast<AffineConstantExpr>(constantExpr).getValue() + 1;
physicalShape[i] = constant;
This value needs to be aligned up to be a multiple of the respective grid dimension.
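For illustration, aligning up to a multiple is the standard round-up trick; a minimal sketch (the helper name and `gridShape` are hypothetical, not the tt-mlir API):

// Round value up to the nearest multiple of alignment (alignment > 0).
std::int64_t alignUp(std::int64_t value, std::int64_t alignment) {
  return ((value + alignment - 1) / alignment) * alignment;
}
// e.g. alignUp(30, 8) == 32; the physical extent above would become
// alignUp(constant, gridShape[i]) for the respective grid dimension.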
Also, the above diagram is a bit confusing: it calls "physical shape" something different from how you've defined your function here, but I think that's OK. I'm actually not sure that the physical tensor shape, as the tool has defined it, is useful; we can redefine this in the document as a follow-on, because I like your interpretation of physical shape better.
I think we are misaligned on what "stride" should mean. I've taken the plain definition of stride as a way to map from the logical space of the tensor into contiguous physical memory. In that context, the main use would be that the FE can allocate the memory in advance and know how to interpret it.
Once cores are taken into account and there is no contiguous memory, I'm not sure how to interpret stride. What will the context be in which that stride is used? Do we need different terminology for this?
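Under that plain definition, tightly packed row-major strides look like the following sketch (the textbook computation, not the tt-mlir implementation):

#include <cstdint>
#include <vector>

// stride[i] = product of shape[i+1 .. n-1]; the flat offset for index
// (i0, ..., i(n-1)) is then sum over k of ik * stride[k].
std::vector<std::int64_t>
rowMajorStrides(const std::vector<std::int64_t> &shape) {
  std::vector<std::int64_t> strides(shape.size());
  std::int64_t running = 1;
  for (std::size_t i = shape.size(); i-- > 0;) {
    strides[i] = running;
    running *= shape[i];
  }
  return strides;
}
// e.g. shape {2, 3, 4} -> strides {12, 4, 1}.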
I think we have the same definition of stride, but stride doesn't necessarily mean that the data must be tightly packed. For example, you might have 30 scalars per row but want every tensor row to start on an address that is a multiple of 32; this would mean you need to program a stride of 32, not 30.
Stride here does do a bit of double duty, because it's constrained in such a way that it guarantees divisibility by the grid on the respective dimension. This is how we can infer what the shard shape should look like when programming the memref. I should write additional documentation for this point, but it is touched on in the padding section: https://tenstorrent.github.io/tt-mlir/specs/tensor-layout.html#padding
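To make the 30-vs-32 example concrete, a hedged sketch: with 30 scalars per row but a programmed row stride of 32, the buffer is not tightly packed, and a stride divisible by the grid dimension yields an integral shard extent. All names here are illustrative:

#include <cstdint>

// Flat offset with a padded row stride: rows occupy 32 elements each,
// of which only the first 30 hold data; the last 2 are padding.
std::int64_t flatOffset(std::int64_t row, std::int64_t col) {
  constexpr std::int64_t kRowStride = 32; // programmed stride, not 30
  return row * kRowStride + col;          // valid for col in [0, 30)
}

// Divisibility: if the stride is a multiple of the grid dimension,
// each shard's extent on that dimension is simply stride / gridDim.
std::int64_t shardExtent(std::int64_t stride, std::int64_t gridDim) {
  return stride / gridDim; // exact by construction
}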
Was thinking about this a bit more this afternoon. All of my points are a bit moot until we have ttnn tensor layout support and the ability to do zero copy. For now you could program the stride pretty much however you want and it would work, because we have to bounce the tensor through a ttnn CPU tensor anyway.
We can check this in as is for now; once ttnn layout is in place I'll file an issue for zero-copy support, and then the strides will need to be tightly coupled to what the compiler and device expect.
No, it's OK; ttnn doesn't actually have full support for this style of layout yet. We'll update the runtime once their support lands.
Approving, we'll revisit stride programming when we can actually leverage it for zero copy / runtime stitching.