
[tensor-layout] implements LayoutAttr::getStride() #185

Merged: 1 commit into main on Jul 23, 2024

Conversation


@pilkicTT pilkicTT commented Jul 17, 2024

Initial implementation of stride calculation from the logical shape and the `LayoutAttr`. The strides of the tensors are stored in the Flatbuffer and are used by the frontend (FE) to allocate the tensors.

Needed for E2E scenario (issues: #19 #68)

Initial implementation of stride calculation from the logical shape
and the `LayoutAttr`. The strides of the tensors are stored in the
Flatbuffer and are used by FE to allocate the tensors.
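For intuition only, here is a minimal sketch of tightly packed, row-major stride computation from a logical shape. `rowMajorStrides` is a hypothetical helper for illustration, not the actual `LayoutAttr::getStride()` implementation:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical helper (illustration only): row-major strides, in elements,
// for a logical shape, with the innermost dimension contiguous.
std::vector<std::int64_t>
rowMajorStrides(const std::vector<std::int64_t> &shape) {
  std::vector<std::int64_t> strides(shape.size(), 1);
  // Each stride is the product of all dimensions to its right.
  for (std::int64_t i = static_cast<std::int64_t>(shape.size()) - 2; i >= 0;
       --i)
    strides[i] = strides[i + 1] * shape[i + 1];
  return strides;
}
```

For a logical shape of 2x3x4 this yields strides {12, 4, 1}, so an FE can allocate 2 * 12 = 24 elements up front and address element (i, j, k) at offset 12i + 4j + k.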
@pilkicTT pilkicTT requested a review from nsmithtt as a code owner July 17, 2024 13:38
@pilkicTT (Contributor, Author) commented:

@nsmithtt should we also modify ttrt to create tensors from stride?

// Evaluate the layout's affine expression at the logical-shape bounds;
// the resulting constant plus one is the physical extent of dimension i.
AffineExpr constantExpr = expr.replaceDims(logicalShapeExprs);
std::int64_t constant =
    llvm::cast<AffineConstantExpr>(constantExpr).getValue() + 1;
physicalShape[i] = constant;
Contributor commented:

This value needs to be aligned up to be a multiple of the respective grid dimension.

Contributor commented:

Note that the stride is 36 for the inner dim, for a logical shape of 35 parallelized across 2 cores.

[screenshot: tensor layout diagram, 2024-07-17]
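The 35-on-2-cores number can be reproduced with a simple align-up. `alignUp` is a hypothetical helper named here for illustration:

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the nearest multiple of align (hypothetical helper).
// A logical dim of 35 split across a grid of 2 cores pads up to 36, which
// is where the inner-dim stride of 36 in the screenshot comes from.
std::int64_t alignUp(std::int64_t value, std::int64_t align) {
  return ((value + align - 1) / align) * align;
}
```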

Contributor commented:

Also, the diagram above is a bit confusing: it calls "physical shape" something different from how you've defined it in your function here, but I think that's OK. I'm actually not sure that physical tensor shape, as the tool defines it, is useful; we can redefine it in the document as a follow-on, because I like your interpretation of physical shape better.

Contributor Author commented:

I think we are misaligned on what "stride" should mean. I've taken the plain definition of stride: a way to map from the logical space of a tensor into contiguous physical memory. In that context, the main use is that the FE can allocate the memory in advance and know how to interpret it.

Once cores are taken into account and there is no contiguous memory, I'm not sure how to interpret stride. In what context would that stride be used? Do we need different terminology for this?
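The "plain definition" described above can be sketched as a dot product of a logical index with the strides, giving a flat offset into contiguous memory. This is illustrative only; the helper name is made up:

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Map a logical index to a flat offset in contiguous memory via strides:
// offset = sum_i index[i] * strides[i]. Illustrative helper only.
std::int64_t linearOffset(const std::vector<std::int64_t> &index,
                          const std::vector<std::int64_t> &strides) {
  std::int64_t offset = 0;
  for (std::size_t i = 0; i < index.size(); ++i)
    offset += index[i] * strides[i];
  return offset;
}
```

With strides {4, 1} (a tightly packed 3x4 tensor), element (1, 2) lands at offset 1 * 4 + 2 = 6.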

Contributor commented:

I think we have the same definition of stride, but stride doesn't necessarily mean that the data must be tightly packed. For example, you might have 30 scalars per row, but want every tensor row to start on an address multiple of 32, this would mean you need to program a stride of 32, not 30.

Stride here does a bit of double duty, because it's constrained in such a way that it guarantees divisibility by the grid on the respective dimension. This is how we can infer how the shard shape should look when programming the memref. I should write additional documentation for this point, but it is touched on in the padding section: https://tenstorrent.github.io/tt-mlir/specs/tensor-layout.html#padding
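The 30-scalars-per-row, stride-32 example above, in numbers (the helper name and layout here are assumptions for illustration, not the compiler's actual addressing code):

```cpp
#include <cassert>
#include <cstdint>

// With 30 scalars per logical row but every row starting on a 32-element
// boundary, the programmed row stride is 32, not 30; element (row, col)
// lives at row * 32 + col, and the last 2 slots of each row are padding.
std::int64_t paddedRowOffset(std::int64_t row, std::int64_t col) {
  constexpr std::int64_t rowStride = 32; // padded stride, not the logical 30
  return row * rowStride + col;
}
```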

Contributor commented:

Was thinking about this a bit more this afternoon. All of my points are a bit moot until we have ttnn tensor layout support and the ability to do zero copy. For now you could program the stride pretty much however you want and it would work, because we have to bounce the tensor through a ttnn CPU tensor anyway.

We can check this in as is for now and once ttnn layout is in place I'll file an issue for zero copy support and then we'll need the strides to be tightly coupled to what the compiler and device expect.

@nsmithtt (Contributor) commented:

> @nsmithtt should we also modify ttrt to create tensors from stride?

No, it's OK; ttnn doesn't actually have full support for this style of layout yet. We'll update the runtime once their support lands.

@nsmithtt (Contributor) left a review:

Approving, we'll revisit stride programming when we can actually leverage it for zero copy / runtime stitching.

@staylorTT staylorTT linked an issue Jul 19, 2024 that may be closed by this pull request
@pilkicTT pilkicTT merged commit a46f4f1 into main Jul 23, 2024
3 checks passed
@pilkicTT pilkicTT deleted the pilkic/stride-impl branch July 23, 2024 16:14
vprajapati-tt added a commit that referenced this pull request Jul 29, 2024
commit 5ebfe37
Author: Sasa Vuckovic <[email protected]>
Date:   Mon Jul 29 16:42:29 2024 +0200

    Small refactor to ttnn-to-emitc pass (#229)

    * small refactor to ttnn-to-emitc pass

    * Remove unused constructor in DefaultOpConversionPattern

    * run pre-commit hooks that didn't get picked up

    * remove PassDetail.h

commit 60ad1a8
Author: Usman Aziz <[email protected]>
Date:   Mon Jul 29 10:29:46 2024 -0400

    Update flatbuffers dependency to allow bezel builds without python dependency. (#233)

commit 47a54b1
Author: Nikola Obradovic <[email protected]>
Date:   Mon Jul 29 12:45:28 2024 +0200

    [Optimizer] Setting up rough shape of optimizer pass. Expanding analysis. (#236)

commit 29e618e
Author: Nick Smith <[email protected]>
Date:   Sat Jul 27 15:24:27 2024 -0700

    embed sdk in ccache (#237)

commit 91e6b15
Author: Nick Smith <[email protected]>
Date:   Fri Jul 26 19:51:25 2024 -0700

    Fix mac build; link against MLIR dylib (#239)

commit d817a8f
Author: Radenko Pavlovic <[email protected]>
Date:   Fri Jul 26 14:00:45 2024 +0200

    Change TOSA->TTIR to use dialect conversion API (#230)

    Fixes #178

    Added conversion lib that implements conversion from TOSA to TTIR.
    Did not add new coverage of TOSA ops, it still supports only add, mul,
    sub, geq ops. Type conversion is still a no-op.

    Removed old conversion.

    Added MLIRTTConversions lib that links all conversion libs and is linked
    against TTMLIR.

commit 39ca8e2
Author: Nikola Obradovic <[email protected]>
Date:   Fri Jul 26 09:15:16 2024 +0200

    [Misc] Rename TTIR op interface to TTIROp. (#227)

commit 01c19bd
Author: Ognjen Djuricic <[email protected]>
Date:   Thu Jul 25 09:54:25 2024 +0200

    Change ttir->ttnn to use dialect conversion api (#177) (#221)

commit 1fcda08
Author: Nick Smith <[email protected]>
Date:   Wed Jul 24 12:12:42 2024 -0700

    Add `tt.device` spec #91 (#213)

commit a2f8e01
Author: Predrag Ilkic <[email protected]>
Date:   Wed Jul 24 19:49:25 2024 +0200

    [docs] fix build on linux (#224)

    It looks like the `cp -r` command behaves differently on linux vs macOS.

    `cp -r dir/ new_dir/` will copy contents of the `dir/` into `new_dir/`
    on macOS, but on Linux, it will copy the directory, so you would end
    up with `new_dir/dir/`.

    Hence, changing the command to be:
    `cp -r dir/* new_dir/` which works the same on both platforms.

    Updating the docs for building docs as well.

commit 7345e48
Author: Milan Topalovic <[email protected]>
Date:   Wed Jul 24 12:11:54 2024 +0200

    Adding translation from ttnn to flatbuffer (#216)

    We can now use ttmlir-translate tool to translate from TTNN IR to flatbuffer. Example:

    `./build/bin/ttmlir-opt --ttir-to-ttnn-backend-pipeline test/ttmlir/Dialect/TTNN/simple_matmul.mlir | ./build/bin/ttmlir-translate --ttnn-to-flatbuffer -o out.ttnn`

    To get a flatbuffer from tt-forge, we call `std::vector<uint8_t> ttnnToFlatbuffer(Operation *op)`.

    resolves #120

commit 5cc95a4
Author: Stefan Djordjevic <[email protected]>
Date:   Wed Jul 24 10:25:56 2024 +0200

    Rewriting pipelines and transformation passes headers in LLVM style (#219)

commit a46f4f1
Author: Predrag Ilkic <[email protected]>
Date:   Tue Jul 23 18:14:09 2024 +0200

    [tensor-layout] implements LayoutAttr::getStride() (#185)

    Initial implementation of stride calculation from the logical shape
    and the `LayoutAttr`. The strides of the tensors are stored in the
    Flatbuffer and are used by FE to allocate the tensors.

commit 6841e4d
Author: Nikola Obradovic <[email protected]>
Date:   Tue Jul 23 09:50:31 2024 +0200

    [Optimizer] Per op grid overrides. (#206)

commit 2574010
Author: Radenko Pavlovic <[email protected]>
Date:   Tue Jul 23 09:39:15 2024 +0200

    Eltwise interface and builders (#214)

    Fixes #110

commit f2c8b0b
Author: Jackson Nie <[email protected]>
Date:   Mon Jul 22 16:02:14 2024 -0400

    Add runtime gtest and linker fix (#209)

    * Initial cmake changes for runtime gtest infra

    * Update instantiation of TTNN_LIBRARY

    * Fix linker error on fresh clone

commit ccbcfbf
Author: Nick Smith <[email protected]>
Date:   Mon Jul 22 06:58:15 2024 -0700

    Debugging python (#211)

commit 25bb6ae
Author: Vladimir Milosevic <[email protected]>
Date:   Mon Jul 22 10:21:20 2024 +0200

    Group workflows (#205)

    Organize workflows by grouping them in the parent workflow

    - adding top-level workflow 'on-pr-and-push-to-main' to group workflows
    - adding workflow dispatch to workflows to enable manual runs
    - added runtime on/off to cache key
    - update action versions
    - refactor build.yml matrix to dictionary

commit 704905a
Author: Muhammad Asif Manzoor <[email protected]>
Date:   Fri Jul 19 11:14:14 2024 -0400

    Add greater or equal op end to end (#200)

commit fc4f4ac
Author: Vladimir Milosevic <[email protected]>
Date:   Fri Jul 19 09:52:54 2024 +0200

    Show and upload test report (#195)

    Display test summary and upload artifacts

commit 4fdde56
Author: Nick Smith <[email protected]>
Date:   Thu Jul 18 12:36:53 2024 -0700

    Add Code of Conduct (#199)

commit 107fd54
Author: Jackson Nie <[email protected]>
Date:   Thu Jul 18 15:09:56 2024 -0400

    Revert "Runtime gtest infrastructure (#122)" (#198)

    This reverts commit 1a581df.

commit 1a581df
Author: Jackson Nie <[email protected]>
Date:   Thu Jul 18 13:53:32 2024 -0400

    Runtime gtest infrastructure (#122)

    * Initial cmake changes for runtime gtest infra

    * Update instantiation of TTNN_LIBRARY
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

Successfully merging this pull request may close these issues.

Adopt new device runtime
3 participants