[runtime] initial support for running model on device #8
Conversation
It seems like |
Nvm, it makes sense after taking a look. I think the only thing that's a bit confusing is that there are 2 sets of |
Looks great!!
Force-pushed from 8fb96a7 to 47bb5ff
- moves device-specific code from `tt_torch_device/` into `runtime/`
- adds `TTSystem` class (singleton) holding all info on present devices
- runs the MLIR compiler as a separate compile stage, which generates a flatbuffer binary at the end
- implements `CompiledModel` class for running inference on a compiled model
- `run_binary()` is the function that invokes the tt-mlir runtime (see the sketch below)

NOTE: with this commit, the following tests are passing:
- pybuda/test/test_api.py
- pybuda/test/mlir/test_ops.py::test_add
- pybuda/test/mlir/test_ops.py::test_multiply
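For orientation, here is a minimal sketch of how these pieces could fit together. Only the names `TTSystem`, `CompiledModel`, and `run_binary()` come from the description above; the method signatures, the `compile_to_flatbuffer()` helper, and the usage at the bottom are assumptions for illustration, not the actual pybuda API.

```python
# Hypothetical sketch only: signatures and helpers are assumptions, not the
# real pybuda/runtime API. Only TTSystem, CompiledModel, and run_binary()
# are named in the PR description.


class TTSystem:
    """Singleton holding info on all devices present on the host."""

    _instance = None

    @classmethod
    def get_system(cls):
        # Create the singleton lazily on first access.
        if cls._instance is None:
            cls._instance = cls()
        return cls._instance

    def __init__(self):
        # The real runtime would query tt-mlir for attached devices;
        # a placeholder list stands in here.
        self.devices = ["tt_device_0"]


def run_binary(binary, inputs):
    # Stand-in for the call into the tt-mlir runtime.
    raise NotImplementedError("dispatch flatbuffer binary to tt-mlir runtime")


class CompiledModel:
    """Wraps a flatbuffer binary produced by the MLIR compile stage."""

    def __init__(self, binary):
        self.binary = binary

    def __call__(self, *inputs):
        # Inference on a compiled model goes through run_binary();
        # the exact signature is assumed.
        return run_binary(self.binary, inputs)


# Assumed usage, with compile_to_flatbuffer() as a hypothetical name for the
# compile stage that emits the flatbuffer binary:
#   compiled = CompiledModel(compile_to_flatbuffer(torch_module))
#   outputs = compiled(*torch_inputs)
```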
Force-pushed from 5ac06d4 to 35b0f5c
Yeah, I was thinking of moving it under runtime, e.g. |