Make exegesis conversion script use appropriate register class #37

boomanaiden154 · 2024-01-27T08:41:33Z

This patch fixes the register class that the BHive to Exegesis conversion script uses. Currently it is only pulling in registers that don't have a REX encoding, which doesn't even include R9-R15. This patch fixes that behavior by using the more generic LLVM 64-bit GPR Register class, but specifically looking at registers that don't require a REX2 encoding, as there is no hardware in the wild yet that supports APX.

boomanaiden154 · 2024-01-27T08:42:41Z

This does still include RIP, but that's an upstream LLVM issue.

mtrofin · 2024-02-03T07:10:31Z

Are we missing testing coverage here?

boomanaiden154 · 2024-02-03T07:16:02Z

Are we missing testing coverage here?

There's no test coverage currently for this script. It would be best suited for lit/FileCheck based testing, but this is built with Bazel and it's less trivial to setup a lit test suite from within Bazel. It seems like Tensorflow has some infrastructure to do it (https://github.com/tensorflow/tensorflow/blob/master/tensorflow/compiler/mlir/glob_lit_test.bzl#L66) though if we want to add something eventually.

ondrasej · 2024-02-03T15:57:51Z

This does still include RIP, but that's an upstream LLVM issue.

Do we need to filter it out somewhere?

boomanaiden154 · 2024-02-03T20:49:47Z

Do we need to filter it out somewhere?

It's not strictly necessary as it'll just end up getting encoded as something different (I believe %RAX). I've removed it as it definitely is weird though. I was originally intending to fix it on the LLVM side as there is a FIXME there, but given changing that causes 1000+ test failures and I'm not really familiar with that area of the backend, I'm not particularly interested in looking at it currently.

ondrasej

One minor comment, otherwise it looks OK.

gematria/datasets/convert_bhive_to_llvm_exegesis_input.cc

ondrasej · 2024-02-04T23:25:56Z

It's not strictly necessary as it'll just end up getting encoded as something different (I believe %RAX). I've removed it as it definitely is weird though. I was originally intending to fix it on the LLVM side as there is a FIXME there, but given changing that causes 1000+ test failures and I'm not really familiar with that area of the backend, I'm not particularly interested in looking at it currently.

Gotcha. RIP can't be updated directly via MOV instructions (it doesn't have an index in the binary encoding), and IIRC it can only be read indirectly (via a LEA instruction or similar), and can only really be used in address computation.

I'm afraid that if we try to emit code that reads or writes RIP directly, weird things would happen (and LLVM would not catch them, because (IIRC) most of the validation happens on assembly parse time; onwards, the code just assumes that the MCInsts are correct, and does not check it.

boomanaiden154 · 2024-02-04T23:31:27Z

Gotcha. RIP can't be updated directly via MOV instructions (it doesn't have an index in the binary encoding), and IIRC it can only be read indirectly (via a LEA instruction or similar), and can only really be used in address computation.

I'm afraid that if we try to emit code that reads or writes RIP directly, weird things would happen (and LLVM would not catch them, because (IIRC) most of the validation happens on assembly parse time; onwards, the code just assumes that the MCInsts are correct, and does not check it.

Right. It can't be updated, using those instructions. Once they end up getting written out, the instruction encoding doesn't even attempt to write to %RIP and instead writes to a different register (usually %RAX from what I've seen) due to some detail with instruction encodings that I'm forgetting currently.

It's a completely harmless thing (at least for us given all the registers are set to the same value), but a bit weird.

There is a machine code verification step that runs within llvm-exegesis (at least if assertions are enabled), but it doesn't catch this for whatever reason (probably because the instructions are only verified to be using the appropriate register class and %RIP is in the GP64 register class...). Might be something to look into adding at some point and I think would fix some open issues in LLVM on top of refactoring the register classes to be cleaner. I don't think that would be a trivial effort though.

boomanaiden154 requested a review from ondrasej January 27, 2024 08:41

Don't output RIP

01e3061

ondrasej approved these changes Feb 4, 2024

View reviewed changes

gematria/datasets/convert_bhive_to_llvm_exegesis_input.cc Outdated Show resolved Hide resolved

Address reviewer feedback

4762adb

boomanaiden154 merged commit f782e79 into google:main Feb 5, 2024
6 checks passed

mtrofin mentioned this pull request Mar 26, 2024

Exeriments: add missing #include <iterator> mtrofin/gematria#1

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Make exegesis conversion script use appropriate register class #37

Make exegesis conversion script use appropriate register class #37

boomanaiden154 commented Jan 27, 2024

boomanaiden154 commented Jan 27, 2024

mtrofin commented Feb 3, 2024

boomanaiden154 commented Feb 3, 2024

ondrasej commented Feb 3, 2024

boomanaiden154 commented Feb 3, 2024

ondrasej left a comment

ondrasej commented Feb 4, 2024

boomanaiden154 commented Feb 4, 2024

Make exegesis conversion script use appropriate register class #37

Make exegesis conversion script use appropriate register class #37

Conversation

boomanaiden154 commented Jan 27, 2024

boomanaiden154 commented Jan 27, 2024

mtrofin commented Feb 3, 2024

boomanaiden154 commented Feb 3, 2024

ondrasej commented Feb 3, 2024

boomanaiden154 commented Feb 3, 2024

ondrasej left a comment

Choose a reason for hiding this comment

ondrasej commented Feb 4, 2024

boomanaiden154 commented Feb 4, 2024