Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add ARMv8.x feature flags #4489

Merged
merged 13 commits into from
Jul 23, 2024
Merged

Add ARMv8.x feature flags #4489

merged 13 commits into from
Jul 23, 2024

Conversation

steven-johnson
Copy link
Contributor

@steven-johnson steven-johnson commented Dec 27, 2019

This allows selecting the ARMv8.x architectures via a new Feature flag. We don't (yet) add any specialization to our codegen (beyond what LLVM will do for us under the hood).

(Note that we should also add similar feature for v9.x architectures, but that will come in a subsequent PR.)

This allows selecting the ARMv8.3a feature set via a new Feature flag. We don't (yet) add any specialization to our codegen (beyond what LLVM will do for us under the hood).
@steven-johnson steven-johnson requested review from abadams and removed request for abadams December 27, 2019 20:43
@steven-johnson
Copy link
Contributor Author

I think this needs a bit more tweaking before review.

@steven-johnson
Copy link
Contributor Author

So I suspect this is probably ok as-is, but given that it may have overlap with arm64e (see #4490), it probably makes sense to hold off landing it until we have a bit more clarity on that issue. (If anyone urgently needs it before then, speak up.)

@alexreinking alexreinking added this to the Eventually milestone Oct 19, 2020
@f0enix
Copy link

f0enix commented May 24, 2021

Hi, i am trying to compile AOT generators for ios arm64e architecture. I found this old PR and applied the changes to the latest branch on master (Note: i skipped the changes to halide.cmake because i could not find it on master)

It can generate the static libraries successfully:

denoise_fill_generator -g denoise_fill_generator -f fill_denoise -e static_library,h -o ./dest target=arm-64-ios-armv83a

But the generated library has architecture arm64 instead of arm64e

lipo -archs fill_denoise.a
arm64

Furthermore i checked the Mach header and it shows the wrong cpu subtype. it should be 2 instead of 0 for arm64e

otool -h fill_denoise.a
Mach header
      magic  cputype cpusubtype  caps    filetype ncmds sizeofcmds      flags
 0xfeedfacf 16777228          0  0x00           1     4        672 0x00002000

I also tried hardcoding the llvm triple in src/LLVM_Runtime_Linker.cpp to force arm64e as a quick hack:

modules[i]->setTargetTriple("arm64e-apple-ios");

This does get the correct architecture but it crashes when i run it on ios.

What am i missing to be able to generate proper arm64e static libraries?

@steven-johnson
Copy link
Contributor Author

@f0enix -- I have updated this PR to be up-to-date with master, but I'm not equipped to debug the arm64e issues, unfortunately (I don't have any iOS devices here, nor will I be able to get any within a week or so, nor do I have enough expertise in iOS development to have confidence in fixing this without a real device at hand). We'd love to get this fixed, though. If anyone out there would like to contribute to fixing this, we'd welcome the help. (Probably worth asking on the Gitter channel as well.)

@f0enix
Copy link

f0enix commented May 25, 2021

@steven-johnson Thanks you for updating the PR. i just tried the latest version and it has same result (the generated static library is arm64)

Note that i built Halide with apple llvm fork instead of the default llvm because i think they have all the handling of arm64e in there. Was that the right way of doing it or should i be using the default llvm?

Other things that i tried:
While trying to look into Halide code generators, i found the variable HL_LLVM_ARGS. So i tried using it to force override the target too:
export HL_LLVM_ARGS="--mtriple=arm64e-apple-ios"
But this failed when running the generator:

llc: Unknown command line argument '--mtriple=arm64e-apple-ios'.  Try: 'llc --help'
llc: Did you mean '--stats=arm64e-apple-ios'?

But when i check the available commands by going directly in the compiled llvm bin folder, i can see that --mtriple is valid but it seems that the llc halide is using is different and cannot find it. What could be the reason for it?

I would like to help but i am very new to Halide and my main area of work is ios app development. I do have a arm64e compatible device to do debugging but for the ARM code generation, i am not sure of how i can help.

@f0enix
Copy link

f0enix commented May 28, 2021

Update:
I have tried with both Apple fork and upstream llvm and both generate arm64 binaries.

i found this extract from Apple XNU pac doc:

Both xnu and first-party binaries are built with LLVM's -arch arm64e flag, which generates pointer-signing and authentication instructions to protect addresses stored in memory

The closing thing i have found to this is the llc -march parameter from llvm docs
So i tried this snippet before calling the generator export HL_LLVM_ARGS="-march=arm64e" but it fails too:

llc: Unknown command line argument '-march=arm64e'.  Try: 'llc --help'
llc: Did you mean '--misched=arm64e'?

The next thing i tried was to edit the LLVM_Runtime_Linker.cpp#L428 in Halide and forcing it to emit only arm64e:

triple = llvm::Triple("arm64e-apple-ios");

This does generate arm64e static libraries but if i try use them it crashes with pointer authentication failure when i run them on my iphone:

Exception Type:  EXC_BAD_ACCESS (SIGSEGV)
Exception Subtype: KERN_INVALID_ADDRESS at 0x20000001073dbd64 -> 0x00000001073dbd64 (possible pointer authentication failure)
VM Region Info: 0x1073dbd64 is in 0x106ddc000-0x10753c000;  bytes after start: 6290788  bytes before end: 1442459

For those who do not have an A12+ iPhone, they can test it on ARM macs.
You can use this command to run non system arm64e binaries on Apple Silicon Mac

sudo nvram boot-args=-arm64e_preview_abi

@steven-johnson
Copy link
Contributor Author

I brought this up to date after some recent discussion on #4490 but I have no idea if it's correct or not -- I have no iOS testing setup locally. Tagged @alexreinking somewhat arbitrarily because of some recent macOS/iOS work he's done.

@steven-johnson steven-johnson changed the title Add ARMv8.3a feature flag Add ARMv8.x feature flags Jun 27, 2024
@steven-johnson
Copy link
Contributor Author

Update for those coming in late: this PR is intended only to enable support for the ARM v8.x (v8.1a ... v8.9a) architectures. It does not add support for arm64e; that is being worked on separately (see #8330) and is orthogonal to this PR (although arm64e support does imply v8.3a or later).

I expect that this PR can land without much drama; the codegen changes it makes are limited to whatever LLVM does for us (there is no attempt at additional specialization yet).

}
if (target.has_feature(Target::ARMv89a)) {
attrs.emplace_back("+v8.9a");
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If target has been completed at this point, isn't this going to add attrs for everything requested and below? Is LLVM cool with that?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Uh, good question.

@@ -2482,7 +2541,7 @@ string CodeGen_ARM::mattrs() const {
}
} else {
// TODO: Should Halide's SVE flags be 64-bit only?
// TODO: Sound we ass "-neon" if NoNEON is set? Does this make any sense?
// TODO: Sound we add "-neon" if NoNEON is set? Does this make any sense?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While you're here, Sound -> Should

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"Sound we ass" FTW

src/Target.cpp Show resolved Hide resolved
@steven-johnson
Copy link
Contributor Author

PTAL

@steven-johnson
Copy link
Contributor Author

Gentle review ping

@steven-johnson steven-johnson merged commit bebb888 into main Jul 23, 2024
18 of 19 checks passed
@steven-johnson steven-johnson deleted the srj-armv83a branch July 23, 2024 17:37
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants