CoreCLR on ARMv7 #4345
Comments
Can't you just change the cast to a (va_list) instead of a (void*)? |
I don't believe that works, you mean as in the following?
Clang will give
and an online gcc appears to give
|
Wait a sec, I checked the test and it currently is:
Are you not up to date with master? |
@kangaroo Yup, he is not up to date. Here is his branch, which is four commits behind: https://github.com/benpye/coreclr/tree/linux-arm |
This test has never changed on master. This appears to be some local changes? |
It was so that it built without PAL, to ensure it wasn't a PAL-related issue. I believe PAL defines NULL as (void*)0 or something to that effect, so I was trying to rule out a difference between this "NULL" and PAL's NULL. Either way, the same compile error occurs. |
Further, this inconsistency with va_list appears to cause some of the other test failures too. I believe that at least the SetCurrentDirectoryA test failures are due to this.

If we assume that on x86 va_list is a pointer (as indicated by the ability to cast from a void*), then when we call NativeVsnprintf (src/pal/src/cruntime/printfcpp.cpp), which in turn calls vsnprintf, we are passing a pointer to the real object behind va_list. PAL currently assumes this behaviour: once the function has returned, the va_list is assumed to have had the relevant arguments taken off it, leaving the next argument as the next one to be handled. On x86_64 this works fine, due to the implementation. On ARM this is not the case. It seems reasonable to assume that on ARM va_list is backed by some type other than a pointer (as we cannot cast from a void*). This means that the call to NativeVsnprintf is passing a copy of the va_list, so when the call returns, the va_list we hold is still in the same state and we take the same argument again. This is shown in the test file_io/SetCurrentDirectoryA/test1, here there is a line

It seems that if PAL is to be ported to architectures other than x86_64, it may be necessary to provide our own vsnprintf and vfprintf instead of using the system-provided versions. Whilst perhaps undesirable, I don't see any other way to ensure that va_list behaves as expected. The current implementation actually violates the C language specification, section 7.16.3 states that
|
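For illustration only, here is a minimal sketch of how a va_list can be used for two formatting passes without relying on whether the callee advances the caller's list. It is not PAL code: `format_alloc_v` and `format_alloc` are hypothetical names, and the only library calls assumed are the standard `va_copy`, `vsnprintf`, and `malloc`. Each pass gets its own list, so the behaviour does not depend on va_list being a pointer, an array, or a struct.

```c
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical helper: formats into a freshly allocated buffer.
 * The caller owns the returned string and must free() it. */
static char *format_alloc_v(const char *fmt, va_list ap)
{
    va_list measure;
    int needed;
    char *buf;

    /* First pass works on an independent copy, so `ap` is untouched
     * regardless of how the platform ABI represents va_list. */
    va_copy(measure, ap);
    needed = vsnprintf(NULL, 0, fmt, measure);
    va_end(measure);
    if (needed < 0)
        return NULL;

    buf = malloc((size_t)needed + 1);
    if (buf == NULL)
        return NULL;

    /* Second pass consumes `ap` exactly once; it is not reused afterwards. */
    vsnprintf(buf, (size_t)needed + 1, fmt, ap);
    return buf;
}

/* Hypothetical variadic front end that owns the va_start/va_end pair. */
static char *format_alloc(const char *fmt, ...)
{
    va_list ap;
    char *result;

    va_start(ap, fmt);
    result = format_alloc_v(fmt, ap);
    va_end(ap);
    return result;
}
```

The design point is that no code path reads from a va_list after it has been handed to a `v*printf` function, which is the pattern the comment above identifies as failing on ARM.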
You should probably indicate what platform and environment you are using for development. The C compiler used on non-Windows platforms is LLVM. It is the same on x86, x64 and ARM if the version matches. |
There are many ARM dev boards out there that run the same version of Ubuntu that x64 linux does. A better approach to development would probably be to use a QEMU ARMv7 emulation on X86/64 so that those without ARM hardware can participate. |
I've tried to keep the environment as close as possible to the standard Linux build environment. I'm actually using a Raspberry Pi 2, with Ubuntu 14.04 and LLVM/Clang 3.5. I'm sure QEMU would be fine too, though currently I'm detecting only armv7l, which is probably a subset of what we can target. |
As I understand it, this is a Linux ARMv7 build. The C/C++ code should use the same compiler as on Linux x64, which is LLVM. If that environment does not have an issue with va_list, then the ARM version should not either, since it's the same LLVM compiler. This may indicate that the cmake configuration is not correct. |
The behaviour PAL expects of va_list is undefined in the spec, which even states (as I've quoted above) explicitly not to do what PAL is doing. https://github.com/dotnet/coreclr/blob/master/src/pal/src/cruntime/printfcpp.cpp#L1779 shows one case where this is being done. va_list is dependent on the ABI. As I was reading into this, some people indicated that this code would not work on x86 (as opposed to x86_64) either, although I haven't tried. The fact that it doesn't work on ARM is unsurprising given the considerable difference in the ABI. |
After further research, it seems that on ARM the ABI defines the va_list. However, not everyone (Apple) uses it. So it's probably worth investigating abstracting it in CoreCLR to meet the needs of various targets. |
That doesn't concern us if we stick to the C specification, we shouldn't have to care about the va_list implementation (I do not believe). It doesn't need abstracting I don't think. |
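As a hedged sketch of what "sticking to the C specification" could look like when a helper has to advance the caller's argument list: the C standard permits creating a pointer to a va_list and passing that pointer to another function, after which the original function may keep using the list. The names `take_int` and `sum_ints` below are hypothetical and exist only for this example; nothing here is taken from PAL.

```c
#include <stdarg.h>

/* Hypothetical helper: consumes exactly one int from the caller's list.
 * Because it receives a pointer, the caller sees the list advance. */
static int take_int(va_list *ap)
{
    return va_arg(*ap, int);
}

/* Sums `count` int arguments, alternating between the helper and the
 * caller to show that both operate on the same list state. */
static int sum_ints(int count, ...)
{
    va_list ap;
    int total = 0;
    int i;

    va_start(ap, count);
    for (i = 0; i < count; i++) {
        if (i % 2 == 0)
            total += take_int(&ap);    /* helper advances the caller's list */
        else
            total += va_arg(ap, int);  /* caller continues where the helper stopped */
    }
    va_end(ap);
    return total;
}
```

This only applies to helpers we control. System functions such as vsnprintf take a va_list by value, so this pattern cannot be used to observe how far they advanced.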
We have to be able to marshal it. So that implies something in the runtime has to have knowledge of how to do that for the target platform. http://bartdesmet.net/blogs/bart/archive/2006/09/28/4473.aspx |
I have implemented CONTEXT_CaptureContext in assembly for ARM, as on AMD64, and this appears to function correctly. Additionally, the unwinding by libunwind appears to work. However, I am having trouble tracking down an issue: when an exception is thrown by RaiseException, it is never caught. Instead the following is printed in the terminal
Any ideas? EDIT: After some further investigation this turns out to be a libunwind issue. Linking anything with libunwind on ARM seems to break C++ exceptions. This is actually mentioned in a bug report from 2009 on the RedHat tracker https://bugzilla.redhat.com/show_bug.cgi?id=480412 . It's interesting that AMD64 doesn't encounter it, perhaps libunwind is used in Clang there. It's solved by linking gcc_s before libunwind on ARM anyway, so the only PAL test failures now seem to be the result of the faulty vsnprintf implementation and RtlRestoreContext should still be implemented in assembly for the same reason as on AMD64. |
I'm going to guess that the unwinding issue is because libunwind exports its own _Unwind_RaiseException, and various other Unwind* functions. They're probably not binary compatible with gcc, which is causing landing pads to be skipped. It also makes sense that linking gcc_s first 'fixes' it. If you'd like to confirm, try building libunwind without UnwindLevel1*.c, and link it first. I bet that works too. EDIT: Sorry, it's Unwind-EHABI.cpp. Also, the above refers to llvm libunwind; if you're using non gnu, adjust accordingly. |
I do agree, and gcc_s provides the default unwinding functions. It's apparently because libunwind lacks a function that GCC (and presumably Clang) rely on for exception support; at least, that's what the old bug report seemed to indicate. I can try building my own libunwind, but I would at least think linking gcc_s first so that its exports are used would be better, since then we are using the system libunwind. I'll gladly be proven wrong, however. Again, I don't see why this issue doesn't show up on the FreeBSD and Linux builds for AMD64; they are all using Clang and libunwind etc... |
Try libunwind from llvm, it seems to have EHABI support:
EDIT: I'm not sure if its binary compatible with whatever gcc_s is doing. |
It appears that if I build libunwind from llvm, it also works. This does require changing the libunwind build script however, as it cannot find the |
I was more curious whether LLVM libunwind was binary compatible. We'd like to move to LLVM libunwind in the long term, but have held off due to the lack of context pointer support. That said, if non gnu is incompatible with ARM EHABI, we may want to reconsider the strategy there. I'll try to carve out some time to take a look at exactly what's going on here. Lack of context pointers isn't a huge concern. We'll take a minor perf hit due to increased stack pressure, but in the grand scheme of things it's not something to worry about. |
Added RtlContextRestore in assembly, and added floating point register save/restore to RtlContextCapture/RtlContextRestore.

Hit a slight wall though. Whilst our context object has the FP registers, I cannot find where they are defined in ucontext_t. The normal registers are in mcontext_t, but the FP registers appear to be missing. They are needed for the equivalent FPREG_ macros used to convert between our context and the libunwind/linux context.

I have assumed VFPv3 here and will be changing the build script to select ARMv7, VFPv3, and importantly no NEON. This appears to be what is required by the jit, looking at the register list at least, and is also the baseline for Ubuntu armhf, which, if keeping with the AMD64 version, should be our supported distro, so it seems a sensible feature set to target. This will also allow scaleway.io to be used by those who do not have local ARM boards, though I've yet to set up anything there.

EDIT: Bionic has an interesting file, https://android.googlesource.com/platform/bionic/+/c124baa/libc/arch-arm/include/machine/ucontext.h . It indicates only d8-d15 are saved and that there is "no reliable way to extract the FP state from a context_t on ARM" as the registers are not exposed in any clear way. |
On linux kernels, co-processor registers are stored in uc_regspace.
|
Hm okay, as the Bionic file stated. It still seems glibc is only saving d8-d15 and fpscr. Isn't that a problem? VM contains further "va_list ABI abuse". https://github.com/dotnet/coreclr/blob/master/src/vm/clrvarargs.cpp @OtherCrashOverride was ahead of me here, but we need to find out how to handle this on ARM. I guess Windows uses the same va_list ABI on ARM as on other architectures, whereas on Linux it changes. |
Posting this link here for future reference Of note is: "For non-variadic functions, the Windows on ARM ABI follows the ARM rules for parameter passing—this includes the VFP and Advanced SIMD extensions." It does not mention how variadic functions are called, though. |
Yeah the code indicates Windows has the same pointer to an object for the va_list. On Linux we are going to have to handle the ARM EABI. |
The referenced ARM article: has this to say: page 19 - "5.5 Parameter Passing" page 27 - "7.1.4 Additional Types"
|
http://llvm.org/viewvc/llvm-project?view=revision&revision=212004
This could be good news. If I interpret this correctly (and I very well may be wrong)
for ARM EABI:
So for a given use of va_list on Windows it can be converted to ARM as:
and converted back as:
|
Just so that the current status of this code is known: the state in upstream is non-functional. PR dotnet/coreclr#1285 is tracking the unwinding support, though it is currently blocked by https://llvm.org/bugs/show_bug.cgi?id=24146 as far as I can tell. If you want to work on the JIT, the code is in a state where that is possible. Currently the RyuJIT ARM backend is very incomplete, so that's going to be my main focus in the near future. Outside of the JIT there are likely other issues; however, without unwinding it's going to be difficult to find them, as GC will be non-functional. The legacy JIT backend is functional, but that code path will likely be removed in the near future. It can be used by modifying
This will make the build use the legacy JIT backend as opposed to the RyuJIT backend. Another workaround I've used to try and test the unwinding is to change |
If anyone on the JIT team could give some indication how I would correctly implement calls for locations outside the 24 bit immediate range it would be of great help. In the legacy code cc: @BruceForstall |
I haven't looked at this in quite a while. Presumably for JIT the VM should create jump islands for >24 bit branches (we don't support individual functions greater than that size). However, it doesn't actually look like it does that. Check out IMAGE_REL_BASED_THUMB_BRANCH24. For NGEN (crossgen for .NET Core?) it tries compiling everything without requiring that, then if it needs it, it recompiles (see ZapInfo::getRelocTypeHint). Also see arm_Valid_Imm_For_BL() usage for doing "large" calls. |
Yeah for non NGEN it seems that arm_Valid_Imm_For_BL() will always return false. The existing code for ARM gets a register, loads the address into the register, and does |
So you are trying to implement the ARM32 back-end in the RyuJIT (non-legacy) path? In that case, you need to "reserve" a register in Lowering::TreeNodeInfoInit(), and then "grab" it using something like genRegNumFromMask(treeNode->gtRsvdRegs). If necessary, maybe you could define REG_DEFAULT_HELPER_CALL_TARGET to some non-argument callee trash register (R12?) |
@BruceForstall If doing it that way feels like reimplementing the legacy ARM JIT on top of RyuJIT, what would be "The Right Thing To Do" in order to resolve this problem? |
Sorry, I didn't mean to imply that. I was a little confused from the comments about whether @benpye was trying to get the ARM legacy back end to work better, or trying to start getting the new RyuJIT ARM back-end work started. What I suggest in my last comment is the correct way to go for RyuJIT back-end. |
Ah, all right. I guess I misread that. |
Going through to check build status once again. The new dep on liblttng brings in a library liburcu which is broken on ARM in Trusty with clang. We require version 0.7.15 or later with the commit urcu/userspace-rcu@7a3e2ed for it to build else Clang will error. |
So Ubuntu is shipping an outdated library that won't build? That's a bit odd... |
The ubuntu urcu header won't build with newer clang. It builds just fine with gcc and older clangs. |
Clang 3.5 is too new however? I suppose Ubuntu 16.04 will be okay by the time that releases but that is some time in the future. |
…7 yet. The issue about that https://github.com/dotnet/coreclr/issues/1192
Hi, executing corerun with the helloworld program segfaulted. I'd like to know what can be done (or what we need to wait for) in order to make coreclr run on ARMv7. We are migrating a large-scale project to AspNetCore, and we need it not just on 64-bit Linux but also on 32-bit systems, mainly ARM (and the Pi is one of them). |
@aviviadi Official support for x86 and arm32 will come after the version 1 release. The actual work required for arm32 and x86 is in ryujit. The current work on linux arm32 and aarch64 is mostly from the community (@benpye and @kangaroo), but ryujit for aarch64 is now mostly functional, as far as my limited understanding goes. |
@shahid-pk I will have to wait then. Thank you for your answer. |
@aviviadi I just tried a hello world on a Jetson TK1 board with the latest CoreCLR and it works albeit segfaulting on exit. Make sure you only have mscorlib.dll (no mscorlib.ni.dll generated on the host where you build mscorlib), I had the issue and this caused an immediate segfault. |
@manu-silicon Well.. that's interesting. As I understand from @shahid-pk and by reading elsewhere - x86/arm32 support is not planned for the near future. I will have time later to try again (I don't remember having mscorlib.ni.dll but I will give it another go). |
The ARM progress has been more recently tracked in https://github.com/dotnet/coreclr/issues/3977 |
This is ultimately probably part of #4296 . Currently I am working on bringing up PAL on ARMv7. This is currently going reasonably well, however I feel it would be wise to open an issue regardless. In addition, Clang and GCC exhibit some different behavior regarding va_lists when on ARM instead of x86/AMD64.
The following shows the issue. It's basically the PAL test c_runtime/vprintf/test1, which is where I noticed the problem. Under x86/AMD64, this code compiles fine. When targeting ARM, however, it fails to compile, complaining that a void * is incompatible with va_list. What is the correct action to take here? Currently in my branch I have disabled the test, but this is probably not ideal.
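The original snippet is not reproduced here. As a rough sketch of one portable alternative, assuming the goal is simply to call vprintf from a test without casting a pointer to va_list, a small variadic wrapper can obtain a genuine va_list through va_start. The name `call_vprintf` is hypothetical and is not claimed to match the PAL test.

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper: builds a real va_list with va_start rather than
 * casting NULL or a void* to va_list, which does not compile on targets
 * where va_list is not a pointer type. */
static int call_vprintf(const char *fmt, ...)
{
    va_list ap;
    int written;

    va_start(ap, fmt);
    written = vprintf(fmt, ap);  /* returns the number of characters printed */
    va_end(ap);
    return written;
}
```

A format string with no conversions can be passed with no extra arguments, for example `call_vprintf("no args\n")`, which avoids having to construct an "empty" va_list by hand.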