# Compiling with `CeedScalar` defined as `float` #784
Thanks for taking the time to test and write this up.

### Issue 1: function pointers

We don't ever call through the pointer and thus should use the cast, either as you mention or using

```c
int (**fpointer)(void) = (int (**)(void))((char *)object + offset); // *NOPAD*
*fpointer = f;
```

### Issue 2: type-aware dispatch

We should probably make a scalar types enum. The number of bytes ...

### Issue 3: tolerances

We should define a type-specific machine epsilon and make tolerances use that. We may want an inline toleranced comparison utility function. And it may be that some tests just aren't appropriate for single precision, in which case we need a lightweight way to skip them.
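For concreteness, a scalar-type enum along these lines might look like the sketch below; the names `CeedScalarType`, `CEED_SCALAR_FP32`, and `CEED_SCALAR_FP64` are illustrative, not an existing libCEED interface.

```c
/* Sketch of a possible scalar-type enum (illustrative names only), so that
   backends and JIT code paths can dispatch on the precision in use. */
typedef enum {
  CEED_SCALAR_FP32 = 0,  /* CeedScalar is float  (4 bytes) */
  CEED_SCALAR_FP64 = 1,  /* CeedScalar is double (8 bytes) */
} CeedScalarType;
```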
### Issue 1

I originally chose ...

### Issue 2

I tried adding an ... Then in the code we can use, e.g., ...

### Issue 3

I see there is a ...
Then, in ... And of course, ...

This works nicely overall, with a few remaining issues. I disabled the AVX tensor contractions and MAGMA non-tensor basis creation if ... Some more specifics on remaining issues with the tests:

### Question: what to do with the Fortran tests?

I found in ... Alternatively: we could just not run the Fortran tests for float?

### Back to test tolerances: more details

Starting from a list of tests that failed in float, I went through and replaced the tolerance with something based on the double precision ... A separate remaining issue pops up with some of the tests that compare to things in ...

### A caveat

I have not yet looked in the Python, Julia, or Rust bindings to see if there are any problems (I suspect there could be -- I did notice a separate ...).
I wonder if it is worth reworking our Fortran interface while we are in there to support more modern Fortran, since Nek5000 is switching to NekRS, which uses C++ with OCCA instead of incorporating libCEED.
I'm curious what is known about libCEED's current or likely Fortran users -- do they need Fortran 77, or is 90/95 okay (or newer)? We could potentially modernize our tests without changing the interface, when it comes to the scalar types (I think).

I've looked at updating the Python bindings/tests -- it was more involved than I realized before I started, but I think I got it all working (except for the few tests which compare to output rather than test with a tolerance). Though I'm sure there are other options than what I did to get it working, so we could discuss/change it.

I've never really worked with Rust or Julia before, so I may need some help there. For example, how do I best propagate the information in the "extra" included header file (for f32 or f64) to the Julia bindings? (ping: @pazner)

I'm working on a draft PR for my branch right now, assuming we want to go ahead and pursue this all the way to solving the various outstanding issues and merging it, so maybe we can continue the discussion over there once it's opened.

Edit: Also, I apologize for spamming everyone with build failure emails when I push to this branch. Making progress though -- at least Python doesn't fail anymore :)
The only reason we added Fortran 77 support was because of Nek5000, but with NekRS a) being in C++ and b) not being interested in libCEED, this is not really relevant. I haven't done a large survey of Fortran projects or anything, but I would assume that supporting Fortran 90 would be sufficient for any hypothetical Fortran users we may have in the future.

I already moved our Fortran tests away from fixed form, but I would support further updates to those tests. We should be able to update those tests without fully updating to Fortran 90 if we want to retain our Nek5000 example (which currently isn't running in CI for unknown reasons).
As a first step toward #778, I tried building and testing libCEED with `CeedScalar` set to `float` instead of `double`. I ran into some minor issues. I was thinking we may want to address some or all of them in a PR prior to adding any mixed-precision functionality. (Of course, we would probably want to keep any potential future changes for mixed precision in mind when deciding what to do about these issues, or whether we want to address them at this time at all.)

### Issue 1: Compiler warnings for three `CeedVector` functions

The affected functions are `CeedVectorSetValue`, `CeedVectorScale`, and `CeedVectorAXPY`, which are currently only implemented in the cuda/hip-ref backends. The warnings come from `CeedSetBackendFunction`.

The reason only these three functions are affected is that they are the only ones that take a parameter of type `CeedScalar` (rather than `CeedScalar *`). From the C standard, section 6.5.2.2, a function declaration with unspecified parameters will result in "default argument promotion," which includes converting `float` to `double`. And from section 6.7.5.3, that means the function declaration of `int (*f)()` is incompatible with a prototype containing a `CeedScalar` parameter when `CeedScalar` is `float`, but it is compatible when `CeedScalar = double`.

Now, from what I can tell, this incompatibility doesn't actually cause incorrect execution of the three functions, since the functions are still called through function pointers with the correct parameter types for the backend implementation, from the function pointers in the `CeedVector` object. Perhaps this is obvious to some libCEED developers/expert-level C coders, but I struggled a bit to make sure I understood the distinction between what libCEED does and things that would produce errors in this case. To convince myself, I confirmed this behavior from a minimal test code. Consider three functions with prototypes `int func_dbl(double x)`, `int func_flt(float x)`, and `int func_pflt(float *x)` and the following `main()`:
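A minimal sketch of such a test (only `fbar_u` and the three function prototypes come from the description above; the other pointer names and the typed call-backs are illustrative assumptions):

```c
#include <stdio.h>

int func_dbl(double x)  { printf("func_dbl:  %f\n", x);  return 0; }
int func_flt(float x)   { printf("func_flt:  %f\n", x);  return 1; }
int func_pflt(float *x) { printf("func_pflt: %f\n", *x); return 2; }

int main(void) {
  float f = 1.0f;

  /* Store the functions through "generic" pointers with an unspecified
     parameter list, as CeedSetBackendFunction does. Only the float-by-value
     case is incompatible, because default argument promotion would convert a
     float argument to double. */
  int (*fbar_d)() = func_dbl;   /* compatible: double survives promotion */
  int (*fbar_u)() = func_flt;   /* warning: incompatible pointer type    */
  int (*fbar_p)() = func_pflt;  /* compatible: pointers are not promoted */

  /* Call back through pointers of the correct prototype, mirroring how the
     backend implementations are ultimately invoked. */
  ((int (*)(double))fbar_d)(2.0);
  ((int (*)(float))fbar_u)(f);
  ((int (*)(float *))fbar_p)(&f);
  return 0;
}
```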
I tried this with GCC 8.3.0 and clang 11.0.1. Both gave the "incompatible pointer type" warning seen in compiling the cuda-ref backend for the line creating `fbar_u`, and both produced the same runtime behavior. So there's no problem with the functions' behavior in single precision, despite the warnings.

Still, we most likely wouldn't want these warnings to be produced for every backend that implements these three functions (though it's currently only cuda-ref and hip-ref).

A simple solution, though maybe not "nice" since it would ruin the symmetry of setting up all the backend functions of a `CeedVector` object, would be to perform a cast on these three functions during `CeedSetBackendFunction`.
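For instance, something along these lines (a compile-only sketch with stand-in declarations, not the actual libCEED API; the real change would apply the cast where the cuda/hip-ref backends register these functions):

```c
/* Sketch of the proposed fix: an explicit cast to the generic function-pointer
   type at registration time removes the incompatible-pointer-type warning when
   CeedScalar is float. All names below are stand-ins for illustration. */
typedef float CeedScalar;                     /* stand-in for the real typedef       */
int CeedVectorSetValue_Ref(CeedScalar value); /* stand-in backend implementation     */
void RegisterBackendFunction(int (*f)());     /* stand-in for CeedSetBackendFunction */

void RegisterSetValue(void) {
  /* Without the cast, this call warns when CeedScalar is float. */
  RegisterBackendFunction((int (*)())CeedVectorSetValue_Ref);
}
```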
This removes the compiler warning and is valid per C function pointer conversion rules. I think this is preferable to having these functions take `CeedScalar *` instead, given how the variables are used in the functions. However, I would like to get input on this from other libCEED developers.
### Issue 2: Places where we need to know the type of `CeedScalar`

There are several places we would need to know exactly what type `CeedScalar` is, e.g.:

- In the runtime compilation for CUDA and HIP backends, namely the `Ceed[Cuda/Hip]Compile` function, where the definition of `CeedScalar` is added directly to the code string to be compiled, to avoid having to include any headers
- In functions that call library routines/instructions for a specific precision: MAGMA non-tensor routines (GEMMs), AVX basis routines, the vector norm functions for CUDA/HIP backends...
- Perhaps in the tests (see issue 3).

The question here is whether we want the user to have to define anything else while compiling, or only change `CeedScalar`. One option I tried to get the CUDA/HIP RTC working was to create variables of type `CeedScalar`, `float`, and `double`, then compare their sizes to determine which definition to add to the code string (a rough sketch of this idea follows at the end of this section). Perhaps something like that could be done in a general helper function somewhere, so that any routine can check? Or is there a more elegant way to do this in C?

Of course, moving toward mixed-precision support in the future, we may just want to make it so that the compilation functions take a parameter (or parameters) indicating the main/computational precision of the code to be compiled. Since the same compilation functions are used for all kernels, even if not all kernels will have the option for differing computation and argument precisions, the compilation functions must be able to support it.
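A standalone sketch of that size-comparison idea (the variable names and the exact string spliced into the JIT source are illustrative; libCEED's compile functions may do this differently):

```c
/* Sketch: decide which scalar definition to prepend to a JIT code string by
   comparing sizeof(CeedScalar) against float/double, so the user only has to
   change the CeedScalar typedef itself. The local typedef below is a stand-in
   so this example compiles on its own. */
#include <stdio.h>
#include <string.h>

typedef float CeedScalar; /* stand-in; libCEED defines this in its headers */

int main(void) {
  char source[256] = "";

  if (sizeof(CeedScalar) == sizeof(float)) strcat(source, "typedef float CeedScalar;\n");
  else                                     strcat(source, "typedef double CeedScalar;\n");
  strcat(source, "/* ... kernel source appended here ... */\n");

  printf("%s", source);
  return 0;
}
```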
### Issue 3: Required changes to libCEED tests

Many libCEED tests fail when `CeedScalar` is `float`. In the case of t104, it just needs a cast to `CeedScalar` in the comparison, e.g. `if (a[3] != (CeedScalar)(-3.14))`; otherwise the comparison is done in double and fails. Many other tests seem to fail due to the tolerances being set too low for what we are guaranteed in single precision, though of course not all of them do.

However, these tolerances are not all the same and seem tailored to the individual tests, somewhat complicating the situation if trying to set different tolerances based on `CeedScalar` (or future mixed-precision routines). This might also require a way to check what `CeedScalar` is, or another definition from the header (mentioned in issue 2).
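One possible shape for a type-aware tolerance in the tests, as a sketch (the `CEED_SCALAR_IS_FP32` guard, the `CEED_TEST_EPSILON` macro, and the helper name are illustrative, not existing libCEED definitions):

```c
/* Sketch: a type-specific machine epsilon plus a small toleranced-comparison
   helper for the tests. The guard macro and names here are illustrative. */
#include <float.h>
#include <math.h>

#ifdef CEED_SCALAR_IS_FP32               /* hypothetical: set when CeedScalar is float */
#  define CEED_TEST_EPSILON FLT_EPSILON
#else
#  define CEED_TEST_EPSILON DBL_EPSILON
#endif

/* True if a and b agree to within 'factor' units of the type-specific
   epsilon, relative to the larger magnitude (or 1 for values near zero). */
static inline int AlmostEqual(double a, double b, double factor) {
  double scale = fmax(fabs(a), fabs(b));
  return fabs(a - b) <= factor * CEED_TEST_EPSILON * fmax(scale, 1.0);
}
```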