Add the ability to parse/format a float/double from/to a hexadecimal literal #1630

dovisutu · 2020-01-11T13:07:21Z

Part of #27204

In IEEE 754:2008 part 5.12.3, transfering a float/double from/to an external hexadecimal-significand character sequence representing finite number is requested while we don't have it yet. The pattern is like this: (regex)

[+-]?0[xX](?:[\da-fA-F]*\.[\da-fA-F]+|[\da-fA-F]\.?)(?:[pP][+-]?\d+)?

notice that this is slightly different to the standard based on the discussion below, which talked about the exponent part

which means:

valid	invalid
+0x7ff.3edp+1	+7ff.3edp-1
0x7ff.3edp+1	0x7ff.3ede+1
0x7ff.3edp1	0x7ff.3uup1
+0x7ff.3edp1	0x7ff.3ed
+0X7FF.3EDP1	0X7FF.3ED
-0x7ff.3edp1	0x7ff_3edp-1
0x7ff.	+-0x7ff.3edp-1
0x7ff	0x7fu.3edp-1
0x.3edp-1	0x.p-1

benefits

Easier parsing for native hexadecimal float/double. for example, 0x0.ffp0 is the equivalent of 0.99609375 while using less chars. Same for formatting as it reduces the size of string to transfer.
Parsed number is percise in the limit of float/double's limitations. Since both float and double are based on raw bit, transistion from hexadecimal literal to them would be easier and without rounding if they are in the limitation. parsing decimals, on the other hand, would often has to round.
It's in IEEE 754:2008/2019 standard, so it's necessary to add it.

proposal APIs

namespace System.Globalization {
    [Flags]
    enum NumberStyles {
        // ...,
        HexFloat = AllowLeadingWhite | AllowTrailingWhite | AllowLeadingSign | AllowHexSpecifier | AllowDecimalPoint
        // ,...
    }
}

And let Numbers.ParseDouble/Single(string s, NumberStyles style[, NumberFormatInfo info]) accept Numberstyles.HexFloat (or it without NumberStyles.AllowDecimalPoint) and correctly parse string.

Edit Numbers.FormatDouble/Single(ref ValueStringBuilder sb, double/float value, ReadOnlySpan<char> format, NumberFormatInfo info) so that they can correctly identify X specifier (which is also used for outputting integers in hex) which may have a trailing precision specifier, and correctly format it.

The text was updated successfully, but these errors were encountered:

stephentoub · 2020-01-11T13:16:56Z

cc: @tannergooding

Gnbrkm41 · 2020-01-11T14:28:49Z

Tracked by #1387?

tannergooding · 2020-01-11T18:07:45Z

Thanks for opening this, its good to see additional customer wants for these areas 😄

It is partially tracked by https://github.com/dotnet/corefx/issues/31901 (which is IEEE 754:2008 compliance) and by #1387 (which is IEEE 754:2019 compliance), but those are largely meta issues and the individual proposals will be easier to take through API review.

The format specifier can be broken down into:

sign:           [+-]?
hexIndicator:   0[xX]
hexSignificand: (?:[\da-fA-F]*\.[\da-fA-F]+|[\da-fA-F]+\.?)
decExponent:    [pP][+-]?\d+

The terminal [fFdD]? listed is not actually part of the IEEE specification and should be excluded.
I also updated hexSignificand to clarify that just . is not valid.

The computed value is hexSignificand * 2^decExponent.

So, for example if you have 0x1.234p0 this is:

0x1 == 1
0x234 * 16^-3 == 564 * 16^-3 == 0.1376953125
1.1376953125 * 2^0 == 1.1376953125

tannergooding · 2020-01-11T18:08:54Z

Even though this isn't a new API, I believe we still want to take it through API review since it is modifying an existing API.

We would want to check it against the compat bar and make the necessary decisions around what flags would be used to support this functionality and ensure that we wouldn't accidentally introduce any breaking changes, etc.

dovisutu · 2020-03-12T12:47:32Z

I've updated the issue description to show what changes should be made. Free to discuss.

dovisutu · 2020-03-13T04:58:18Z

(Sorry for the mess when I edit the title)

reflectronic · 2020-03-26T14:06:39Z

Would this change also allow more combinations with AllowHexSpecifier? e.g. combining it with only AllowLeadingSign (without AllowDecimalPoint).

tannergooding · 2020-03-26T14:46:37Z

I don't see why it wouldn't, provided it was still valid according to the IEEE requirements.

dovisutu · 2020-03-27T11:47:09Z

I don't see why it wouldn't, provided it was still valid according to the IEEE requirements.

Yeah, it makes sense, but that is out of scope - which should be parsed into signed integer, not floating point nunber - so I think it should be proposed in a separate issue, though.

tannergooding · 2020-03-27T16:48:10Z

I don't believe its out of scope, as I said as long as its still valid according to the IEEE requirements, it should be fine.

There are "paths" where the decimal point isn't required and so the user should fully be able to specify that hex is allowed but decimal-point is not. It's an advanced scenario, but the number parser already supports it and will continue doing so even if we add support for hex, so there is no reason to block it. The interesting scenario would be whether or not to allow the exponent to be optional and to default to 0 (which would still be valid based on the normal conversion rules, etc).

sign:           [+-]?
hexIndicator:   0[xX]
hexSignificand: (?:[\da-fA-F]*\.[\da-fA-F]+|[\da-fA-F]+\.?)
decExponent:    [pP][+-]?\d+

The sign is optional but must be + or -
The hex indicator is required and must be 0x or 0X
The hex significand is required and must be one of:
- one or more hex digits
- one or more hex digits followed by .
- one or more hex characters followed by . followed by one or more hex digits
- . followed by one or more hex digits
The decimal exponent is required and must be p or P followed by an optional sign followed by one or more decimal digits

dovisutu · 2020-03-28T03:42:05Z

Oh, right. numbers like +0x23p1 can also be parsed as float... So this is OK to be valid (if users doesn't want to accept input that has .). You are right. ~~Let Int.Parse() accept it, however, is out of scope.~~

But about exponent part... I saw an explanation of this being mandatory in the standard because it may create confusion like:

0x1.23e+1

While it clearly only parses into 1 state:(0x1.23e) + 1, but users may see it as a valid number.

But in c(++)'s implementation this is optional:
(strtod)

A 0x or 0X prefix, then a sequence of hexadecimal digits (as in isxdigit) optionally containing a period which separates the whole and fractional number parts. Optionally followed by a power of 2 exponent (a p or P character followed by an optional sign and a sequence of hexadecimal digits).

I think when parsing, let the exponent optional is OK, because you can't parse an expression into a floating-point number.

But it should be mandatory when we (possibly?) move this to a literal value in c#/vb/f# to avoid confusion. However, lefting it optional may still make sense because when users are aware of using this, they should already know that the exponent of hexadecimal float is p, not e.
This part I deleted is out of scope and should be considered by the language design group.

dovisutu · 2020-04-05T05:23:05Z

Updated the description based on the discussion.

dovisutu · 2020-04-24T01:21:53Z

Sorry, but anything going on?

tannergooding · 2020-04-24T01:23:19Z

This hasn't made it to API review yet. Issues that are critical to .NET 5 are being reviewed first and the general backlog is then reviewed from oldest to newest.

dovisutu · 2020-04-24T01:28:12Z

OK, thanks. :D

terrajobst · 2020-07-24T18:20:12Z

Video

API wise, the only change is the NumberStyles addition of HexFloat
Behavior wise, this changes parsing of float and double to support the x format specifier. This isn't a breaking change because float and double throw when x is passed.
We considered adding AllowHexPrefix because integers don't allow 0x as the prefix, while float and double would. However, for floats we want the prefix to be required, so AllowHexPrefix would necessary be a separate bit. Combining it with HexFloat would make it optional. Thus, we can do it later.

namespace System.Globalization
{
    public partial enum NumberStyles
    {
        HexFloat = AllowLeadingWhite | AllowTrailingWhite | AllowLeadingSign | AllowHexSpecifier | AllowDecimalPoint
    }
}

pgovind · 2020-11-20T23:54:58Z

Assigning myself to this.

danmoseley · 2022-03-08T19:57:23Z

@dovisutu are you interested in working on this? If not I will label up for grabs.

dovisutu · 2022-03-12T12:24:37Z

@dovisutu are you interested in working on this? If not I will label up for grabs.

Well, nope, I don't think I have spare time (or adequate abilities?) to work on this. Fine to label as up-for-grabs.

Korporal · 2022-12-03T21:58:51Z

It seems that that regex might be wrong, I found this one that seems to behave correctly:

0[Xx]([0-9a-fA-F]*\.[0-9a-fA-F]+|[0-9a-fA-F]+\.?)[Pp][-+]?[0-9]+[flFL]?

From here: https://efxa.org/2014/05/10/regular-expressions-for-matching-data-values-in-compiler-lexers/

tannergooding · 2022-12-06T17:22:47Z

It's possible its incorrect, but the ultimate logic won't be using a RegEx anyways

Korporal · 2022-12-06T18:47:25Z

It's possible its incorrect, but the ultimate logic won't be using a RegEx anyways

OK, is that because of performance considerations?

stephentoub · 2022-12-06T19:00:36Z

Performance is certainly a factor, but the biggest reason is float/double live in System.Private.CoreLib, which can't depend on System.Text.RegularExpressions.dll

Korporal · 2022-12-06T22:56:06Z

Performance is certainly a factor, but the biggest reason is float/double live in System.Private.CoreLib, which can't depend on System.Text.RegularExpressions.dll

That's quite interesting, thanks for explaining!

Dotnet-GitSync-Bot added area-System.Runtime untriaged New issue has not been triaged by the area owner labels Jan 11, 2020

tannergooding added api-suggestion Early API idea and discussion, it is NOT ready for implementation area-System.Numerics and removed area-System.Runtime untriaged New issue has not been triaged by the area owner labels Jan 11, 2020

tannergooding added api-ready-for-review and removed api-suggestion Early API idea and discussion, it is NOT ready for implementation labels Mar 12, 2020

dovisutu changed the title ~~Add the ability to parse a hexadecimal number into a float/double~~ Add the ability to parse/format a hexadecimal number into a float/double Mar 13, 2020

dovisutu changed the title ~~Add the ability to parse/format a hexadecimal number into a float/double~~ Add the ability to parse/format a float/double from/to a hexadecimal literal Mar 13, 2020

tannergooding added this to the Future milestone Jun 23, 2020

carlossanlop added api-ready-for-review API is ready for review, it is NOT ready for implementation and removed api-ready-for-review labels Jul 6, 2020

terrajobst added api-approved API was approved in API review, it can be implemented and removed api-ready-for-review API is ready for review, it is NOT ready for implementation labels Jul 24, 2020

pgovind self-assigned this Nov 20, 2020

jeffhandley unassigned pgovind Jan 9, 2022

tannergooding added the help wanted [up-for-grabs] Good issue for external contributors label Mar 14, 2022

stephentoub mentioned this issue Apr 5, 2022

Making numerical formatting specifiers more consistent #67623

Closed

tannergooding mentioned this issue Jun 1, 2022

Support more primitive type converters #70057

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add the ability to parse/format a float/double from/to a hexadecimal literal #1630

Add the ability to parse/format a float/double from/to a hexadecimal literal #1630

dovisutu commented Jan 11, 2020 •

edited

Loading

stephentoub commented Jan 11, 2020

Gnbrkm41 commented Jan 11, 2020

tannergooding commented Jan 11, 2020

tannergooding commented Jan 11, 2020

dovisutu commented Mar 12, 2020

dovisutu commented Mar 13, 2020

reflectronic commented Mar 26, 2020

tannergooding commented Mar 26, 2020

dovisutu commented Mar 27, 2020

tannergooding commented Mar 27, 2020

dovisutu commented Mar 28, 2020

dovisutu commented Apr 5, 2020

dovisutu commented Apr 24, 2020

tannergooding commented Apr 24, 2020

dovisutu commented Apr 24, 2020

terrajobst commented Jul 24, 2020 •

edited by dotnet-api-review bot

Loading

pgovind commented Nov 20, 2020

danmoseley commented Mar 8, 2022

dovisutu commented Mar 12, 2022

Korporal commented Dec 3, 2022

tannergooding commented Dec 6, 2022

Korporal commented Dec 6, 2022

stephentoub commented Dec 6, 2022

Korporal commented Dec 6, 2022

Add the ability to parse/format a float/double from/to a hexadecimal literal #1630

Add the ability to parse/format a float/double from/to a hexadecimal literal #1630

Comments

dovisutu commented Jan 11, 2020 • edited Loading

Part of #27204

benefits

proposal APIs

stephentoub commented Jan 11, 2020

Gnbrkm41 commented Jan 11, 2020

tannergooding commented Jan 11, 2020

tannergooding commented Jan 11, 2020

dovisutu commented Mar 12, 2020

dovisutu commented Mar 13, 2020

reflectronic commented Mar 26, 2020

tannergooding commented Mar 26, 2020

dovisutu commented Mar 27, 2020

tannergooding commented Mar 27, 2020

dovisutu commented Mar 28, 2020

dovisutu commented Apr 5, 2020

dovisutu commented Apr 24, 2020

tannergooding commented Apr 24, 2020

dovisutu commented Apr 24, 2020

terrajobst commented Jul 24, 2020 • edited by dotnet-api-review bot Loading

pgovind commented Nov 20, 2020

danmoseley commented Mar 8, 2022

dovisutu commented Mar 12, 2022

Korporal commented Dec 3, 2022

tannergooding commented Dec 6, 2022

Korporal commented Dec 6, 2022

stephentoub commented Dec 6, 2022

Korporal commented Dec 6, 2022

dovisutu commented Jan 11, 2020 •

edited

Loading

terrajobst commented Jul 24, 2020 •

edited by dotnet-api-review bot

Loading