Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Thoughts on Exactness #181

Open
sffc opened this issue Oct 10, 2024 · 9 comments
Open

Thoughts on Exactness #181

sffc opened this issue Oct 10, 2024 · 9 comments

Comments

@sffc
Copy link

sffc commented Oct 10, 2024

The README throws around the word "exact". It discusses how 0.1 as a binary float is not actually 0.1 in memory. It states, "Statistically, most human-authored decimal numbers cannot be exactly represented as a binary floating-point number (AKA float)."

These are true statements.

However, floating point numbers do have a canonical decimal equivalent. This is easy to see, because all decimal numbers with less than approximately 15 significant digits have a unique, one-to-one mapping with a binary float, an equivalence relation. (For this post, I'll assume that f64 has enough significant digits to be sufficient for the use cases Decimal seeks to cover. This constraint can be relaxed, but it is not relevant to the main point of this thread.) This behavior is exposed already in JavaScript today via string conversion.

String(Number("0.12345"))

The algorithms for binary-to-decimal and decimal-to-binary are well researched, efficient, continue to evolve, and are already implemented in major browsers.

What does this mean?

The most common argument I see for the footgun of float arithmetic is this:

0.1 + 0.2
// ==> 0.30000000000000004

I understand that this is surprising behavior. However, there are ways to solve this problem without using decimals as the numeric type.

Rounding More Often

For most operations, it's totally fine and expected to round more often, which will essentially take the binary float and make it the expected decimal equivalent.

This is currently accessible in the language via Number.prototype.toPrecision, which returns the string form of the equivalence relation.

Number((0.1 + 0.2).toPrecision(2)) === 0.3  // true
// 0.3

More developers should do this, and we should improve documentation and introductory computer science classes.

We could even add a helper function that behaves like toPrecision but returns a Number.

Number.prototype.decimalAdd

Alternatively, it is also possible to write a function that works like this:

(0.1).decimalAdd(0.2) === 0.3  // true

or even invent a new operator to make it look nicer

0.1 +~ 0.2 === 0.3  // true

The contract of Number.prototype.decimalAdd is: add the binary floats as if adding their decimal equivalent, and return the binary float corresponding to the result.

I claim that this can be implemented efficiently. If the efficiency of this operation is the crux, I'm happy to go into this further.


My point is that the concept of "exactness" does not imply that Decimal128 is required as a numeric data type. Exactness can be achieved with Float64 in both representation (via the equivalence relation) and arithmetic (via either developer education or helper functions on Number).

@jessealama
Copy link
Collaborator

I agree that the 0.1 + 0.2 === 0.3 example can be worked around. Some initial thoughts:

  • Can this approach be generalized? Is there a general way to get results we want, or do we need to hard-code the desired precision (in this case, the precision is 1e-2)?

  • Does this approach generally work if the number and variety of operations increases? Here, we're adding two floats. Are we sure that this works for adding, say, 10 floats? What if we throw in multiplication or division?

I agree that more education is needed. At a minimum, I hope this GitHub issue gets indexed and JS programmers discover this and can learn something here. There are a number of websites out there that explain what binary floats are and some tips & tricks, such as suitable use of toFixed or toPrecision, to get around the problems.

@waldemarhorwat
Copy link

waldemarhorwat commented Oct 12, 2024

Regarding adding toPrecision intermediate stages: The fundamental problem with this approach is multiple rounding. It's difficult even for experts to analyze code that does multiple stages of rounding. Here's an example:

(1.15).toPrecision(2) returns "1.1"
(1.1 + 0.05).toPrecision(2) returns "1.2"

OK, so you then try this in stages where in the first stage you do toPrecision(3) and in the next stage you do toPrecision(2). But that's still wrong:

Number(1.05.toPrecision(3)).toPrecision(2) returns "1.1"
Number(1.15.toPrecision(3)).toPrecision(2) returns "1.1"
Number(1.25.toPrecision(3)).toPrecision(2) returns "1.3"

so it seems at first glance to do round-ties-to-odd. But wait!

Number(1.35.toPrecision(3)).toPrecision(2) returns "1.4"

Such multiple rounding is a well-known antipattern among folks familiar with numerics. It's to be strongly avoided except in cases where either you prefer fast over correct arithmetic (in which case you're probably using Numbers anyway) or you can construct a proof that it works.

If you're not using a decimal library, best practices are to scale the numbers into integral Numbers or BigInts and do arithmetic on those.

@sffc
Copy link
Author

sffc commented Oct 12, 2024

The concept of "rounding errors" is misplaced. You aren't "rounding" your Number so long as toPrecision has more digits of precision than is intended to be represented. You're "rounding away" binary float artifacts. If done correctly, calling toPrecision at each step is totally safe and well defined. If in doubt, call it with an argument of 15.

@waldemarhorwat
Copy link

That response is factually incorrect. Try my examples replacing the 3 with 15 and you'll still get wrong answers.

@sffc
Copy link
Author

sffc commented Oct 13, 2024

I'm not following what you're trying to say, then. In your post, all of these are identities:

Number(1.05.toPrecision(3))
Number(1.15.toPrecision(3))
Number(1.25.toPrecision(3))

You could have made the same point about how binary floats "sometimes round up, sometimes round down" by writing

1.05.toPrecision(2) returns "1.1"
1.15.toPrecision(2) returns "1.1"
1.25.toPrecision(2) returns "1.3"
1.35.toPrecision(2) returns "1.4"

My point is that any N round-trips through Number(N.toPrecision(Q)) so long as N is in range and has Q <= 15 significant digits, and you can use this fact to safely perform decimal-like arithmetic using binary floats.

@sffc
Copy link
Author

sffc commented Oct 13, 2024

I think the following ultra simple function works for typical use cases:

Number.prototype.decimalAdd = function(operand) {
    return Number((this + operand).toPrecision(15));
}

Note: Despite calling a rounding function at each step, this code prevents the accumulation of "rounding errors" because the toPrecision serves only to remove binary float artifacts.

Observation: Cases such as "0.1 + 0.2" should always return either the expected decimal equivalent number or the number exactly one epsilon up or down. The binary answer should never differ from the decimal answer by more than one epsilon. I do not have a proof for this theorem.

To-do: find something more clever than hard-coding 15, such as deriving the correct precision value from the binary floats.

@sffc
Copy link
Author

sffc commented Oct 13, 2024

What is an "epsilon"? It is the distance between a reference binary float and the binary float that is exactly 1 tick up or down the number line from the reference. For example, to get the next binary float up or down, something like this does the job almost all of the time (exception: when on the boundary between two powers of 2):

Number.prototype.nextFloat = function() {
    let buffer = new ArrayBuffer(8);
    let floatArray = new Float64Array(buffer);
    let intArray = new BigInt64Array(buffer);
    floatArray[0] = this;
    intArray[0] += 1n;
    return floatArray[0];
}

With that function, you can observe the following:

0.1 + 0.2 === 0.3.nextFloat()  // true

Another fun little function: count the number of discrete floats between two numbers (same constraint as above: don't cross a power of 2):

function countFloatsBetween(a, b) {
    let buffer = new ArrayBuffer(16);
    let floatArray = new Float64Array(buffer);
    let intArray = new BigInt64Array(buffer);
    floatArray[0] = a;
    floatArray[1] = b;
    return intArray[1] - intArray[0];
}

Example: take 0.3 and the next decimal number at 15 significant digits:

countFloatsBetween(0.3, 0.300000000000001)
// 18n

// Observe how the number of epsilons is not constant:
[
    countFloatsBetween(0.1, 0.100000000000001),
    countFloatsBetween(0.2, 0.200000000000001),
    countFloatsBetween(0.3, 0.300000000000001),
    countFloatsBetween(0.4, 0.400000000000001),
    countFloatsBetween(0.5, 0.500000000000001),
    countFloatsBetween(0.6, 0.600000000000001),
    countFloatsBetween(0.7, 0.700000000000001),
    countFloatsBetween(0.8, 0.800000000000001),
    countFloatsBetween(0.9, 0.900000000000001),
]
// [72n, 36n, 18n, 18n, 9n, 9n, 9n, 9n, 9n]

// It resets at powers of 10 because we move the rightmost digit to the left:
[
    countFloatsBetween(1, 1.00000000000001),
    countFloatsBetween(9, 9.00000000000001),
    countFloatsBetween(10, 10.0000000000001),
    countFloatsBetween(99, 99.0000000000001),
    countFloatsBetween(100, 100.000000000001),
]
// [45n, 6n, 56n, 7n, 70n]

My little decimalAdd function works because the artifact of binary float arithmetic, which by my theorem is always at most 1 epsilon, is always less than the difference between two adjacent decimals with 15 significant digits.

You can also see how we could be more clever when picking the number of digits when performing the operation. It should be quick to determine whether 15, 16, or 17 digits should be used for a particular number. I picked 15 because I think it always works.


This post should also serve to answer @jessealama's two questions: this approach generalizes up to the capacity of the binary float in decimal space, which is 15-17 significant digits; and yes, this generally works as the number and variety of operations increases.

@sffc
Copy link
Author

sffc commented Oct 25, 2024

I will note that the Number.prototype.decimalAdd function written above achieves basically the same thing as Math.sumPrecise (https://github.com/tc39/proposal-math-sum). If we were to do something like add a higher-level type or a special operator, it can use the same algorithm as Math.sumPrecise.

Actually no, it's not the same problem. Sorry. There is a Test262 which does the opposite of this.

assert.sameValue(Math.sumPrecise([1e308, 1e308, 0.1, 0.1, 1e30, 0.1, -1e30, -1e308, -1e308]), 0.30000000000000004);

So we would need a different algorithm for my decimalAdd function.

@Qyinwtocs
Copy link

Qyinwtocs commented Dec 13, 2024

@sffc The way you implement decimalAdd isn't correct. For the example below,

(1.551).decimalAdd(-1.55); // returns 0.00099999999999989 instead of 0.001

it is supposed to return 0.001, but it is actually 0.00099999999999989, which is 507 binary64 values away from 0.001.
If there is a decimalAdd(x, y) operation for Numbers, I suppose that it will return 𝔽(ToDecimal(x) + ToDecimal(y)), where ToDecimal(x) returns the mathematical value represented by ToString(x).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants