Commit

Copyedits
linas committed Jul 16, 2022
1 parent b4c303b commit 46a2d31
Showing 1 changed file with 20 additions and 19 deletions.
39 changes: 20 additions & 19 deletions link-grammar/parse/count.c
@@ -933,42 +933,42 @@ static Count_bin do_count(int lineno, count_context_t *ctxt,
 /*
  * See do_parse() for the purpose of this function.
  *
- * The returned number of parses (called here "count") is a 32-bit
- * integer. However, this count may sometimes be very big - much more than
- * can be represented in 32-bits. In such a case it is just enough to know
- * that such an "overflow" occurred. Internally, big counts are clamped to
+ * The returned number of parses (called "count" here) is a 32-bit
+ * integer. However, this count may sometimes be very big - much larger than
+ * can be represented in 32-bits. In such a case, it is just enough to know
+ * that such an "overflow" occurred. Internally, large counts are clamped to
  * INT_MAX (2^31-1) - see parse_count_clamp() (we refer below to such
  * values as "clamped"). If the top-level do_count() (the one that is
- * called from do_parse()) returns this value, it means such an overflow
+ * called from do_parse()) returns this value, it means that an overflow
  * has occurred.
  *
  * The function uses a 64-bit signed integer as a count accumulator - named
  * "total". The maximum value it can hold is 2^63-1. If it becomes greater
- * than INT_MAX, it is considered as a count overflow. A care should be
- * taken that this total itself would not overflow, else this detection
- * mechanism would be rendered useless. To that end, each value from which
- * this total is computed should be small enough so it would not overflow.
+ * than INT_MAX, it is considered as a count overflow. Care must be
+ * taken that this total itself does not overflow, else this detection
+ * mechanism would malfunction. To that end, each value from which
+ * this total is computed must be small enough so it does not overflow.
  *
  * The function has 4 code sections to calculate the count. Each of them,
  * when entered, returns a value which is clamped (or doesn't need to be
  * clamped). The are marked in the code with "Path 1a", "Path 1b",
  * "Path 2", and "Path 3".
  *
  * Path 1a, Path 1b: If there is a possible linkage between the given
- * words, return 1, else return 0. Here a count overflow cannot occur.
+ * words, return 1, else return 0. Here, a count overflow cannot occur.
  *
- * Path 2: The total accumulate the result of the do_count() invocations
+ * Path 2: The total accumulates the result of the do_count() invocations
  * that are done in a loop. The upper bound on the number of iterations is
- * twice (out loop) the maximum number of word disjuncts )inner loop).
+ * twice (outer loop) the maximum number of word disjuncts (inner loop).
  * Assuming no more than 2^31 disjuncts per word, and considering that
  * each value is a result of do_count() which is clamped, the total is
  * less than (2*2^31)*(2^31`-1), which is less than 2^63-1, and hence just
  * needs to be clamped before returning.
  *
  * Path 3: The total is calculated as a sum of series of multiplications.
  * To prevent its overflow, we ensure that each term (including the total
- * itself) would not be greater than INT_MAX (2^31-1), so the result will
- * not be more than (2^31-1)+((2^31-1)*(2^31-1)) which is less than
+ * itself) would not be greater than INT_MAX (2^31-1). Then the result will
+ * not be more than (2^31-1)+((2^31-1)*(2^31-1)), which is less than
  * 2^63-1. In this path, each multiplication term that may be greater then
  * INT_MAX (leftcount and rightcount) is clamped before the
  * multiplication, and the total is clamped after the multiplication.
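
The two bounds quoted in the comment for Path 2 and Path 3 can be checked directly with 64-bit arithmetic. The following is only a sketch: it assumes a platform where int is 32 bits (so INT_MAX is 2^31-1), and the helper names path2_upper_bound() and path3_upper_bound() are hypothetical, introduced here for illustration and not taken from count.c.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Hypothetical helper (not from count.c): the Path 2 bound.
 * At most 2 * 2^31 loop iterations, each adding a value that has
 * already been clamped to INT_MAX. */
static int64_t path2_upper_bound(void)
{
	const int64_t max_iterations = 2 * (INT64_C(1) << 31);
	return max_iterations * (int64_t)INT_MAX;
}

/* Hypothetical helper (not from count.c): the Path 3 bound.
 * A clamped running total plus one product of two clamped counts. */
static int64_t path3_upper_bound(void)
{
	return (int64_t)INT_MAX + (int64_t)INT_MAX * (int64_t)INT_MAX;
}
```

Both results stay below INT64_MAX (2^63-1), which is the property the comment relies on: the 64-bit accumulator cannot wrap before it is clamped.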
@@ -1419,11 +1419,12 @@ static Count_bin do_count(
  * we know that the true total is zero. So we don't
  * bother counting the other term at all, in that case. */

-/* To enable 31-bit overflow detection, total, leftcount and
- * rightcount are signed 64-bit, and are , a clamped cached
- * value, or are clamped below before they are used. total is
- * initially 0 and is clamped at the end of each iteration.
- * So the result will not be more than (2^31-1)+((2^31-1)*(2^31-1))
+/* To enable 31-bit overflow detection, `total`, `leftcount`
+ * and `rightcount` are signed 64-bit, and are clamped cached
+ * values, or are clamped below before they are used. `total`
+ * is initially 0 and is clamped at the end of each iteration.
+ * So the result will never be more than
+ * (2^31-1)+((2^31-1)*(2^31-1)),
  * which is less than 2^63-1. */
 if (leftpcount &&
     (!lcnt_optimize || rightpcount || (0 != hist_total(&l_bnr))))
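
The clamping discipline described in this comment can be illustrated in isolation. The sketch below is not the code from count.c: clamp_count() is a hypothetical stand-in for the clamping step (the real parse_count_clamp() is not shown in this diff), sum_of_products() is an invented name, and the inputs are assumed to be non-negative counts.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the clamping step: any count above
 * INT_MAX is reported as INT_MAX, meaning "overflow occurred". */
static int64_t clamp_count(int64_t count)
{
	return (count > INT_MAX) ? INT_MAX : count;
}

/* Hypothetical illustration of the Path 3 pattern: a sum of products
 * where both factors are clamped before multiplying and the running
 * total is clamped at the end of each iteration. Assumes all inputs
 * are non-negative. */
static int64_t sum_of_products(const int64_t *left, const int64_t *right,
                               size_t n)
{
	int64_t total = 0;

	for (size_t i = 0; i < n; i++)
	{
		int64_t leftcount = clamp_count(left[i]);
		int64_t rightcount = clamp_count(right[i]);

		/* total <= INT_MAX here, and the product is at most
		 * INT_MAX * INT_MAX, so the sum fits in 64 bits. */
		total += leftcount * rightcount;
		total = clamp_count(total);
	}
	return total;
}
```

With small inputs the function returns the exact sum of products. Once any intermediate value exceeds INT_MAX, the result stays pinned at INT_MAX, which a caller can treat as the overflow signal.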