Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Handle many more intrinsics in Bounds.cpp #7823

Merged
merged 11 commits into from
Dec 1, 2023
Merged

Conversation

steven-johnson
Copy link
Contributor

This addresses many (but not all) of the signed integer overflow issues we're seeing in Google due to #7814 -- a lot of the issues seems to be in code that uses intrinsics that had no handling in value bounds checking, so the bounds were naively large and overflowed.

  • Most of the intrinsics from FindIntrinsics.h weren't handled; now they all are (most by lowering to other IR, though the halving_add variants were modeled directly because the bitwise ops don't mesh well)
  • strict_float() is just a pass-through
  • round() is a best guess (basically, if bounds exist, expand by one as a worst-case)

There are definitely others we should handle here... trunc/floor/ceil probably?

This addresses many (but not all) of the `signed integer overflow` issues we're seeing in Google due to #7814 -- a lot of the issues seems to be in code that uses intrinsics that had no handling in value bounds checking, so the bounds were naively large and overflowed.

- Most of the intrinsics from FindIntrinsics.h weren't handled; now they all are (most by lowering to other IR, though the halving_add variants were modeled directly because the bitwise ops don't mesh well)
- strict_float() is just a pass-through
- round() is a best guess (basically, if bounds exist, expand by one as a worst-case)

There are definitely others we should handle here... trunc/floor/ceil probably?
@steven-johnson steven-johnson requested a review from abadams August 29, 2023 23:43
src/Bounds.cpp Outdated
} else if (op->is_intrinsic(Call::strict_float)) {
internal_assert(op->args.size() == 1);
interval = arg_bounds.get(0);
} else if (op->is_intrinsic(Call::round)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Round was already handled below more tightly than this (you can just round the min and max). strict_float should probably be handled in the same place. i.e. we should probably preserve the strict_float wrapper around the min and max so that floating point optimizations don't happen to the min and max in a way that makes them no longer contain the original expression.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I missed the Call::round handling below -- I think the case I saw didn't pass the (interval = arg_bounds.get(0)).is_bounded() test and that's why I thought it was missing.

@@ -1468,6 +1480,7 @@ class Bounds : public IRVisitor {
}
} else if (op->args.size() == 1 &&
(op->is_intrinsic(Call::round) ||
op->is_intrinsic(Call::strict_float) ||
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's going to be a merge conflict here because Call::saturating_cast is in the same category. Probably should add it in this PR in case the other one doesn't go in and we revert the u32 -> i32 cast change.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I take it back! saturating_cast doesn't belong here.

src/Bounds.cpp Outdated
} else if (op->is_intrinsic(Call::halving_add)) {
// lower_halving_add() uses bitwise tricks that are hard to reason
// about; let's do this instead:
Expr e = narrow((widen(op->args[0]) + widen(op->args[1])) / 2);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at the bot failure, I suspect this is trying to widen a 64-bit input

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, well, ok, but this is literally the fallback implementation for it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I don't know how to make the bitwise op handling robust enough to handle this correctly)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess the right way to handle this is to special-case 64-bit and use the min/max possible

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a branch that (among other things) tacked on bounds inference for many of these intrinsics - I did have to special-case any intrinsic that semantically widens if the arguments are 64 bit, and there were a few that would produce a double-widening so had to be even further special-cases (I think rounding_mul_shift_right lowers to a widening mul followed by a rounding shift right that lowers to a widening add or something like that, so it would double-widen

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have a branch

please share! These changes make for much better bounds inference in some cases (esp pipelines with fixed-point math); if your fixes are better than these we should take yours.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I need to severely clean it up - that's part of why I never opened a PR. I can try to clean it up and share, might take me a few days unfortunately - I am about to be traveling for a funding thing.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if it's ugly, feel free to put it somewhere I can look at it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

// about; let's do this instead:
if (op->type.bits() == 64) {
bounds_of_type(t);
} else if (op->is_intrinsic(Call::widen_right_add)) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think we need this many safety checks for the widening operations. Any Expr in a widening op needs to be able to be widened - we can't lift to widening_mul unless a user widened the inputs. We only need to be careful with operations that we can lift to without widening operations, but that the "simple" lowering pattern involves widening.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, yes and no -- it's true a that the Exprs in a widening op need to be widened, and well-formed code shouldn't pass us cases that don't fit; that said, we absolutely will get misuse in that way, so what should we do when that happens? IMHO we are better off checking for it an explicitly devolving to bounds-of-type, rather than risking that the bounds-calc code makes a mistake and calculates an inappropriate bound due to inadvertent overflow.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, I think I misunderstood the use case. Is this for when users write code that produces a LUT index, and uses intermediate 64 bit types?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this for when users write code that produces a LUT index

Yes? I mean, we have no control of what the user is doing with these functions; they could pass it insane nonsense, so we need to be somewhat defensive here. We'd prefer to avoid a too-loose bounds, but we absolutely cannot risk getting too-tight bounds.

@steven-johnson
Copy link
Contributor Author

@rootjalex -- obviously this didn't get landed and I was out all last week; when you get a chance, we should put our heads together to figure out how to combine our approaches.

@rootjalex
Copy link
Member

@rootjalex -- obviously this didn't get landed and I was out all last week; when you get a chance, we should put our heads together to figure out how to combine our approaches.

I am absolutely swamped this week, and have a 9/19 deadline. Happy to discuss at the dev meeting (I should be able to attend this week, I hope...), or sometime after 9/19. Sorry about that

@steven-johnson
Copy link
Contributor Author

Hey @rootjalex, I'm not going to have time to sync up with you on this for a bit -- you are welcome to take over this PR and combine it with your own as you see fit (or ignore it entirely if yours looks better); otherwise this will likely sit unfinished until sometime in November

@steven-johnson
Copy link
Contributor Author

This has been sitting here a while. Where does it stand? Does it need more work?

@abadams
Copy link
Member

abadams commented Nov 28, 2023

I think this one is a partial fix for problems identified in #7814

@steven-johnson
Copy link
Contributor Author

This PR is definitely not a complete fix, but I think it is worthy of landing as a partial fix (pending checking in Google) -- WDYT?

src/Bounds.cpp Outdated
@@ -41,6 +41,36 @@ using std::string;
using std::vector;

namespace {

bool can_widen(const Expr &e) {
return e.type().bits() < 64;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should probably be <= 32. I'm thinking of the 48 bit types in the xtensa backend.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done

@rootjalex
Copy link
Member

This PR is definitely not a complete fix, but I think it is worthy of landing as a partial fix (pending checking in Google) -- WDYT?

I think that's fine. Sorry for going AWOL - I had a big deadline recently that took all of my time.

@steven-johnson
Copy link
Contributor Author

Hmm, this injects a lot of signed-integer-overflow failures in google3... I'll need to do some debugging.

@steven-johnson
Copy link
Contributor Author

My implementation of saturating_cast was misguided, I just reverted it entirely for now

@steven-johnson steven-johnson merged commit 4fc2a7d into main Dec 1, 2023
3 checks passed
@steven-johnson steven-johnson deleted the srj/intrinsics-bounds branch December 1, 2023 00:31
ardier pushed a commit to ardier/Halide-mutation that referenced this pull request Mar 3, 2024
* Handle many more intrinsics in Bounds.cpp

This addresses many (but not all) of the `signed integer overflow` issues we're seeing in Google due to halide#7814 -- a lot of the issues seems to be in code that uses intrinsics that had no handling in value bounds checking, so the bounds were naively large and overflowed.

- Most of the intrinsics from FindIntrinsics.h weren't handled; now they all are (most by lowering to other IR, though the halving_add variants were modeled directly because the bitwise ops don't mesh well)
- strict_float() is just a pass-through
- round() is a best guess (basically, if bounds exist, expand by one as a worst-case)

There are definitely others we should handle here... trunc/floor/ceil probably?

* Fix round() and strict_float() handling

* Update Bounds.cpp

* Fixes?

* trigger buildbots

* Revert saturating_cast handling

* Update Bounds.cpp

---------

Co-authored-by: Andrew Adams <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants