Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stdlib] Optimize normalize_index for unsigned types #3957

Open
wants to merge 1 commit into
base: main
Choose a base branch
from

Conversation

yinonburgansky
Copy link
Contributor

@yinonburgansky yinonburgansky commented Jan 18, 2025

Use the Indexer trait in normalize_index to optimize for UInt, UInt8, UInt16, UInt32, and UInt64 types.

@yinonburgansky

This comment has been minimized.

@yinonburgansky

This comment has been minimized.

@yinonburgansky

This comment has been minimized.

@yinonburgansky
Copy link
Contributor Author

As a side note:
Currently there is an implicit casting between UInt and Int:

fn main():
    # binary operators use the method of left side type
    print(Int(-1) < UInt(1<<63)) # False 
    print(UInt(1<<63) > Int(-1)) # False
    print(UInt(1) > Int(-1)) # False
    print(Int(-1) < UInt(1)) # True

@yinonburgansky

This comment has been minimized.

@yinonburgansky yinonburgansky force-pushed the normalize_indexer branch 2 times, most recently from 3cbdfbb to 08417d9 Compare January 21, 2025 08:34
@JoeLoser JoeLoser requested a review from jackos January 22, 2025 22:06
@yinonburgansky
Copy link
Contributor Author

yinonburgansky commented Jan 23, 2025

I've found out that by doing the bounds check after the index normalization we can avoid a comparison
and support negative indexing up to UInt.MAX in a performant way.
The compiler is smart enough to optimize the index normalization into branchless index += (index >> 63) & len and the bounds check for Int and UInt are the same.
so the only cost of using negative indexing now is just a few ALU operations.

This is also True for SIMD so if needed we can add support UInt8.MAX sized containers and use negative Int8 indexes without upcasting or doing more expensive bounds check, the only cost is the normalization which can be done branchless.

See https://godbolt.org/z/nMhz6dWhh

@yinonburgansky yinonburgansky force-pushed the normalize_indexer branch 2 times, most recently from 8878984 to 2581aa0 Compare January 23, 2025 08:46
@yinonburgansky yinonburgansky changed the title [stdlib] Use Indexer in normalize_index [stdlib] Optimize normalize_index for unsigned types Jan 23, 2025
@skongum02 skongum02 deleted the branch modular:main January 29, 2025 18:59
@skongum02 skongum02 closed this Jan 29, 2025
@skongum02 skongum02 reopened this Jan 29, 2025
@skongum02 skongum02 changed the base branch from nightly to main January 29, 2025 20:33

@parameter
if (
_type_is_eq[IdxType, UInt]()
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This long chain of type comparisons makes me thing we want an UnsignedInteger trait. However, I'd like to see a bounds check that UInt is large enough, ideally a fast path at compile time using .MAX and a slower path at runtime. UInt is large enough for all values right now, so this is a forward-looking change for UInt128 and UInt256 (which cryptography wants).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

added a compile time check for sizeof index <= sizeof length.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

removed the check since it is wrong, since a big size struct can implement the Indexer trait and return a single property as the index.
By the definition of the current indexer trait it has to be convertible to mlir_index which represent the hardware width integer, so currently we can't really have UInt128 index in 64bit hardware.

)
return i
else:
var i = UInt(index(idx))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This cast should remove values < 0, so the next check is always true.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This does a bitcast to control where we do Signed/Unsigned comparison in the following code.
On the hardware it uses base 2 complement, so addition is the same for signed and unsigned, but comparison is different. It does work, see the unit tests.
I added an intermediate mlir_index to make it more clear.

ContainerType: Sized, //, container_name: StringLiteral
](idx: Int, container: ContainerType) -> Int:
IdxType: Indexer, //, container_name: StringLiteral
](idx: IdxType, length: Int) -> Int:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

length should probably be a UInt in most cases. I can't think of many reasons to have it be and using Int for length gets rid of a lot of address space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So currently there is an implicit cast between Int and UInt and you have to be very careful when mixing them (see example in the comment above for what can go wrong)
The rational for adding another variant is to make working with UInt or Int a decision for the user, the returned index type will be the same as the user container length type.
This open the possibility for SIMD sized container e.g. UInt8 sized container should get back UInt8 type index.
Since currently all stdlib containers are Int sized, we need the Int variant.

@yinonburgansky yinonburgansky force-pushed the normalize_indexer branch 2 times, most recently from 3d86089 to d12ef3d Compare February 1, 2025 17:35
Use the Indexer trait in normalize_index to optimize for UInt, UInt8, UInt16, UInt32, and UInt64 types.

Signed-off-by: Yinon Burgansky <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants