Improve compare for IntSet and IntMap #1086

Merged
meooow25 merged 8 commits into haskell:master from intset-intmap-compare on Jan 30, 2025

Conversation

Contributor

@meooow25 meooow25 commented Jan 4, 2025

Compare the trees directly instead of converting to lists.
The implementation follows broadly the same approach as the previous attempt in commit 7aff529.
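
As a reference point, the ordering that has to be preserved is the lexicographic order of the ascending element lists. A minimal sketch of that specification, using only the public `Data.IntSet` API (the name `compareViaLists` is made up for illustration and is not taken from the library source):

```haskell
import qualified Data.IntSet as IS

-- Hypothetical reference implementation: materialise both sets as
-- ascending lists and compare those lists lexicographically.
compareViaLists :: IS.IntSet -> IS.IntSet -> Ordering
compareViaLists a b = compare (IS.toAscList a) (IS.toAscList b)
```

Any direct tree comparison has to agree with this function on every pair of sets.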

Closes #470, closes #787.

Benchmarks with GHC 9.10:

Map

Name       Time - - - - - - - -    Allocated - - - - -
                A       B     %         A      B     %
compare    167 μs   32 μs  -80%    767 KB    0 B   -100%

Set

Name              Time - - - - - - - -    Allocated - - - - -
                       A       B     %         A      B     %
compare:dense      55 μs  412 ns  -99%    640 KB   12 B   -99%
compare:sparse     87 μs   26 μs  -70%    893 KB    0 B   -100%

@meooow25 meooow25 force-pushed the intset-intmap-compare branch 2 times, most recently from 1395871 to 4b2a3f0 on January 5, 2025 02:20
Contributor

@treeowl treeowl left a comment

I'm having trouble understanding why we need a fancy result type for the go function. What's going on here?

@meooow25
Contributor Author

meooow25 commented Jan 8, 2025

For a lexicographical comparison of the elements it's not enough to use Ordering. Suppose we have

t1 = [0]   [2]
t2 = [0,1] [2]

then comparing the left branches (compare [0] [0,1]) would return LT, but the overall result should be GT: the full element lists are [0,2] and [0,1,2], which first differ at 2 versus 1.
Detecting whether one set is a prefix of the other lets us handle this correctly.
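
A minimal sketch of that idea, with plain sorted lists standing in for the two branches of a node. This is not the code from the PR: the type name `Order` matches the one in the `orderTips` signature, but the constructor names and the helpers `orderLists`, `combine`, `orderHalves`, and `toOrdering` are invented here. It assumes both trees split at the same point, that every element of a right half is larger than every element of either left half, and that right halves are non-empty.

```haskell
-- Five-way comparison result (constructor names are hypothetical).
data Order = ALess | APrefixOfB | Equal | BPrefixOfA | AGreater
  deriving (Eq, Show)

-- Lexicographic comparison of two lists that also reports when one
-- list is a proper prefix of the other.
orderLists :: Ord a => [a] -> [a] -> Order
orderLists [] [] = Equal
orderLists [] _  = APrefixOfB
orderLists _  [] = BPrefixOfA
orderLists (x:xs) (y:ys) = case compare x y of
  LT -> ALess
  GT -> AGreater
  EQ -> orderLists xs ys

-- Combine the result for the left halves with the result for the
-- right halves, under the assumptions stated above.
combine :: Order -> Order -> Order
combine Equal      r = r
-- A's left half ran out first, so A's next element comes from its
-- right half and is larger than B's next (left-half) element.
combine APrefixOfB _ = AGreater
combine BPrefixOfA _ = ALess
combine o          _ = o

orderHalves :: ([Int], [Int]) -> ([Int], [Int]) -> Order
orderHalves (l1, r1) (l2, r2) = combine (orderLists l1 l2) (orderLists r1 r2)

-- Collapse to an ordinary Ordering once the whole structure is compared.
toOrdering :: Order -> Ordering
toOrdering ALess      = LT
toOrdering APrefixOfB = LT
toOrdering Equal      = EQ
toOrdering BPrefixOfA = GT
toOrdering AGreater   = GT
```

With the example above, `orderLists [0] [0,1]` reports a prefix rather than a plain "less than", so `orderHalves ([0],[2]) ([0,1],[2])` can turn it into "greater" once it knows the first tree continues with its right half.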

@treeowl
Contributor

treeowl commented Jan 8, 2025

For a lexicographical comparison of the elements it's not enough to use Ordering. Suppose we have
[...]
Oh right ... That makes sense.

@treeowl
Contributor

treeowl commented Jan 8, 2025

The constructor names of the result in this context seem rather opaque to me. I assume this is reusing a type from somewhere else? Maybe we should make a new one.

@meooow25
Contributor Author

meooow25 commented Jan 9, 2025

The type is introduced in this PR. The constructor names are from the previous attempt (7aff529); I don't mind changing them if you have suggestions.

@meooow25
Contributor Author

What names would you prefer, @treeowl?

@meooow25 meooow25 mentioned this pull request Jan 10, 2025
8 tasks
@meooow25 meooow25 force-pushed the intset-intmap-compare branch from 97fcae1 to 8203980 on January 18, 2025 06:16
@meooow25
Contributor Author

Okay, this is good to go.
@treeowl, would you still like me to change the names?

Compare the trees directly instead of converting to lists.
The implementation follows broadly the same approach as the previous
attempt in commit 7aff529.
Greatly simplifies the top-level code.
@meooow25 meooow25 force-pushed the intset-intmap-compare branch from 8203980 to 030fb5e on January 23, 2025 16:04
Contributor

@treeowl treeowl left a comment

I think that's much better, but a bit more documentation is in order.

-- instead and get the same Core.
data Tip' = Tip' {-# UNPACK #-} !Int {-# UNPACK #-} !BitMap

leftmostTipSure :: IntSet -> Tip'
Contributor

Please add a Haddock string.

Contributor Author

Why? These functions are not exposed.

Contributor

My general philosophy is that every top-level function and every non-trivial function should be fully documented. For leftmostTipSure, I recognize that the name really gives it away, but I'm stubborn. orderTips must surely have a documentable purpose, with some expectations about what its arguments will mean and some description of what its result means.

Contributor Author

Sorry, I disagree. There is nothing to be gained by adding noise to internal functions with a self-explanatory name and type.
If you insist on this, please provide the doc strings you would like them to have.

Contributor Author

@treeowl do you still want to add doc strings?

Contributor

I'll try to write one for compareTips by tomorrow night. But otherwise I guess you can merge and I'll open an issue to remember.

Contributor Author

I'll go ahead and merge it then.

leftmostTipSure (Tip k bm) = Tip' k bm
leftmostTipSure Nil = error "leftmostTipSure: Nil"

orderTips :: Int -> BitMap -> Int -> BitMap -> Order
Contributor

Please add a Haddock string.
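
To make concrete the kind of contract such a comment on `orderTips` could describe, here is a minimal sketch of the bitmap half of a tip comparison. It is not the code from this PR: the real function also takes the two `Int` keys, whereas this sketch assumes both tips share the same key and that bit `i` of a bitmap stands for the `i`-th smallest element the tip can hold. The constructor names and the name `orderBitmaps` are invented for illustration, and `Word64` stands in for `BitMap`.

```haskell
import Data.Bits (countTrailingZeros, shiftL, testBit, xor, (.&.))
import Data.Word (Word64)

-- Five-way comparison result (constructor names are hypothetical).
data Order = ALess | APrefixOfB | Equal | BPrefixOfA | AGreater
  deriving (Eq, Show)

-- Compare two bitmaps as ascending lists of their set bit positions.
orderBitmaps :: Word64 -> Word64 -> Order
orderBitmaps a b
  | diff == 0   = Equal
  -- A holds the first differing element. If B has nothing at or above
  -- that position, B's elements form a proper prefix of A's.
  | testBit a i = if b .&. fromI == 0 then BPrefixOfA else ALess
  -- Symmetric case: B holds the first differing element.
  | otherwise   = if a .&. fromI == 0 then APrefixOfB else AGreater
  where
    diff  = a `xor` b
    i     = countTrailingZeros diff        -- lowest position where they differ
    fromI = maxBound `shiftL` i :: Word64  -- mask of positions >= i
```

For example, the bitmaps for {0} and {0,1} are 1 and 3, and `orderBitmaps 1 3` reports that the first is a prefix of the second.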

@meooow25 meooow25 merged commit 0d85628 into haskell:master Jan 30, 2025
13 checks passed
@meooow25 meooow25 deleted the intset-intmap-compare branch January 30, 2025 14:08
Development

Successfully merging this pull request may close these issues.

Improve Ord IntSet instance
Improve Ord instances for IntSet and IntMap
2 participants