-
Notifications
You must be signed in to change notification settings - Fork 180
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Cleanup after IntMap rewrite #698
Comments
We may want to revisit the recent decision to make
|
@gereeter, please also let me know how you wish to be credited in the changelog and release announcement. |
I keep wondering whether making the tree go entirely in order is really the right way. The obvious alternative is for the order to alternate from level to level, so that keys go to (say) the left whenever they're closer to the bound stored in the current node, whether that's a lower bound or an upper bound. That would (potentially) go along with alternating whether to flip the sign of the bound when storing it, and flipping the sign of the key on each step down. I don't know if that would actually work or not, but it strikes me as plausible. Edit: the sign flipping would be done using |
My understanding is that the new In consequence I believe that we don't lose anything by keeping |
We don't get the weird order reversal of children, but we now have values in internal nodes, and their positions in the order depend on their level. To go strictly in order from a node whose bound is a maximum, we have to push the right subtree and the bound's associated value onto the stack before descending the left subtree. I don't know for sure if we can improve matters or not, but we might save a little by dealing with those values "as they come" to the extent we can. That surely applies to |
My latest commits that have benchmark data just used the output of
Tricky: I actually did this back when the work was out-of-tree, and although I'm sure with time (and newer GHC) one could press the optimizations more and make different design choices, it was slower than stock
And #653. I was putting it off as non-essential, but especially since
The two big documentation blocks I was working to get done before merging were about deletions and merges, and
Aw, darn. I was wrapping my documentation at 100 characters. That at least is easy to fix. While there are definitely easy-to-wrap lines, I'm not so sure about some of the worst offenders. Maybe make an effort pre-merge but don't worry about it as a blocker?
I vaguely remember finding the trivial two-pass algorithm surprisingly faster that a specialized implementation? But it has been a while, and my memory is fuzzy. This will probably be pre-merge just because I'm curious now.
Pre-merge:
Post-merge:
|
That's why I implemented |
Additions for "before merge":
Can someone estimate what kind of speedup we can expect from this? If it's, say, < 20%, I'd say it can wait.
+0.5 for adding documentation before the merge – just in case @gereeter gets hit by a bus.
+0.1 for doing this before the merge in order to reduce noise in
+0.1 as above
This can easily be delayed IMHO |
Yeah, that's fine.
The most conservative approach, I think, is to match
#653 isn't merged yet because it's not a pure win and I froze up trying to make a decision. Your
You've definitely gone a long way toward getting documentation in shape. Thanks.
Hard-to-wrap lines are definitely a low priority.
The generic version will always be better in the "key absent, don't insert" case, because we don't have anything like a fast
I'll add those. |
Thanks.
With the previous representation, the performance improvement was small, but the code was (in my opinion) much easier to understand.
Of course it can, but it shouldn't take long to check and reducing the enormous quantity of source code would be very nice. |
What if the stored bounds in nodes are all minima, and the passed in ones are all maxima, with complements being taken (of bounds and keys) as necessary to maintain this? Ignoring the top level for a moment, lookup would go like this: We have the stored minimum, the passed maximum, and the key. We Would that work, @gereeter? I'd really love to cut down on the code size here, but my intuition for the structure is not yet wonderful. |
Yes, technically, but I'd expect a large performance hit for needlessly complementing all over the place. And while I have the bias of understanding what is going on now, I think it would be a lot more confusing, too. The biggest thing to do for cutting down on code size is unifying functions that do similar things ( (On that note: local functions currently don't have type signatures for the most part because of the lack of
Anything in particular? Any confusion is an opportunity for better documentation in my mind. |
I'd be surprised. Basic operations like
That seems a much more likely problem. But the source duplication of the current scheme is also a potential source of errors and confusion. It seems to take some time to figure out what flops and how.
That certainly seems a promising approach to the source duplication, though it does nothing for object code size.
Let's keep those gated for now … the current approach to scoped type variables is controversial, to say the least, with both Eisenberg and Kmett preferring a different sort. Again, a macro in
I think it could be helpful to have a few diagrams demonstrating basic operations, but those can be tough in text. Feel free to include something separate in HTML+SVG, LaTeX, or whatever, as long as it's not too big. But that can certainly wait till after the merge. |
Ah, that could be a problem for reducing duplication. Without type signatures, GHC won't infer maximally polymorphic types for local definitions and we can't use |
You may need to tie it to the type family usage to avoid a nasty |
Other random API "omission": |
Yes, we should consider that, ideally with input from the libraries list. I wouldn't call it a priority myself. |
I'd like to clarify something: the reason I care about object code size is not that it takes up space on disk or in RAM, but that it takes up space in cache and involves more code in each operation. This can have important performance consequences. I played around a bit and got -- @P C@ tags a value as representing itself;
-- @C P@ tags a value as representing its complement.
data P
data C
-- A key tagged to indicate whether it represents itself
-- or its complement.
newtype TKey t u = TKey {getKey :: Key}
deriving (Eq, Ord)
-- Each node stores the minimum value of that subtree. That may actually be
-- a *maximum* when considering what's represented.
newtype IntMap a = IntMap (IntMap_ P C a) deriving (Eq, Show)
data IntMap_ t u a
= NonEmpty {-# UNPACK #-} !(TKey t u) a !(Node u t a) | Empty deriving (Eq)
data Node t u a = Bin {-# UNPACK #-} !(TKey t u) a !(Node u t a) !(Node t u a) | Tip deriving (Eq)
deriving instance Show a => Show (IntMap_ P C a)
deriving instance Show a => Show (IntMap_ C P a)
deriving instance Show a => Show (Node P C a)
deriving instance Show a => Show (Node C P a)
-- This type is just used for convenience when defining the Show instances for TKey.
data KeyRep = PKey !Key | CKey !Key deriving Show
instance Show (TKey P C) where
showsPrec p (TKey k) = showsPrec p (PKey k)
instance Show (TKey C P) where
showsPrec p (TKey k) = showsPrec p (CKey (complement k))
-- | Take the complement of a key
compKey :: TKey t u -> TKey u t
compKey (TKey k) = TKey (complement k)
xor :: TKey t u -> TKey t u -> Word
xor (TKey k) (TKey b) = fromIntegral $ Bits.xor k b
lookup :: forall a. Key -> IntMap a -> Maybe a
lookup k0 (IntMap m0) = start (TKey k0) m0
where
start :: TKey t u -> IntMap_ t u a -> Maybe a
start !k Empty = Nothing
start k (NonEmpty min minV node)
| k < min = Nothing
| k == min = Just minV
| otherwise = go (compKey k) (xor k min) node
go :: TKey t u -> Word -> Node t u a -> Maybe a
go !k !_ Tip = Nothing
go k xorCache (Bin min minV l r)
| k > min = if xorCache < xorCacheMin
then go k xorCache r
else go (compKey k) xorCacheMin l
| k < min = Nothing
| otherwise = Just minV
where xorCacheMin = xor k min
insert :: forall a. Key -> a -> IntMap a -> IntMap a
insert k0 v (IntMap m0) = IntMap $ start (TKey k0) m0
where
start :: TKey t u -> IntMap_ t u a -> IntMap_ t u a
start k Empty = NonEmpty k v Tip
start k (NonEmpty minK minV root)
| k == minK = NonEmpty k v root
| k > minK = NonEmpty minK minV $ go (xor k minK) (compKey minK) (compKey k) root
| otherwise = NonEmpty k v $ insertMaxN (xor k minK) (compKey minK) minV root
go :: Word -> TKey t u -> TKey t u -> Node t u a -> Node t u a
go !xorCache !maxK !k Tip = Bin k v Tip Tip
go xorCache maxK k (Bin minK minV l r)
| minK < k =
if xorCache < xorCacheMin
then Bin minK minV l (go xorCache maxK k r)
else Bin minK minV (go xorCacheMin (compKey minK) (compKey k) l) r
| k < minK =
if xor minK maxK < xorCacheMin
then Bin k v Tip (Bin minK minV l r)
else Bin k v (insertMaxN xorCacheMin (compKey minK) minV l) r
| otherwise = Bin minK v l r
where
xorCacheMin :: Word
xorCacheMin = xor k minK
insertMaxN :: Word -> TKey t u -> a -> Node t u a -> Node t u a
insertMaxN !xorcache k v Tip = Bin k v Tip Tip
insertMaxN xorcache k v (Bin minK minV l r)
| xor k minK < xorcache = Bin minK minV (Bin (compKey k) v r l) Tip
| otherwise = Bin minK minV l (insertMaxN xorcache k v r) The traverseWithKey
:: forall f a b. Applicative f
=> (Key -> a -> f b) -> IntMap a -> f (IntMap b)
traverseWithKey _ (IntMap Empty) = pure empty
traverseWithKey f (IntMap (NonEmpty k v root)) =
liftA2 (\v' root' -> IntMap (NonEmpty k v' root')) (f (getKey k) v) (goC root)
where
goC :: Node C P a -> f (Node C P b)
goC Tip = pure Tip
goC (Bin k v l r) = liftA3 (\r' l' v' -> Bin k v' l' r') (goC r) (goP l) (f (complement (getKey k)) v)
goP :: Node P C a -> f (Node P C b)
goP Tip = pure Tip
goP (Bin k v l r) = liftA3 (Bin k) (f (getKey k) v) (goC l) (goP r) |
I can't figure out what LLVM is doing with this code.... I wonder if it actually expands it to yours, suggesting maybe I'm wasting my time. But benchmarking will tell, some day, maybe. |
@treeowl Could you explain a bit how you can use Also, have you actually tested your code? I'm just stumped how it could possibly work. |
@sjakobi The general gist is that we always store the minimum in each node, but that may ultimately represent its complement, depending on how many left branches we've taken (with the initial jump down considered a left branch). |
@sjakobi, here's my draft validity tester: valid :: (Eq a, Show a) => IntMap a -> Property
valid m0@(IntMap m0_) =
-- This will catch errors very reliably, but it's coarse-grained
fromList (toList m0) === m0 .&&.
-- This catches errors in a fine-grained way, but relies on being implemented
-- right.
start m0_
where
start :: IntMap_ P C a -> Property
start Empty = property ()
start (NonEmpty _minK _minV Tip) = property ()
start (NonEmpty minK _minV (Bin maxK _maxV l r)) =
go maxK (compKey minK) r .&&.
go minK (compKey maxK) l
go :: TKey t u -> TKey t u -> Node t u a -> Property
go xmin xmax _
| xmin >= xmax = counterexample "max not greater" False
go _ _ Tip = property ()
go xmin xmax (Bin minK _minV l r) =
xor xmin minK > xor xmax minK .&&.
counterexample "not between" (xmin < minK .&&. minK < xmax) .&&.
go (compKey xmax) (compKey minK) l .&&.
go minK xmax r |
Thanks to @gereeter's recent cleanups and explanations, I was finally able to adapt Complementy version of {-# INLINE compareMinBound #-}
compareMinBound :: TKey t u -> TKey t u -> BoundOrdering
compareMinBound k min
| k > min = InBound
| k < min = OutOfBound
| otherwise = Matched
data BoundOrdering = InBound | OutOfBound | Matched deriving (Eq) Complementy deletion: delete :: forall a. Key -> IntMap a -> IntMap a
delete !_ (IntMap Empty) = IntMap Empty
delete !k0 m@(IntMap (NonEmpty min minV root)) = case compareMinBound k min of
InBound -> IntMap (NonEmpty min minV (deleteNode (compKey k) (xor k min) root))
OutOfBound -> m
Matched -> IntMap (nodeToCompMap root)
where k = TKey k0
deleteNode :: TKey t u -> Word -> Node t u a -> Node t u a
deleteNode !_ !_ Tip = Tip
deleteNode !k !xorCache n@(Bin min minV l r) = case compareMinBound k min of
InBound | xorCache < xorCacheMin -> Bin min minV l (deleteNode k xorCache r)
| otherwise -> Bin min minV (deleteNode (compKey k) xorCacheMin l) r
OutOfBound -> n
Matched -> extractBinR l r
where xorCacheMin = xor k min
extractBinR :: Node u t a -> Node t u a -> Node t u a
extractBinR Tip r = r
extractBinR (Bin max maxV innerL innerR) r =
let NE min minV l = deleteMaxNode max maxV innerL innerR
in Bin (compKey min) minV l r
deleteMaxNode :: TKey t u -> a -> Node u t a -> Node t u a -> NonEmptyIntMap_ t u a
deleteMaxNode !min minV Tip Tip = NE min minV Tip
deleteMaxNode !min minV (Bin max maxV l r) Tip = NE (compKey max) maxV (Bin min minV r l)
deleteMaxNode !min minV l (Bin innerMin innerMinV innerL innerR) =
let NE max maxV inner = deleteMaxNode innerMin innerMinV innerL innerR
in NE max maxV (Bin min minV l inner)
nodeToCompMap :: Node t u a -> IntMap_ u t a
nodeToCompMap Tip = Empty
nodeToCompMap (Bin min minV innerL innerR) =
let NE max maxV r = deleteMaxNode min minV innerL innerR
in NonEmpty (compKey max) maxV r
data NonEmptyIntMap_ t u a = NE {-# UNPACK #-} !(TKey t u) a !(Node t u a) That's just about half as much code. |
Quick benchmarks of the complementing code compared to not:
I just copied your code into a project, added |
The "original" is your code for #340, right? |
Yes. |
Do you think you could upload whatever you used to run that benchmark so I can play with it? |
I find the size of these differences you report rather shocking, especially for deletion hit. |
src/IntMap.hs{-# LANGUAGE BangPatterns, RankNTypes, StandaloneDeriving, FlexibleInstances, ScopedTypeVariables #-}
module IntMap where
import Data.Bits (complement)
import qualified Data.Bits as Bits
import qualified Data.List as List
import Control.DeepSeq (NFData(..))
type Key = Int
-- @P C@ tags a value as representing itself;
-- @C P@ tags a value as representing its complement.
data P
data C
-- A key tagged to indicate whether it represents itself
-- or its complement.
newtype TKey t u = TKey {getKey :: Key}
deriving (Eq, Ord)
-- Each node stores the minimum value of that subtree. That may actually be
-- a *maximum* when considering what's represented.
newtype IntMap a = IntMap (IntMap_ P C a) deriving (Eq, Show)
data IntMap_ t u a
= NonEmpty {-# UNPACK #-} !(TKey t u) a !(Node u t a) | Empty deriving (Eq)
data Node t u a = Bin {-# UNPACK #-} !(TKey t u) a !(Node u t a) !(Node t u a) | Tip deriving (Eq)
data NonEmptyIntMap_ t u a = NE {-# UNPACK #-} !(TKey t u) a !(Node t u a)
deriving instance Show a => Show (IntMap_ P C a)
deriving instance Show a => Show (IntMap_ C P a)
deriving instance Show a => Show (Node P C a)
deriving instance Show a => Show (Node C P a)
instance NFData a => NFData (IntMap a) where
rnf (IntMap m) = rnf m
instance NFData a => NFData (IntMap_ t u a) where
rnf Empty = ()
rnf (NonEmpty _ v root) = rnf v `seq` rnf root
instance NFData a => NFData (Node t u a) where
rnf Tip = ()
rnf (Bin _ v l r) = rnf v `seq` rnf l `seq` rnf r
-- This type is just used for convenience when defining the Show instances for TKey.
data KeyRep = PKey !Key | CKey !Key deriving Show
instance Show (TKey P C) where
showsPrec p (TKey k) = showsPrec p (PKey k)
instance Show (TKey C P) where
showsPrec p (TKey k) = showsPrec p (CKey (complement k))
-- | Take the complement of a key
compKey :: TKey t u -> TKey u t
compKey (TKey k) = TKey (complement k)
xor :: TKey t u -> TKey t u -> Word
xor (TKey k) (TKey b) = fromIntegral $ Bits.xor k b
empty = IntMap Empty
fromList :: [(Key, a)] -> IntMap a
fromList = List.foldl' (flip (uncurry insert)) (IntMap Empty)
fromAscList = fromList
fromDistinctAscList = fromList
lookup :: forall a. Key -> IntMap a -> Maybe a
lookup k0 (IntMap m0) = start (TKey k0) m0
where
start :: TKey t u -> IntMap_ t u a -> Maybe a
start !k Empty = Nothing
start k (NonEmpty min minV node)
| k < min = Nothing
| k == min = Just minV
| otherwise = go (compKey k) (xor k min) node
go :: TKey t u -> Word -> Node t u a -> Maybe a
go !k !_ Tip = Nothing
go k xorCache (Bin min minV l r)
| k > min = if xorCache < xorCacheMin
then go k xorCache r
else go (compKey k) xorCacheMin l
| k < min = Nothing
| otherwise = Just minV
where xorCacheMin = xor k min
insert :: forall a. Key -> a -> IntMap a -> IntMap a
insert k0 v (IntMap m0) = IntMap $ start (TKey k0) m0
where
start :: TKey t u -> IntMap_ t u a -> IntMap_ t u a
start k Empty = NonEmpty k v Tip
start k (NonEmpty minK minV root)
| k == minK = NonEmpty k v root
| k > minK = NonEmpty minK minV $ go (xor k minK) (compKey minK) (compKey k) root
| otherwise = NonEmpty k v $ insertMaxN (xor k minK) (compKey minK) minV root
go :: Word -> TKey t u -> TKey t u -> Node t u a -> Node t u a
go !xorCache !maxK !k Tip = Bin k v Tip Tip
go xorCache maxK k (Bin minK minV l r)
| minK < k =
if xorCache < xorCacheMin
then Bin minK minV l (go xorCache maxK k r)
else Bin minK minV (go xorCacheMin (compKey minK) (compKey k) l) r
| k < minK =
if xor minK maxK < xorCacheMin
then Bin k v Tip (Bin minK minV l r)
else Bin k v (insertMaxN xorCacheMin (compKey minK) minV l) r
| otherwise = Bin minK v l r
where
xorCacheMin :: Word
xorCacheMin = xor k minK
insertMaxN :: Word -> TKey t u -> a -> Node t u a -> Node t u a
insertMaxN !xorcache k v Tip = Bin k v Tip Tip
insertMaxN xorcache k v (Bin minK minV l r)
| xor k minK < xorcache = Bin minK minV (Bin (compKey k) v r l) Tip
| otherwise = Bin minK minV l (insertMaxN xorcache k v r)
{-# INLINE compareMinBound #-}
compareMinBound :: TKey t u -> TKey t u -> BoundOrdering
compareMinBound k min
| k > min = InBound
| k < min = OutOfBound
| otherwise = Matched
data BoundOrdering = InBound | OutOfBound | Matched deriving (Eq)
delete :: forall a. Key -> IntMap a -> IntMap a
delete !_ (IntMap Empty) = IntMap Empty
delete !k0 m@(IntMap (NonEmpty min minV root)) = case compareMinBound k min of
InBound -> IntMap (NonEmpty min minV (deleteNode (compKey k) (xor k min) root))
OutOfBound -> m
Matched -> IntMap (nodeToCompMap root)
where k = TKey k0
deleteNode :: TKey t u -> Word -> Node t u a -> Node t u a
deleteNode !_ !_ Tip = Tip
deleteNode !k !xorCache n@(Bin min minV l r) = case compareMinBound k min of
InBound | xorCache < xorCacheMin -> Bin min minV l (deleteNode k xorCache r)
| otherwise -> Bin min minV (deleteNode (compKey k) xorCacheMin l) r
OutOfBound -> n
Matched -> extractBinR l r
where xorCacheMin = xor k min
extractBinR :: Node u t a -> Node t u a -> Node t u a
extractBinR Tip r = r
extractBinR (Bin max maxV innerL innerR) r =
let NE min minV l = deleteMaxNode max maxV innerL innerR
in Bin (compKey min) minV l r
deleteMaxNode :: TKey t u -> a -> Node u t a -> Node t u a -> NonEmptyIntMap_ t u a
deleteMaxNode !min minV Tip Tip = NE min minV Tip
deleteMaxNode !min minV (Bin max maxV l r) Tip = NE (compKey max) maxV (Bin min minV r l)
deleteMaxNode !min minV l (Bin innerMin innerMinV innerL innerR) =
let NE max maxV inner = deleteMaxNode innerMin innerMinV innerL innerR
in NE max maxV (Bin min minV l inner)
nodeToCompMap :: Node t u a -> IntMap_ u t a
nodeToCompMap Tip = Empty
nodeToCompMap (Bin min minV innerL innerR) =
let NE max maxV r = deleteMaxNode min minV innerL innerR
in NonEmpty (compKey max) maxV r app/Main.hsmodule Main where
import Control.DeepSeq (rnf)
import Control.Exception (evaluate)
import Gauge (bench, defaultMain, whnf)
import Data.List (foldl')
import qualified IntMap as M
import Data.Maybe (fromMaybe)
import Prelude hiding (lookup)
main = do
let m = M.fromAscList elems :: M.IntMap Int
evaluate $ rnf [m]
evaluate $ rnf missKeys
defaultMain
[ bench "lookup hit" $ whnf (lookup keys) m
, bench "lookup miss" $ whnf (lookup missKeys) m
, bench "insert empty" $ whnf (ins elems) M.empty
, bench "delete hit" $ whnf (del keys) m
, bench "delete miss" $ whnf (del missKeys) m
, bench "fromList" $ whnf M.fromList elems
, bench "fromAscList" $ whnf M.fromAscList elems
, bench "fromDistinctAscList" $ whnf M.fromDistinctAscList elems
]
where
elems = zip keys values
keys = [1,3..2^13]
missKeys = [0,2..2^13]
values = [1,3..2^13]
sum k v1 v2 = k + v1 + v2
consPairL xs k v = (k, v) : xs
consPairR k v xs = (k, v) : xs
add3 :: Int -> Int -> Int -> Int
add3 x y z = x + y + z
{-# INLINE add3 #-}
lookup :: [Int] -> M.IntMap Int -> Int
lookup xs m = foldl' (\n k -> fromMaybe n (M.lookup k m)) 0 xs
ins :: [(Int, Int)] -> M.IntMap Int -> M.IntMap Int
ins xs m = foldl' (\m (k, v) -> M.insert k v m) m xs
data PairS a b = PS !a !b
del :: [Int] -> M.IntMap Int -> M.IntMap Int
del xs m = foldl' (\m k -> M.delete k m) m xs
maybeDel :: Int -> Maybe Int
maybeDel n | n `mod` 3 == 0 = Nothing
| otherwise = Just n Bench.cabalcabal-version: 1.12
name: Bench
version: 0.1.0.0
build-type: Simple
library
exposed-modules:
IntMap
hs-source-dirs:
src
ghc-options: -O2
build-depends:
base >=4.7 && <5
, deepseq
default-language: Haskell2010
executable Bench-exe
main-is: Main.hs
hs-source-dirs:
app
ghc-options: -O2
build-depends:
Bench
, base >=4.7 && <5
, deepseq
, gauge
default-language: Haskell2010 stack.yaml
|
Thanks! That's very helpful. I'm going to play with it a bit later. One thing I want to do is see how the native code generator compares to LLVM for both implementations. A very brief look earlier suggested they were doing somewhat different things with the complements, but I don't understand what LLVM was doing. |
I've improved the complementy code somewhat, but I still can't match yours, particularly for insert :: forall a. Key -> a -> IntMap a -> IntMap a
insert k0 v0 (IntMap m0) = IntMap $ start (TKey k0) v0 m0
where
start :: TKey t u -> a -> IntMap_ t u a -> IntMap_ t u a
start !k v Empty = NonEmpty k v Tip
start !k v (NonEmpty minK minV root) = case compareMinBound k minK of
Matched -> NonEmpty k v root
InBound -> NonEmpty minK minV $ go (xor k minK) (compKey minK) (compKey k) v root
OutOfBound -> NonEmpty k v $ insertMaxN (xor k minK) (compKey minK) minV root
go :: Word -> TKey t u -> TKey t u -> a -> Node t u a -> Node t u a
go !_xorCache !_maxK !k v Tip = Bin k v Tip Tip
go xorCache maxK k v (Bin minK minV l r) = case compareMinBound k minK of
InBound
| xorCache < xorCacheMin
-> Bin minK minV l (go xorCache maxK k v r)
| otherwise
-> Bin minK minV (go xorCacheMin (compKey minK) (compKey k) v l) r
OutOfBound
| xor minK maxK < xorCacheMin
-> Bin k v Tip (Bin minK minV l r)
| otherwise
-> Bin k v (insertMaxN xorCacheMin (compKey minK) minV l) r
Matched -> Bin minK v l r
where
xorCacheMin :: Word
xorCacheMin = xor k minK
insertMaxN :: Word -> TKey t u -> a -> Node t u a -> Node t u a
insertMaxN xorcache k v Tip = Bin k v Tip Tip
insertMaxN xorcache k v (Bin minK minV l r)
| xor k minK < xorcache = Bin minK minV (Bin (compKey k) v r l) Tip
| otherwise = Bin minK minV l (insertMaxN xorcache k v r)
delete :: forall a. Key -> IntMap a -> IntMap a
delete !_ (IntMap Empty) = IntMap Empty
delete !k0 m@(IntMap (NonEmpty min minV root)) = case compareMinBound k min of
InBound -> IntMap (NonEmpty min minV (deleteNode (compKey k) (xor k min) root))
OutOfBound -> m
Matched -> IntMap (nodeToCompMap root)
where k = TKey k0
deleteNode :: TKey t u -> Word -> Node t u a -> Node t u a
deleteNode !_ !_ Tip = Tip
deleteNode !k !xorCache n@(Bin min minV l r) = case compareMinBound k min of
InBound | xorCache < xorCacheMin -> Bin min minV l (deleteNode k xorCache r)
| otherwise -> Bin min minV (deleteNode (compKey k) xorCacheMin l) r
OutOfBound -> n
Matched -> extractBin l r
where xorCacheMin = xor k min
extractBin :: Node u t a -> Node t u a -> Node t u a
extractBin Tip r = r
extractBin (Bin max maxV innerL innerR) r =
let NE min minV l = deleteMaxNode max maxV innerL innerR
in Bin (compKey min) minV l r
deleteMaxNode :: TKey t u -> a -> Node u t a -> Node t u a -> NonEmptyIntMap_ t u a
deleteMaxNode !min minV m1 Tip = case m1 of
Tip -> NE min minV Tip
Bin max maxV l r -> NE (compKey max) maxV (Bin min minV r l)
deleteMaxNode !min minV l (Bin innerMin innerMinV innerL innerR) =
let NE max maxV inner = deleteMaxNode innerMin innerMinV innerL innerR
in NE max maxV (Bin min minV l inner)
nodeToCompMap :: Node t u a -> IntMap_ u t a
nodeToCompMap Tip = Empty
nodeToCompMap (Bin min minV innerL innerR) =
let NE max maxV r = deleteMaxNode min minV innerL innerR
in NonEmpty (compKey max) maxV r
data NonEmptyIntMap_ t u a = NE {-# UNPACK #-} !(TKey t u) a !(Node t u a) |
I wonder whether the smaller, Unfortunately I haven't found any applications that make heavy use of Another reason why I've been slightly worried about #340 is that the increased amount of code ultimately also offers more potential for (performance) bugs. I'm also not a big fan of the tendency to reduce lines of code via extra function arguments that must be inlined by the compiler for proper performance. IMHO that makes the code less readable and more fickle with regards to performance and compiler changes… |
@sjakobi the extra arguments are indeed unpleasant, though I'm not terribly worried about fragility in most cases. I really don't like having any more source code duplication than necessary. What we'd really like is probably something like lookup# :: Key -> IntMap a -> (# (##) | a #)
lookup k m = case lookup# k m of
(# | a #) -> Just a
_ -> Nothing
{-# INLINE lookup #-}
member k m = case lookup# k m of
(# | _ #) -> True
_ -> False But the compat story is something of a mess and will surely involve double-barreled continuations anyway. We still might want to do it though. |
I just had a radically simple idea that definitely has its own trade-offs but might be worth considering if it can actually be implemented easily. data IntMap a
= Bin
{ _minK :: !Int
_minV :: a
_lft :: !(IntMap a)
_maxK :: !Int
_maxV :: a
_rght :: !(IntMap a) }
| Tip
| Single !Int a This is quite an enormous
|
I made this a while ago: https://gitlab.imn.htwk-leipzig.de/waldmann/containers-benchmark |
@treeowl Could you demonstrate a bit how this structure works? How do minimum and maximum keys relate to those in the children? @jwaldmann That looks very useful! We should try to get that into the To build this with a custom
and ran
(The benchmark executable is the Comparing 205e2a8 (the current HEAD commit in #340) against #340's base commit f7e27e6, the new Profiling seems to put the blame mostly on Interestingly, allocations for The profiles are attached: base-O2.prof.txt, gereeter-O2.prof.txt. |
Despite the name, this is also testing IntMap: https://github.com/jwaldmann/containers/blob/intset%3Dword/containers-tests/benchmarks/IntSet.hs#L53 |
@jwaldmann This also looks very interesting. It would be very nice if you could make a PR for your benchmark patches! |
It occurs to me that |
Maybe. I don't see any benchmarks in the repo though. |
|
on using fgl via rdf: yes - but in their current state, rdf benchmarks take ages, because of
I am not seeing any IntMap (but some HashMap). |
I may be misunderstanding something about how the profiling works, but I'm very concerned to see |
I just realized that neither of those functions has |
Overly aggressive cost-center profiling (e.g., |
After slapping some `master` branch
`direct-bounded-intmap` branch
Now, I'm not sure I got everything right and I wasn't running on a particularly quiet machine, but at least I didn't see any
|
Re: I think there are two ways of using profiling info:
Manually adding cost centers seems quite heavy. Statistical profiling would work without instrumentation? But what's the status? https://gitlab.haskell.org/ghc/ghc/wikis/dwarf/status#statistical-profiling |
I just finished this myself and patched up the last holes that I found. Could someone else double check? |
This ticket summarizes things that should be done either before or soon after the enormous
IntMap
replacement in #340.Before merge
Reinstate the export list in
Data.IntMap.Internal
.Add an export list to
Data.IntMap.Merge.Internal
.Diff Haddocks to make sure no functions or instances have gone missing.
Generate a final benchmark comparison. I prefer to see these interleaved or side by side rather than one whole set after another, but I won't insist on it. This should go in the commit message for the final squashed commit.
Try a bit harder to optimize
union
.See what non-merge functions can be defined in terms of merge tactics and
runMissingAll
. (More would require a restructuring of the module boundaries.)Improve test coverage, especially of merge tactics and
isSubmapOf
. (Side question: should we bump up the number of test cases Travis asks for?)Add internal documentation for merges and deletion.
Reinstate rewrite rules and the phased
INLINE
andNOINLINE
directives needed to make them work. Many of them (e.g., map fusion rules and map/coerce) are unconditional improvements. Others (relating to list conversions) are likely desirable.Check that GHC can use the new IntMap without (undue) performance degradation. (@sjakobi has offered to trigger the build jobs for this.)
See if anything can be done about the object code duplication resulting from the alternating levels. Is there some clever way to make them uniform enough to share code? (Sadly, the answer appears to be no, for reasons that are not entirely obvious. Flipping over to the
complement
of the key each time we go left seems to hurt lookups significantly, although not enormously.)After merge
See if anything can be done about the source code duplication resulting from the alternating levels. Is there some clever way to make them uniform enough to share code?
Rewrite
IntSet
to match. Hopefully this is fairly mechanical, replacing values by bitmaps. Once this is done, get rid of bit twiddling utilities that are no longer used.Pick up Speed up fromList for IntMap #653 and see how well it works with the new representation.
Optimize
merge
.Resolve the strictness guarantees of
fromAscList
and friends. See Settle on a consistent strictness forfromList
andfromAscList
#473 and Rewrite Data.IntMap to be faster and use less memory #340 (comment).Consider using more type wrappers for safety.
Evaluate using unboxed sums for intermediates. These are not always a win, but sometimes are.
Expand the benchmarks to cover more functions. Check for major regressions that the current benchmarks missed.
Consider adding as permanent APIs things that came up in the implementation:
runMissingAll :: WhenMissing f a b -> IntMap a -> f (IntMap b)
, plus maybe a pure variant. (DF: I'm not so convinced yet. What's it for?)traverseMaybeWithKey :: (Key -> a -> f (Maybe b)) -> IntMap a -> f (IntMap b)
(DF: yes, we most definitely want this by some name or other.)dropMatched :: WhenMatched f a b c
firstMatched :: WhenMatched f a b a
(andsecondMatched :: WhenMatched f a b b
)fold
variants that can ignore or be otherwise loose with ordering. (see Cleanup after IntMap rewrite #698 (comment), but more generally)Uncategorized
Check whether the restructuring in Improve fromAscList and friends. #658 is a good idea for the new representation and, if so, reinstate it.
Continue to add internal documentation. For example,
isSubmapOf
and the like are pretty confusing.Shorten long lines when reasonable. We generally aim for 80 characters or fewer. There's flexibility for unusual situations, but this code has some very long lines that could be broken cleanly.
Use
liftA*
inmergeA
.Evaluate whether we can get the same performance using
foldMapDefault
,fmapDefault
, and equivalent tricks for keyed versions. That would cut down on source code.Evaluate whether we should implement
alter
"by hand" as we used to. If not, then we should probably implement it usingalterF @Identity
.The text was updated successfully, but these errors were encountered: