Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add symmetric difference ops #1009

Merged
merged 1 commit into from
Aug 3, 2024
Merged

Conversation

meooow25
Copy link
Contributor

@meooow25 meooow25 commented Jun 6, 2024

...for Set, Map, IntSet, IntMap. This joins the set operations already implemented: union, intersection, difference.

Closes #1002.

...for Set, Map, IntSet, IntMap. This joins the set operations already
implemented: union, intersection, difference.
@meooow25
Copy link
Contributor Author

@treeowl any concerns about adding these operations?

symmetricDifference :: Ord a => Set a -> Set a -> Set a
symmetricDifference :: Ord k => Map k a -> Map k a -> Map k a
symmetricDifference :: IntSet -> IntSet -> IntSet
symmetricDifference :: IntMap a -> IntMap a -> IntMap a

@jwaldmann
Copy link
Contributor

jwaldmann commented Jul 29, 2024

I agree these might be useful.

It's still an extra maintenance burden. (I am not a maintainer, and I am eternally grateful to those who are, but do I worry about their resources.)

Can these functions not be implemented with https://hackage.haskell.org/package/containers-0.7/docs/Data-Map-Strict.html#v:mergeWithKey (and similar)? Did you do, did you compare performance?

If the answer would be that mergeWithKey doesn't work (well enough), then that seems to me a problem worth fixing (or at least, documenting) - there, or in GHC's inliner perhaps.

@treeowl
Copy link
Contributor

treeowl commented Jul 29, 2024

@jwaldmann , I would look to mergeA instead. Is it powerful enough to implement this? I suspect so. Does it perform well enough for this? That's less clear to me.

@meooow25
Copy link
Contributor Author

meooow25 commented Jul 29, 2024

It's still an extra maintenance burden.

I don't think that is the case. These are straightforward implementations. We already have union, intersection, difference implemented in this manner, and symmetricDifference completes the list.

Can these functions not be implemented with https://hackage.haskell.org/package/containers-0.7/docs/Data-Map-Strict.html#v:mergeWithKey (and similar)? Did you do, did you compare performance?

I would look to mergeA instead.

These operations would more likely be used for sets (as mentioned in #1002), and there is no merge/mergeA function for Set and IntSet.

For Map and IntMap, yes symmetricDifference could be implemented with merge/mergeA. I have not compared the performance, but from a manual inspection it can only be slower. This is so even if we assume that inlining and subsequent optimizations work out well. For instance,

  • Map's mergeA uses splitLookup which allocates and returns a Maybe a. This can be avoided in symmetricDifference by using splitMember because we have no use for that a.
    go (Bin _ kx x1 l1 r1) t2 = case splitLookup kx t2 of
  • IntMap's mergeA is quite involved and is much larger than a direct implementation code-size wise. It also has some branches such as
    | k1 < k2 = liftA2 (subdoubleton k1 k2) (g1k k1 x1) (g2k k2 x2)
    {-
    = link_ k1 k2 <$> subsingletonBy g1k k1 x1 <*> subsingletonBy g2k k2 x2
    -}
    | otherwise = liftA2 (subdoubleton k2 k1) (g2k k2 x2) (g1k k1 x1)

    | signBranch p = liftA2 (flip (bin p)) b a
    | otherwise = liftA2 (bin p) a b

    which are unnecessary for symmetricDifference.

@treeowl
Copy link
Contributor

treeowl commented Jul 29, 2024

I don't object to adding these. I do want to get mergeA for sets, but there's no way I'll get to it myself anytime very soon.

@meooow25
Copy link
Contributor Author

meooow25 commented Aug 2, 2024

@treeowl should we merge this then?

@treeowl
Copy link
Contributor

treeowl commented Aug 2, 2024

Yeah, I'd say go ahead.

@meooow25 meooow25 merged commit 8b4f520 into haskell:master Aug 3, 2024
10 checks passed
@meooow25 meooow25 deleted the symmetric-difference branch August 3, 2024 18:02
@treeowl
Copy link
Contributor

treeowl commented Aug 3, 2024

* `IntMap`'s `mergeA` is quite involved and is much larger than a direct implementation code-size wise. It also has some branches such as
  https://github.com/haskell/containers/blob/edc0e149479dbd0d5f41115c109dba699561ee0e/containers/src/Data/IntMap/Internal.hs#L2072-L2076
  
  https://github.com/haskell/containers/blob/edc0e149479dbd0d5f41115c109dba699561ee0e/containers/src/Data/IntMap/Internal.hs#L2109-L2110
  
  which are unnecessary for `symmetricDifference`.

If those branches are a performance issue, we should probably rewrite merge to avoid them; they're only needed for non-commutative functors.

@meooow25
Copy link
Contributor Author

meooow25 commented Aug 4, 2024

That's true, we need them in mergeA but merge could be implemented without them.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Symmetric difference for sets
3 participants