Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Batched lookups in IO #102

Merged
merged 2 commits into from
Apr 23, 2024
Merged

Batched lookups in IO #102

merged 2 commits into from
Apr 23, 2024

Conversation

jorisdral
Copy link
Collaborator

No description provided.

@jorisdral jorisdral self-assigned this Feb 21, 2024
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 10 times, most recently from 84ce2c5 to cb744e8 Compare February 21, 2024 19:57
@jorisdral jorisdral changed the base branch from jdral/use-rawbytes-in-rawpage to main February 21, 2024 20:06
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from cb744e8 to 3473825 Compare February 21, 2024 20:06
@jorisdral
Copy link
Collaborator Author

jorisdral commented Feb 21, 2024

GHA fails on ghc's before 9.6. It should be fixed after incorporating well-typed/quickcheck-lockstep#19

Relevant GHA run: https://github.com/input-output-hk/lsm-tree/actions/runs/7994908218/job/21834042035?pr=102

@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 6 times, most recently from ac4589f to 13da88d Compare March 21, 2024 10:53
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a few thoughts, since I encountered many similar situations with run merging.

src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Entry.hs Outdated Show resolved Hide resolved
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 2 times, most recently from cf4912f to c6fc7c8 Compare April 1, 2024 16:54
@jorisdral jorisdral changed the base branch from main to jdral/lookups-using-vectors April 1, 2024 16:54
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 3 times, most recently from 6d096f1 to 62c96b6 Compare April 2, 2024 14:44
@jorisdral jorisdral force-pushed the jdral/lookups-using-vectors branch from fcdf61e to de472c9 Compare April 2, 2024 14:55
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from 62c96b6 to f10c992 Compare April 2, 2024 14:56
@jorisdral jorisdral force-pushed the jdral/lookups-using-vectors branch from 0d63c5f to 0a09e81 Compare April 5, 2024 12:00
Base automatically changed from jdral/lookups-using-vectors to main April 5, 2024 12:32
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from f10c992 to 1e14a05 Compare April 5, 2024 15:18
@jorisdral jorisdral changed the base branch from main to jdral/submitio-using-vectors April 5, 2024 16:01
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from 1e14a05 to 5befdd0 Compare April 5, 2024 16:01
@jorisdral jorisdral force-pushed the jdral/submitio-using-vectors branch from b328e2d to eb82d81 Compare April 9, 2024 07:47
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from 5befdd0 to 65dd89b Compare April 9, 2024 08:14
@jorisdral jorisdral changed the title WIP: batched lookups in IO Batched lookups in IO Apr 9, 2024
@jorisdral jorisdral marked this pull request as ready for review April 9, 2024 08:14
@jorisdral jorisdral requested a review from dcoutts as a code owner April 9, 2024 08:14
Base automatically changed from jdral/submitio-using-vectors to main April 9, 2024 08:17
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 2 times, most recently from 6f9675b to 503a350 Compare April 9, 2024 15:08
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(not done yet, just want these comments to be visible already)

src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
Copy link
Collaborator

@dcoutts dcoutts left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm part way through. Not looked at the tests yet.

src/Database/LSMTree/Internal/Entry.hs Outdated Show resolved Hide resolved
bench/micro/Bench/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch 5 times, most recently from 2469762 to 32d5938 Compare April 19, 2024 08:54
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Most of my comments are:

  1. suggestions for the benchmark, which can still be done later, or you could even move the whole benchmark to a follow-up PR
  2. notes and ideas on the main lookup code, which you might like or not, in which case you can safely discard them

So most things are not a blocker, but before merging it would be good to at least add some TODOs and ideally avoid the export of Run.close.

bench/micro/Bench/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
bench/micro/Bench/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
bench/micro/Bench/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
bench/micro/Bench/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Show resolved Hide resolved
src/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
src/Database/LSMTree/Internal/Run.hs Outdated Show resolved Hide resolved
lookupss = concatMap lookups dats

opaqueifyBlobs :: V.Vector (k, Maybe (Entry v b)) -> V.Vector (k, Maybe (Entry v Opaque))
opaqueifyBlobs = fmap (fmap (fmap (fmap Opaque)))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

peak Haskell 🏆

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wait until you meet their cousin...

opaqueifyBlobs :: V.Vector (k, Maybe (Entry v b)) -> V.Vector (k, Maybe (Entry v Opaque))
opaqueifyBlobs = getCompose . getCompose . getCompose . fmap Opaque . Compose . Compose . Compose

test/Test/Database/LSMTree/Internal/Lookup.hs Outdated Show resolved Hide resolved
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from 32d5938 to c9ad667 Compare April 22, 2024 15:28
* New lookupInBatches function that prepare
  lookups using prepLookups, submits them to a
  HasBlockIO interface, extracts results from
  pages, and combines them.
* Adapt micro-benchmarks to include
  lookupsInBatches, in addition to prepLookups,
  indexSearches and bloomQueries.
* Add round-trip test for lookupInBatches
* Change prepLookups code to receive multiple
  vectors, instead of a vector of sums.
* Remove unused uniformWithoutReplacement'
  function
@jorisdral jorisdral force-pushed the jdral/batched-lookups-IO branch from c9ad667 to 575c796 Compare April 22, 2024 15:32
Copy link
Collaborator

@mheinzel mheinzel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks ready to me!

@jorisdral jorisdral added this pull request to the merge queue Apr 23, 2024
Merged via the queue into main with commit c0a5751 Apr 23, 2024
12 checks passed
@jorisdral jorisdral deleted the jdral/batched-lookups-IO branch April 23, 2024 09:02
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants