-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Avoid double buffering direct IO index input slices with BufferedIndexInput #14103
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for tackling this @ChrisHegarty -- I don't feel qualified to review the code changes too closely (they are / this class is somewhat scary).
It's a neat idea to try this for KNN 2nd phase rescoring, I guess to prevent that (heavy) IO from polluting OS's buffer cache for other hot index pages?
And with a fast enough device (SSD) maybe the uncached IO penalty of retrieving N full precision vectors for rescoring is acceptable latency in the query hot path ... if you can read these N full precision vectors using IO concurrency that might also be a big win, though I think @jpountz's recent cool IO hinting changes rely on OS buffer cache (?) ... but maybe many virtual threads each doing the blocking (O_DIRECT) IO could work somehow?
lucene/misc/src/java/org/apache/lucene/misc/store/DirectIODirectory.java
Show resolved
Hide resolved
|
This commit avoids double buffering direct IO index input slices with
BufferedIndexInput
.Currently
BufferedIndexInput
is used for slicing, since it will handle the initial offset and length, but this adds an extra layer of buffering - the buffer in buffered index input as well as the buffer in direct IO index input. This change reflows direct IO index input so that it can handle an offset and length, so can be its own implementation for slices.Existing tests covered this, but I found case where a clone of a slice was not covered. I added a small change to the base directory test case which covers this.
My motivation for doing this is that I've been investigating the possibility of using direct IO for random access reads of float32 vectors when rescoring an initial set of candidates retrieved from scalar quantized approximations.