Table::last performance cliff for low cache size #870
Comments
Ah yes, the implementation needs to be optimized. Creating a range does two lookups (one for the beginning and one for the end of the range), which is why it's less efficient than …
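A rough cost model of that explanation, assuming one page read per B-tree level (a sketch, not redb's actual implementation): a point lookup does one root-to-leaf descent, while constructing a range does two, one per endpoint.

```rust
// Hypothetical cost model: a B-tree descent touches one page per level.
// A point lookup needs one descent; building a range needs two (one for
// each end of the range), so it touches roughly twice as many pages.
fn pages_touched(tree_height: u64, descents: u64) -> u64 {
    tree_height * descents
}

fn main() {
    let height = 4; // tree height reported in the issue's Db stats
    let point_get = pages_touched(height, 1); // e.g. a get() by key
    let range_last = pages_touched(height, 2); // e.g. last() via a range
    println!("point get: ~{point_get} pages, range-based last: ~{range_last} pages");
}
```

Under this model the range path roughly doubles the pages pulled into the read cache per operation, which is consistent with range operations degrading before point lookups do.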
Did you close the database and re-open it when you ran this test? And if so, did you do that for both …

I think there are two issues. One is cache management, but it affects both … The second issue is that …
Clean database, though this seems to depend heavily on how you read. If you read while inserting, the performance cliff happens. If you let it write all 250k items first, then do first(), last(), get(0), get(LAST), it performs much better: …
Ah, I found the issue. There was a race in the tracking of the remaining read cache space. Thanks for the report!
I don't know if this is expected performance behaviour, but I found this during benchmarking, and it may have pretty serious implications for the data-to-cache ratio and read performance for specific workloads.
Basically, the benchmark just writes small items in monotonic key order and reads the last one back (Table::last). The same will probably happen with other operations such as Table::first too.
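The access pattern can be sketched with a std `BTreeMap` as a stand-in for the redb table (the actual redb calls and item sizes live in the gist linked below; the 16-byte value here is a placeholder):

```rust
use std::collections::BTreeMap;

// Sketch of the benchmark's access pattern: insert small values under
// monotonically increasing keys, and after each insert read the last
// entry back, as Table::last would. BTreeMap stands in for the redb table.
fn main() {
    let mut table: BTreeMap<u64, Vec<u8>> = BTreeMap::new();
    let value = vec![0u8; 16]; // hypothetical "small item" size
    for key in 0u64..250_000 {
        table.insert(key, value.clone());
        // With monotonic inserts, the last entry is always the key just written.
        let (last_key, _) = table.iter().next_back().expect("non-empty");
        assert_eq!(*last_key, key);
    }
    println!("inserted and re-read {} items", table.len());
}
```

The point is that the read is maximally predictable: it always targets the newest, rightmost part of the keyspace, so one would expect the relevant pages to stay hot in any reasonable cache.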
Depending on the item count & cache size, at some point the read performance will absolutely fall off a cliff.
The cache is set to 4 MB (pretty low), and after 140k (small!) items, the read performance tanks by 6x.
Excerpt:
Even after all 250k items are written, read performance does not recover, even though only the very last part of the keyspace is read. I wouldn't really expect that: per the Db stats, the tree height is only 4, so realistically only a couple of 4K pages are needed to get to any leaf page, right? To prove my point, waiting for everything to be written and replacing `db.last()` with `db.get(LAST_ITEM_ID)` reads 20x faster. So it kind of confirms my theory that this cache size is high enough to handle the tree traversal quickly, but range operations specifically seem to degrade somehow. If you increase the cache to 8 MB, it can maybe pull off 400k items; with 16 MB it can do 800k, and so on, even though the read pattern is not random at all.
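The back-of-the-envelope arithmetic behind that expectation (tree height and 4K page size from the issue, cache size from the benchmark config) can be written out as:

```rust
// Rough check: with a height-4 B-tree and 4 KiB pages, a single
// root-to-leaf traversal touches only 4 pages (~16 KiB), while a
// 4 MiB read cache can hold 1024 such pages.
fn main() {
    let page_size: u64 = 4 * 1024; // 4 KiB pages
    let tree_height: u64 = 4; // from the issue's Db stats
    let cache_size: u64 = 4 * 1024 * 1024; // 4 MiB cache

    let traversal_bytes = tree_height * page_size;
    let cache_pages = cache_size / page_size;

    println!("one traversal reads ~{traversal_bytes} bytes");
    println!("cache holds up to {cache_pages} pages");
}
```

So a hot-path traversal needs well under 1% of the configured cache, which is why the 6x slowdown on a non-random read pattern looks like a cache-management problem rather than a genuine capacity limit.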
And here's a comparative benchmark, even though it's hard to compare (I ran redb multiple times just to be sure)... Sled uses tons of memory, as it doesn't seem to obey the set cache size. LMDB just caches in memory, so it's basically cheating too. Fjall is an LSM-tree and can just jump to the correct block very easily, so its cache size can be very low (probably <1 MB) in this scenario for basically an infinite amount of items.
Here's a reproduction script: https://gist.github.com/marvin-j97/8f3ec281bfeeebec2532580a51d741b3