Write operations that immediately follow write operations sometimes cause a disk I/O-error, followed by loss of leadership and high latency #522

fbrandherm · 2023-06-13T13:43:23Z

I am using dqlite (version 1.14) for an internal project and observed unexpected behavior in my benchmarks on localhost. If I issue write operations back to back (INSERT OR REPLACE INTO kv_table (KEY, VALUE) VALUES (?,?);, using request type 8 of the wire protocol), some requests show random latency spikes (see picture); the spikes do not appear if I wait 1 ms between requests. The outlier requests return SQLite's "disk I/O error", and retrying the request returns "not leader" for some time. I suspect this bug triggers a leader election. The files are on a ramdisk, and I cannot reproduce the bug when the files are on an SSD, so the bug is probably timing-related.

Regarding the plot: blue dots are 100 write operations on node 1, red dots are 100 read operations on node 2 (the first red dot is a leadership transfer to node 2). The cluster had 3 voting nodes.
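
For anyone trying to reproduce this, the benchmark described above could be approximated by a loop like the following. This is only a minimal sketch, not the reporter's code: `WriteStatus`, `Sample`, `run_writes`, `count_failures`, and the `write_once` callable are hypothetical names, and `write_once` stands in for whatever client call sends the INSERT OR REPLACE request and maps the reply to a status. Nothing here is dqlite API.

```cpp
#include <chrono>
#include <cstddef>
#include <functional>
#include <thread>
#include <vector>

// Outcome of one write attempt. Illustrative names only, not dqlite API.
enum class WriteStatus { Ok, DiskIoError, NotLeader };

struct Sample {
    WriteStatus status;
    std::chrono::microseconds latency;
};

// Issue `count` writes in sequence and record the latency of each one.
// A zero `gap` is the back-to-back case; a gap of 1000 us corresponds to
// the "wait 1 ms between requests" case.
std::vector<Sample> run_writes(const std::function<WriteStatus()>& write_once,
                               int count,
                               std::chrono::microseconds gap) {
    using clock = std::chrono::steady_clock;
    std::vector<Sample> samples;
    if (count <= 0) {
        return samples;
    }
    samples.reserve(static_cast<std::size_t>(count));
    for (int i = 0; i < count; ++i) {
        const auto start = clock::now();
        const WriteStatus status = write_once();  // hypothetical client call
        const auto elapsed =
            std::chrono::duration_cast<std::chrono::microseconds>(clock::now() - start);
        samples.push_back({status, elapsed});
        if (gap.count() > 0) {
            std::this_thread::sleep_for(gap);
        }
    }
    return samples;
}

// Count the attempts that did not succeed (I/O error or not leader).
int count_failures(const std::vector<Sample>& samples) {
    int failures = 0;
    for (const Sample& s : samples) {
        if (s.status != WriteStatus::Ok) {
            ++failures;
        }
    }
    return failures;
}
```

Running it once with a zero gap and once with `std::chrono::microseconds(1000)`, then comparing the failure counts and latencies, would mirror the two cases in the report.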


MathieuBordere · 2023-06-13T13:47:46Z

Can you share your code to reproduce this?

fbrandherm · 2023-06-13T14:25:47Z

Sorry, but I can't share the full code: it's a large project that uses dqlite as a backend behind a lot of other logic and isn't open source (yet). I'm sure the bug could be reproduced by much simpler code, but I won't have time to implement a simple demo until the end of the month. I should note, however, that my code uses a custom client implemented in C++, which could also make a difference.

MathieuBordere · 2023-06-13T16:48:57Z

No problem, we'll try to reproduce this.

MathieuBordere added the labels Bug (confirmed to be a bug) and Incomplete (waiting on more information from reporter) on Jun 22, 2023
