Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Start instrumenting existing code with state machines #678

Merged
merged 27 commits into from
Oct 7, 2024

Conversation

cole-miller
Copy link
Contributor

This PR introduces several state machines to track and instrument the behavior of existing dqlite code. Specifically, we instrument:

  • Write transactions on the leader (leader__exec)
  • Requests to append a COMMAND entry to the raft log (raft_apply)
  • AppendEntries processing on the follower (replicationAppend)
  • Individual in-memory log entries (log.c)
  • Disk write requests for entries (UvAppend)
  • On-disk log truncation (UvTruncate)

There are lots more things we could instrument but I think this is a good starting point: it gives us enough visibility to follow a write transaction over its whole lifecycle and down to the disk I/O level, on both the follower and the leader. (A big missing piece is linking the histories across nodes; that's nontrivial because our raft messages don't include any kind of ID or room for extensibility, although we could fake something in the I/O fixture.)

The tracking of in-memory log entries is the trickiest part of this PR. I was initially uncertain whether to attach SMs to individual log entries at all, but I found this served as a convenient "hub" to connect other state machines together (e.g. bridging the raft_apply and UvAppend state machines), plus it gives a foothold for tracking how long it takes to apply each entry.

I also added state machines for the append and truncate requests in the raft I/O fixture (raft/fixture.c). This was necessary because the code in raft/replication.c now assumes that the I/O backend makes an SM available to call sm_relate with. The SMs in the fixture are not copies of the ones in the uv I/O backend, but simpler ones.

Finally, I made a few fixes and additions to the sm code.

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
This fires on all our invocations of vsnprintf, but only with clang (gcc
makes an exception for variadic functions presumably for just this
reason). I think the value of this lint for us is not worth its price in
\#pragma verbosity.

Signed-off-by: Cole Miller <[email protected]>
Copy link

codecov bot commented Aug 5, 2024

Codecov Report

Attention: Patch coverage is 81.85185% with 49 lines in your changes missing coverage. Please review.

Project coverage is 81.07%. Comparing base (4e328c9) to head (85e494d).
Report is 31 commits behind head on master.

Files with missing lines Patch % Lines
src/raft/replication.c 65.06% 22 Missing and 7 partials ⚠️
src/lib/sm.c 57.14% 0 Missing and 6 partials ⚠️
src/raft/client.c 76.19% 4 Missing and 1 partial ⚠️
src/raft/log.c 92.00% 1 Missing and 3 partials ⚠️
src/raft/fixture.c 93.18% 0 Missing and 3 partials ⚠️
src/raft/uv_truncate.c 95.23% 1 Missing ⚠️
test/raft/lib/heap.c 0.00% 0 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #678      +/-   ##
==========================================
+ Coverage   74.21%   81.07%   +6.86%     
==========================================
  Files         195      197       +2     
  Lines       27738    29164    +1426     
  Branches     2794     4066    +1272     
==========================================
+ Hits        20585    23644    +3059     
+ Misses       4827     3875     -952     
+ Partials     2326     1645     -681     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Overloading the err_after_request_alloc label resulted in a potential
uninitialized use of `j`.

Signed-off-by: Cole Miller <[email protected]>
@cole-miller
Copy link
Contributor Author

cole-miller commented Aug 5, 2024

Pictures!

Leader's view of executing a write transaction in a one-node cluster (./integration-test cluster/restart):

leader-exec

From top to bottom, we have the leader__exec request, the raft_apply request, the new log entry, and the UvAppend request. You can see how long it took to executing the transaction initially, how long it took to write it to disk, and how long it took to apply it once committed.

Follower's view of handling an AppendEntries (./integration-test cluster/dataOnNewNode):

append-follower

At the top is the main appendFollower request. Below that are 10 log entries, and at the bottom is the disk write for the entries.

Finally, a different view of AppendEntries handling on the follower's side. This one uses the fixture (./raft-core-integration-test replication/recvRollbackConfigurationToInitial), so the I/O related state machines are simpler.

follower-truncate

Here, the follower truncated an entry from its log because of the AppendEntries. You see the appendFollower request at the top, then the truncated entry, then the disk truncate, then the two entries that were included in the AppendEntries, then the disk append. You can see that the truncate and append operations are concurrent---replicationApply doesn't wait for the former to finish before kicking off the latter (something I hadn't noticed before I put together this PR). If we used the real I/O backend and instrumented a bit more we'd see that UvTruncate sets a UvBarrier which blocks the UvAppend from running until the truncate has finished.

@cole-miller cole-miller requested a review from just-now August 7, 2024 11:30
@cole-miller cole-miller mentioned this pull request Sep 2, 2024
Copy link
Contributor

@just-now just-now left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The patch looks good to me. Just minor changes are needed.

EXEC_NR,
};

static const struct sm_conf exec_states[EXEC_NR] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just wanted to leave this code here to show that it's possible to define states in a more compact way:

#define _S(name, flags, allowed)      \
        [name] = {                    \
                .flags   = flags,  \
                .name    = #name,  \
                .allowed = allowed \
        }

        _S(INIT,   SM_INITIAL, BITS(INIT, DONE)),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure. I experimented with switching the state machine definitions over to this style, and it works well for some of them, but once you have more than a few allowed transitions out of some state the lines get long enough to need breaking, and at that point you're more or less back to the beginning. I've pushed a commit that rewrites the definitions in this style, let me know what you think.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not pushing for this style, but it's my personal preference.

src/lib/sm.c Outdated Show resolved Hide resolved
src/raft/fixture.c Outdated Show resolved Hide resolved
src/raft/replication.c Outdated Show resolved Hide resolved
src/raft/fixture.c Outdated Show resolved Hide resolved
src/raft/log.c Outdated
Comment on lines 306 to 307
if (ref == NULL && !collision) {
return NULL;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Initial logic looks a bit different. !collision looks to be a new element in this case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed, the two return values are unnecessarily hard to reason about so I switched to using a non-NULL sentinel for the collision case, which allows hewing more closely to the original code structure.

src/raft/replication.c Outdated Show resolved Hide resolved
Signed-off-by: Cole Miller <[email protected]>
realloc does not promise to return NULL when the size argument is 0.

Signed-off-by: Cole Miller <[email protected]>
Copy link
Contributor

@just-now just-now left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

EXEC_NR,
};

static const struct sm_conf exec_states[EXEC_NR] = {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not pushing for this style, but it's my personal preference.

m->is_locked = is_locked;
m->id = ++id;
m->pid = getpid();
*m = (struct sm){
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks cool as it's easier to grep out initialization with grep "(struct sm){".

if (rv != 0) {
goto err;
index = start;
for (unsigned i = 0; i < n; i++) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the reason of removing logAppendCommands() helper is just cleaner code or I'm missing something?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's just a cleanup, yes.

Comment on lines -607 to -627
if (n > 0) {
struct raft_entry *entries;

/* Create a new array of entries holding the non-truncated
* entries */
entries = raft_malloc(n * sizeof *entries);
if (entries == NULL) {
return RAFT_NOMEM;
}
memcpy(entries, io->entries, n * sizeof *io->entries);

/* Release any truncated entry */
if (io->entries != NULL) {
size_t i;
for (i = n; i < io->n; i++) {
raft_free(io->entries[i].buf.base);
}
raft_free(io->entries);
}
io->entries = entries;
} else {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see that if { } statement goes to ioFlushTruncate(). I wasn't able to identify where else {} statement goes in the new code. Could you clarify this?

Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
Signed-off-by: Cole Miller <[email protected]>
@cole-miller cole-miller merged commit 81eeab5 into canonical:master Oct 7, 2024
12 of 13 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants