Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

raft: re-enable config change safety #124804

Merged
merged 2 commits into from
May 29, 2024

Conversation

pav-kv
Copy link
Collaborator

@pav-kv pav-kv commented May 29, 2024

Config changes in this raft implementation require a safety constraint: the leader must not append a config change if it hasn't applied all config changes in its log.

The DisableConfChangeValidation flag disables this check under the assumption that the state machine layer provides the equivalent guarantee. However, it is hard to argue that this is true in split leaseholder/leader scenarios.

This commit re-enables this check, to bring the safety back. The other two state-machine-level checks concerned with entering and leaving joint configs can still be disabled.

Related to #105797, etcd-io/raft#81, #106515 (#106515 (comment))

Epic: CRDB-37617
Release note: none

@pav-kv pav-kv requested review from nvanbenschoten and a team May 29, 2024 12:46
Copy link

blathers-crl bot commented May 29, 2024

It looks like your PR touches production code but doesn't add or edit any test code. Did you consider adding tests to your PR?

🦉 Hoot! I am a Blathers, a bot for CockroachDB. My owner is dev-inf.

@cockroach-teamcity
Copy link
Member

This change is Reviewable

@pav-kv pav-kv force-pushed the fix-confchange-safety branch from 711def6 to 7de8650 Compare May 29, 2024 12:48
@pav-kv pav-kv added backport-24.1.x Flags PRs that need to be backported to 24.1. backport-24.1.0-rc labels May 29, 2024
Config changes in this raft implementation require a safety constraint:
the leader must not append a config change if it hasn't applied all
config changes in its log.

The DisableConfChangeValidation flag disables this check under the
assumption that the state machine layer provides the equivalent
guarantee. However, it is hard to argue that this is true in split
leaseholder/leader scenarios.

This commit re-enables this check, to bring the safety back. The other
two state-machine-level checks concerned with entering and leaving joint
configs can still be disabled.

Epic: none
Release note: none
@pav-kv pav-kv force-pushed the fix-confchange-safety branch from 7de8650 to 182fabf Compare May 29, 2024 13:13
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 2 of 2 files at r1, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)


pkg/raft/testdata/confchange_disable_validation.txt line 59 at r1 (raw file):

  1->3 MsgApp Term:1 Log:1/4 Commit:5 Entries:[1/5 EntryConfChangeV2 l2 l3]

# Propose new config change. Note how it isn't rejected,

Do we want to adjust this test to demonstrate the new behavior where even with DisableConfChangeValidation=true, a new config change is rejected if the previous has not been applied yet?

Copy link
Collaborator Author

@pav-kv pav-kv left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (and 1 stale) (waiting on @nvanbenschoten)


pkg/raft/testdata/confchange_disable_validation.txt line 59 at r1 (raw file):

Previously, nvanbenschoten (Nathan VanBenschoten) wrote…

Do we want to adjust this test to demonstrate the new behavior where even with DisableConfChangeValidation=true, a new config change is rejected if the previous has not been applied yet?

Done

This commit adds a test which ensures a config change is not proposed if
the leader has another yet unapplied config change.

Epic: none
Release note: none
Copy link
Member

@nvanbenschoten nvanbenschoten left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:lgtm:

Reviewed 1 of 1 files at r2, all commit messages.
Reviewable status: :shipit: complete! 1 of 0 LGTMs obtained (waiting on @pav-kv)

@pav-kv pav-kv force-pushed the fix-confchange-safety branch from 44e4e55 to 8587815 Compare May 29, 2024 17:37
@pav-kv
Copy link
Collaborator Author

pav-kv commented May 29, 2024

bors r=nvanbenschoten

@craig craig bot merged commit 163a896 into cockroachdb:master May 29, 2024
21 of 22 checks passed
@pav-kv pav-kv deleted the fix-confchange-safety branch May 29, 2024 19:36
pav-kv added a commit to cockroachdb/raft that referenced this pull request Jun 12, 2024
Config changes in this raft implementation require a safety constraint:
the leader must not append a config change if it hasn't applied all
config changes in its log.

The DisableConfChangeValidation flag disables this check under the
assumption that the state machine layer provides the equivalent
guarantee. However, it is hard to argue that this is true in split
leaseholder/leader scenarios.

This commit re-enables this check, to bring the safety back. The other
two state-machine-level checks concerned with entering and leaving joint
configs can still be disabled.

Ported from cockroachdb/cockroach#124804
pav-kv added a commit to cockroachdb/raft that referenced this pull request Jun 12, 2024
This commit adds a test which ensures a config change is not proposed if
the leader has another yet unapplied config change.

Ported from cockroachdb/cockroach#124804
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport-24.1.x Flags PRs that need to be backported to 24.1.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants