packets: discard connection on zero sequence ID #1471
Closed
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
MySQL 8 introduced some error packets that can be sent to otherwise idle connections, notably
ER_CLIENT_INTERACTION_TIMEOUT
andER_DISCONNECTING_REMAINING_CLIENTS
. In rare circumstances, or in extremely busy applications, these packets can be sent by the server at the same instant that the client sends a command. This results in the client receiving a packet with a zero sequence ID when trying to read the result of a command.Currently, when the client receives a sequence ID that it doesn't expect then it returns a
ErrPktSync
error and leaves the connection open, since it normally indicates that the user is trying to run a command at the wrong time (e.g. before having finished reading the results of a previous query) rather than the connection having gone bad.However, in the case of these errors, the connection has gone bad, since we expect that the server has closed the connection and there is unread data left in the buffer. Since a connection in this state is not closed then it gets returned to the pool, and when it gets picked up from the pool then many commands will fail with
ErrBusyBuffer
(because the buffer still contains the unread error packet), and it will return to the pool again. This can turn a single error packet into thousands and thousands of errors, as the connections are reused again and again until they eventually timeout.To add some perspective, GitHub recently encountered 146 connections which failed with
ErrPktSync
due to a replica forcibly disconnecting clients (ER_DISCONNECTING_REMAINING_CLIENTS
). These 146 connections resulted in around 1.3 million request failures withErrBusyBuffer
before they timed out within 30 seconds.Ideally, the client should be able to read these error packets and handle them properly, but doing so is a larger project. This change instead makes a relatively small and uninvasive fix to add special handling for zero sequence IDs. With this, when we receive a packet with a zero sequence ID then we always assume that the server has sent a terminal error packet, close the connection, and return
ErrInvalidConn
.Closes #1394.