Add a section that explains the downtime with replica 1 #163

Open
wants to merge 1 commit into base: latest

gortiz
Contributor

@gortiz gortiz commented Apr 17, 2023

The description of the case where the replication is 1 wasn't very clear, specifically when increasing the number of replicas.

This PR adds a couple of paragraphs to specifically focus on the replica 1 case.

| minAvailableReplicas | 1 | <p>Applicable for rebalance with downtime=false.</p><p>This is the <strong>minimum number of replicas that are expected to stay alive</strong> through the rebalance.</p> |
| bestEfforts | false | <p>Applicable for rebalance with downtime=false.</p><p>If a no-downtime rebalance cannot be performed successfully, this flag <strong>controls whether to fail the rebalance or do a best-effort rebalance</strong>.</p> |
| reassignInstances | false | Applicable to tables where the instance assignment has been persisted to zookeeper. Setting this to true will make the rebalance **first update the instance assignment, and then rebalance the segments**. |
| bootstrap | false | Rebalances all segments again, **as if adding segments to an empty table**. If this is false, then the rebalance will try to minimize segment movements. |

### Rebalance with only 1 replica
In general, when the table config uses only 1 replica, the downtime may be affected.
Contributor

@swaminathanmanish swaminathanmanish Apr 17, 2023

Just a minor reword if it makes sense.

"The downtime option is not relevant when a rebalance is initiated on a table with a replication factor of 1 and the replication factor is not changed. In this case, there'll be a downtime during rebalance.

However, when the rebalance is initiated to increase replication (from 1 to more than 1), the downtime flag can be set to false to avoid downtime."
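
To make the second case concrete, here is a minimal sketch of how the rebalance options from the table in this diff could be assembled into request parameters. The helper name `rebalance_query` is hypothetical, and the option names and defaults are copied from the table rows shown above. The replication factor itself lives in the table config and is not one of these parameters, and the sketch does not model what the server does with the request.

```python
from urllib.parse import urlencode

# Defaults copied from the parameter table in this diff.
# `downtime` has no visible row there, so it is a required argument below.
REBALANCE_DEFAULTS = {
    "minAvailableReplicas": 1,
    "bestEfforts": False,
    "reassignInstances": False,
    "bootstrap": False,
}


def rebalance_query(downtime: bool, **overrides) -> str:
    """Build a query string of rebalance options (hypothetical helper)."""
    unknown = set(overrides) - set(REBALANCE_DEFAULTS)
    if unknown:
        raise ValueError(f"unknown rebalance options: {sorted(unknown)}")
    params = {"downtime": downtime, **REBALANCE_DEFAULTS, **overrides}
    # Render every value as lowercase text, so booleans become "true"/"false".
    return urlencode({k: str(v).lower() for k, v in params.items()})


# Increasing replication from 1 to more than 1 without downtime:
# the table config is updated first, then the rebalance is requested with downtime=false.
print(rebalance_query(downtime=False))
```

Whether a rebalance with `downtime=false` on a table that stays at replication 1 proceeds or is rejected is not something this sketch settles.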

Collaborator

I think that with downtime = false and replication = 1 (with no change to the final replication count), the rebalance itself will be blocked. So this statement would be untrue: "In this case, there'll be a downtime during rebalance".

Contributor Author

What do you mean by "the rebalance itself will be blocked"?

Contributor Author

I've just noticed that this PR is still open. I'm going to apply the change suggested by Manish. About the suggestion from Neha: do we know the actual behavior? IIRC I didn't actually test that, and what is written here is what I inferred from reading the code, but I may be wrong.

Contributor

@npawar, is this one good to merge, given @gortiz's comment above?

4 participants