diff --git a/operators/operating-pinot/rebalance/rebalance-servers.md b/operators/operating-pinot/rebalance/rebalance-servers.md index 85996b65..4d9909e8 100644 --- a/operators/operating-pinot/rebalance/rebalance-servers.md +++ b/operators/operating-pinot/rebalance/rebalance-servers.md @@ -155,15 +155,24 @@ Typically, the flags that need to be changed from defaults are {% endhint %} | Query param | Default value | Description | -| -------------------- | ------------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- | +| -------------------- | ------------- 
|---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------| | dryRun | false | If set to true, **rebalance is run as a dry-run** so that you can see the expected changes to the ideal state and instance partition assignment. | | includeConsuming | false |

Applicable for REALTIME tables.

CONSUMING segments are rebalanced only if this is set to true.
Moving a CONSUMING segment involves dropping the data consumed so far on old server, and re-consuming on the new server. If an application is sensitive to increased memory utilization due to re-consumption or to a momentary data staleness, they may choose to not include consuming in the rebalance. Whenever the CONSUMING segment completes, the completed segment will be assigned to the right instances, and the new CONSUMING segment will also be started on the correct instances. If you choose to includeConsuming=false and let the segments move later on, any downsized nodes need to remain untagged in the cluster, until the segment completion happens.

|
-| downtime | false |

This controls whether Pinot allows downtime while rebalancing.
If downtime = true, all replicas of a segment can be moved around in one go, which could result in a momentary downtime for that segment (time gap between ideal state updated to new servers and new servers downloading the segments).
If downtime = false, Pinot will make sure to keep certain number of replicas (config in next row) always up. The rebalance will be done in multiple iterations under the hood, in order to fulfill this constraint.

Note: If you have only 1 replica for your table, rebalance with downtime=false is not possible.

|
+| downtime | false |

This controls whether Pinot allows downtime while rebalancing.
If downtime = true, all replicas of a segment can be moved in one go, which could result in momentary downtime for that segment (the gap between the ideal state being updated to the new servers and the new servers finishing the segment download).
If downtime = false, Pinot keeps a minimum number of replicas (configured in the next row) up at all times. The rebalance is done in multiple iterations under the hood in order to fulfill this constraint.

Note: If you have only 1 replica for your table, see [the section below](#rebalance-with-only-1-replica).

|
| minAvailableReplicas | 1 |

Applicable for rebalance with downtime=false.

This is the minimum number of replicas that are expected to stay alive through the rebalance.

|
| bestEfforts | false |

Applicable for rebalance with downtime=false.

If a no-downtime rebalance cannot be performed successfully, this flag controls whether to fail the rebalance or do a best-effort rebalance.

|
| reassignInstances | false | Applicable to tables where the instance assignment has been persisted to zookeeper. Setting this to true will make the rebalance **first update the instance assignment, and then rebalance the segments**. |
| bootstrap | false | Rebalances all segments again, **as if adding segments to an empty table**. If this is false, then the rebalance will try to minimize segment movements. |

+### Rebalance with only 1 replica
+In general, when the table config uses only 1 replica, availability may be affected during a rebalance.
+For this reason, creating tables with only 1 replica is not recommended where availability is mandatory.
+When rebalancing a table whose replica count is 1, assume that short downtime windows are possible even if `downtime` is false.
+
+Note that a rebalance may also be executed in order to change the replica count.
+In that case, the replica count that matters is the final one.
+Specifically, when increasing the replica count from 1 to a higher value with `downtime` set to false, the rebalance is done with no downtime.
+
### Checking status

The following API is used to check the progress of a rebalance Job. The API takes the jobId of the rebalance job. The API to see the jobIds of rebalance Jobs for a table is shown next.