Syncing from upstream patroni/patroni (master) #435

Merged 6 commits on Aug 14, 2024
17 changes: 17 additions & 0 deletions .github/workflows/tests.yaml
@@ -211,3 +211,20 @@ jobs:

      - name: Generate documentation
        run: tox -m docs

  isort:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Python 3.12
        uses: actions/setup-python@v5
        with:
          python-version: 3.12
          cache: pip

      - name: isort
        uses: isort/isort-action@master
        with:
          requirementsFiles: "requirements.txt requirements.dev.txt requirements.docs.txt"
          sort-paths: "patroni tests features setup.py"
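
To reproduce this check locally, a rough equivalent of the job above might look as follows (the isort flags are assumed from its standard CLI rather than taken from this PR; the paths and requirements files mirror the action inputs)::

    # install isort plus the project requirements so first-party imports are classified correctly
    pip install isort -r requirements.txt -r requirements.dev.txt -r requirements.docs.txt
    # --check-only fails on unsorted imports without rewriting files; --diff prints the proposed changes
    isort --check-only --diff patroni tests features setup.py
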
1 change: 1 addition & 0 deletions docs/ENVIRONMENT.rst
@@ -28,6 +28,7 @@ Log
- **PATRONI\_LOG\_STATIC\_FIELDS**: add additional fields to the log. This option is only available when the log type is set to **json**. Example ``PATRONI_LOG_STATIC_FIELDS="{app: patroni}"``
- **PATRONI\_LOG\_MAX\_QUEUE\_SIZE**: Patroni uses two-step logging: log records are first written to an in-memory queue, and a separate thread pulls them from the queue and writes them to stderr or to a file. The maximum size of this internal queue defaults to **1000** records, which is enough to keep logs for the past 1h20m.
- **PATRONI\_LOG\_DIR**: Directory to write application logs to. The directory must exist and be writable by the user executing Patroni. If you set this environment variable, the application will retain four 25 MB log files by default. You can tune those retention values with `PATRONI_LOG_FILE_NUM` and `PATRONI_LOG_FILE_SIZE` (see below).
- **PATRONI\_LOG\_MODE**: Permissions for log files (for example, ``0644``). If not specified, permissions will be set based on the current umask value.
- **PATRONI\_LOG\_FILE\_NUM**: The number of application logs to retain.
- **PATRONI\_LOG\_FILE\_SIZE**: Size of the patroni.log file (in bytes) that triggers log rolling.
- **PATRONI\_LOG\_LOGGERS**: Redefine logging level per python module. Example ``PATRONI_LOG_LOGGERS="{patroni.postmaster: WARNING, urllib3: DEBUG}"``
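
A minimal sketch combining several of these variables (the directory, permissions, and size values below are illustrative, not defaults)::

    export PATRONI_LOG_DIR=/var/log/patroni      # must exist and be writable by the user running Patroni
    export PATRONI_LOG_MODE=0600                 # permissions applied to created log files
    export PATRONI_LOG_FILE_NUM=10               # keep 10 rotated log files
    export PATRONI_LOG_FILE_SIZE=26214400        # roll the log after 25 MB (value is in bytes)
    export PATRONI_LOG_LOGGERS="{patroni.postmaster: WARNING, urllib3: DEBUG}"
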
106 changes: 54 additions & 52 deletions docs/citus.rst
@@ -34,6 +34,8 @@ There are only a few simple rules you need to follow:

After that you just need to start Patroni and it will handle the rest:

0. Patroni will set ``bootstrap.dcs.synchronous_mode`` to :ref:`quorum <quorum_mode>`
if it is not explicitly set to any other value.
1. The ``citus`` extension will be automatically added to ``shared_preload_libraries``.
2. If ``max_prepared_transactions`` isn't explicitly set in the global
:ref:`dynamic configuration <dynamic_configuration>` Patroni will
@@ -77,36 +79,36 @@ It results in two major differences in :ref:`patronictl` behaviour when
An example of :ref:`patronictl_list` output for the Citus cluster::

postgres@coord1:~$ patronictl list demo
+ Citus cluster: demo ----------+--------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+--------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Sync Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Sync Standby | running | 1 | 0 |
| 1 | work1-2 | 172.27.0.2 | Leader | running | 1 | |
| 2 | work2-1 | 172.27.0.5 | Sync Standby | running | 1 | 0 |
| 2 | work2-2 | 172.27.0.7 | Leader | running | 1 | |
+-------+---------+-------------+--------------+---------+----+-----------+
+ Citus cluster: demo ----------+----------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+----------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Quorum Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Quorum Standby | running | 1 | 0 |
| 1 | work1-2 | 172.27.0.2 | Leader | running | 1 | |
| 2 | work2-1 | 172.27.0.5 | Quorum Standby | running | 1 | 0 |
| 2 | work2-2 | 172.27.0.7 | Leader | running | 1 | |
+-------+---------+-------------+----------------+---------+----+-----------+

If we add the ``--group`` option, the output will change to::

postgres@coord1:~$ patronictl list demo --group 0
+ Citus cluster: demo (group: 0, 7179854923829112860) -----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+-------------+--------------+---------+----+-----------+
| coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| coord2 | 172.27.0.6 | Sync Standby | running | 1 | 0 |
| coord3 | 172.27.0.4 | Leader | running | 1 | |
+--------+-------------+--------------+---------+----+-----------+
+ Citus cluster: demo (group: 0, 7179854923829112860) -+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+--------+-------------+----------------+---------+----+-----------+
| coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| coord2 | 172.27.0.6 | Quorum Standby | running | 1 | 0 |
| coord3 | 172.27.0.4 | Leader | running | 1 | |
+--------+-------------+----------------+---------+----+-----------+

postgres@coord1:~$ patronictl list demo --group 1
+ Citus cluster: demo (group: 1, 7179854923881963547) -----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+------------+--------------+---------+----+-----------+
| work1-1 | 172.27.0.8 | Sync Standby | running | 1 | 0 |
| work1-2 | 172.27.0.2 | Leader | running | 1 | |
+---------+------------+--------------+---------+----+-----------+
+ Citus cluster: demo (group: 1, 7179854923881963547) -+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+------------+----------------+---------+----+-----------+
| work1-1 | 172.27.0.8 | Quorum Standby | running | 1 | 0 |
| work1-2 | 172.27.0.2 | Leader | running | 1 | |
+---------+------------+----------------+---------+----+-----------+

Citus worker switchover
-----------------------
@@ -122,28 +124,28 @@ new primary worker node is ready to accept read-write queries.
An example of :ref:`patronictl_switchover` on the worker cluster::

postgres@coord1:~$ patronictl switchover demo
+ Citus cluster: demo ----------+--------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+--------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Sync Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Leader | running | 1 | |
| 1 | work1-2 | 172.27.0.2 | Sync Standby | running | 1 | 0 |
| 2 | work2-1 | 172.27.0.5 | Sync Standby | running | 1 | 0 |
| 2 | work2-2 | 172.27.0.7 | Leader | running | 1 | |
+-------+---------+-------------+--------------+---------+----+-----------+
+ Citus cluster: demo ----------+----------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+----------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Quorum Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Leader | running | 1 | |
| 1 | work1-2 | 172.27.0.2 | Quorum Standby | running | 1 | 0 |
| 2 | work2-1 | 172.27.0.5 | Quorum Standby | running | 1 | 0 |
| 2 | work2-2 | 172.27.0.7 | Leader | running | 1 | |
+-------+---------+-------------+----------------+---------+----+-----------+
Citus group: 2
Primary [work2-2]:
Candidate ['work2-1'] []:
When should the switchover take place (e.g. 2022-12-22T08:02 ) [now]:
Current cluster topology
+ Citus cluster: demo (group: 2, 7179854924063375386) -----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+------------+--------------+---------+----+-----------+
| work2-1 | 172.27.0.5 | Sync Standby | running | 1 | 0 |
| work2-2 | 172.27.0.7 | Leader | running | 1 | |
+---------+------------+--------------+---------+----+-----------+
+ Citus cluster: demo (group: 2, 7179854924063375386) -+-----------+
| Member | Host | Role | State | TL | Lag in MB |
+---------+------------+----------------+---------+----+-----------+
| work2-1 | 172.27.0.5 | Quorum Standby | running | 1 | 0 |
| work2-2 | 172.27.0.7 | Leader | running | 1 | |
+---------+------------+----------------+---------+----+-----------+
Are you sure you want to switchover cluster demo, demoting current primary work2-2? [y/N]: y
2022-12-22 07:02:40.33003 Successfully switched over to "work2-1"
+ Citus cluster: demo (group: 2, 7179854924063375386) ------+
@@ -154,17 +156,17 @@ An example of :ref:`patronictl_switchover` on the worker cluster::
+---------+------------+---------+---------+----+-----------+

postgres@coord1:~$ patronictl list demo
+ Citus cluster: demo ----------+--------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+--------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Sync Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Leader | running | 1 | |
| 1 | work1-2 | 172.27.0.2 | Sync Standby | running | 1 | 0 |
| 2 | work2-1 | 172.27.0.5 | Leader | running | 2 | |
| 2 | work2-2 | 172.27.0.7 | Sync Standby | running | 2 | 0 |
+-------+---------+-------------+--------------+---------+----+-----------+
+ Citus cluster: demo ----------+----------------+---------+----+-----------+
| Group | Member | Host | Role | State | TL | Lag in MB |
+-------+---------+-------------+----------------+---------+----+-----------+
| 0 | coord1 | 172.27.0.10 | Replica | running | 1 | 0 |
| 0 | coord2 | 172.27.0.6 | Quorum Standby | running | 1 | 0 |
| 0 | coord3 | 172.27.0.4 | Leader | running | 1 | |
| 1 | work1-1 | 172.27.0.8 | Leader | running | 1 | |
| 1 | work1-2 | 172.27.0.2 | Quorum Standby | running | 1 | 0 |
| 2 | work2-1 | 172.27.0.5 | Leader | running | 2 | |
| 2 | work2-2 | 172.27.0.7 | Quorum Standby | running | 2 | 0 |
+-------+---------+-------------+----------------+---------+----+-----------+

And this is how it looks on the coordinator side::

3 changes: 2 additions & 1 deletion docs/dynamic_configuration.rst
@@ -25,8 +25,9 @@ In order to change the dynamic configuration you can use either :ref:`patronictl
- **max\_timelines\_history**: maximum number of timeline history items kept in DCS. Default value: 0. When set to 0, it keeps the full history in DCS.
- **primary\_start\_timeout**: the amount of time a primary is allowed to recover from failures before failover is triggered (in seconds). Default is 300 seconds. When set to 0 failover is done immediately after a crash is detected if possible. When using asynchronous replication a failover can cause lost transactions. Worst case failover time for primary failure is: loop\_wait + primary\_start\_timeout + loop\_wait, unless primary\_start\_timeout is zero, in which case it's just loop\_wait. Set the value according to your durability/availability tradeoff.
- **primary\_stop\_timeout**: The number of seconds Patroni is allowed to wait when stopping Postgres and effective only when synchronous_mode is enabled. When set to > 0 and the synchronous_mode is enabled, Patroni sends SIGKILL to the postmaster if the stop operation is running for more than the value set by primary\_stop\_timeout. Set the value according to your durability/availability tradeoff. If the parameter is not set or set <= 0, primary\_stop\_timeout does not apply.
- **synchronous\_mode**: turns on synchronous replication mode. In this mode a replica will be chosen as synchronous and only the latest leader and synchronous replica are able to participate in leader election. Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. See :ref:`replication modes documentation <replication_modes>` for details.
- **synchronous\_mode**: turns on synchronous replication mode. Possible values: ``off``, ``on``, ``quorum``. In this mode the leader takes care of managing ``synchronous_standby_names``, and only the last known leader or one of the synchronous replicas is allowed to participate in the leader race (see the example after this list). Synchronous mode makes sure that successfully committed transactions will not be lost at failover, at the cost of losing availability for writes when Patroni cannot ensure transaction durability. See :ref:`replication modes documentation <replication_modes>` for details.
- **synchronous\_mode\_strict**: prevents disabling synchronous replication if no synchronous replicas are available, blocking all client writes to the primary. See :ref:`replication modes documentation <replication_modes>` for details.
- **synchronous\_node\_count**: if ``synchronous_mode`` is enabled, Patroni uses this parameter to manage the precise number of synchronous standby instances, adjusting the state in DCS and the ``synchronous_standby_names`` parameter in PostgreSQL as members join and leave. If the parameter is set to a value higher than the number of eligible nodes, it will be automatically adjusted. Defaults to ``1``.
- **failsafe\_mode**: Enables :ref:`DCS Failsafe Mode <dcs_failsafe_mode>`. Defaults to `false`.
- **postgresql**:
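
As a hedged illustration only: on a running cluster the synchronous replication settings above could be changed through the dynamic configuration with ``patronictl edit-config`` (the cluster name ``demo`` and the node count are placeholders, and the ``--set`` shorthand is assumed from patronictl's CLI)::

    patronictl edit-config demo --set synchronous_mode=quorum \
                                --set synchronous_node_count=2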
