Syncing from upstream patroni/patroni (feature/quorum-commit) #430

bt-admin · 2024-07-31T01:06:26Z

bt_gitbot

This constant was imported in `postgresql/__init__.py` and used in the `can_advance_slots` property. However, after the refactoring in #2958 we pass around a reference to `Postgresql` instead of `major_version`, so we can rely on the `can_advance_slots` property instead of reimplementing its logic in other places.

Let's consider the following replication setup:

```
primary->standby1->standby2(replicatefrom: standby1)
```

In this case the `primary` will not create a physical replication slot for `standby2`, because it is streaming from `standby1`.

Things look different if we have the following dynamic configuration:

```yaml
slots:
  primary:
    type: physical
  standby1:
    type: physical
  standby2:
    type: physical
```

In this case the `primary` will also have a `standby2` physical replication slot, which must be advanced periodically. So far this worked by taking the value of `xlog_location` from the `/members/standby2` key in DCS. However, when DCS is down and failsafe mode is activated, the `standby2` physical slot on the `primary` will not be moved, because there is no way to get the latest value of `xlog_location`.

This PR addresses the problem by making replica nodes return their `xlog_location` as the `lsn` header in the response to the `POST /failsafe` REST API request. The current primary will use these values to advance replication slots for nodes with the `replicatefrom` tag.
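A minimal sketch of the idea, not the actual Patroni code: the primary collects the `lsn` header from each `POST /failsafe` response and works out which physical slots should be moved forward. The function names, the assumption that the header carries the LSN as a decimal integer, and the assumption that a slot is named after its member are all hypothetical and only serve the illustration.

```python
from typing import Dict, Mapping, Optional


def parse_lsn_header(headers: Mapping[str, str]) -> Optional[int]:
    """Return the LSN from the `lsn` response header, or None if absent/invalid.

    Assumption: the header value is a decimal integer.
    """
    value = headers.get('lsn')
    if value is None:
        return None
    try:
        return int(value)
    except ValueError:
        return None


def slots_to_advance(responses: Mapping[str, Mapping[str, str]],
                     replicatefrom: Mapping[str, Optional[str]],
                     current_slots: Mapping[str, int]) -> Dict[str, int]:
    """Map slot name -> target LSN for slots that should be advanced.

    responses:     member name -> headers of its POST /failsafe response
    replicatefrom: member name -> value of its `replicatefrom` tag (or None)
    current_slots: slot name -> current position of the slot on the primary
    Assumption: the slot name equals the member name.
    """
    targets: Dict[str, int] = {}
    for member, headers in responses.items():
        # Only cascading replicas are handled here; direct replicas keep
        # their slot moving by streaming from the primary.
        if not replicatefrom.get(member):
            continue
        lsn = parse_lsn_header(headers)
        current = current_slots.get(member)
        if lsn is None or current is None:
            continue
        # Never move a slot backwards.
        if lsn > current:
            targets[member] = lsn
    return targets
```

With the topology above, only `standby2` has the `replicatefrom` tag, so only its slot would be selected, and only when the reported LSN is ahead of the slot's current position.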


It could happen that "something" is streaming from the current primary node with an `application_name` that matches the name of the current primary, for instance due to a faulty configuration. When processing `pg_stat_replication` we only checked that the `application_name` matches the name of one of the member nodes, but we forgot to exclude our own name.

As a result there were the following side effects:

1. The current primary could be declared as a synchronous node.
2. As a result of [1] it wasn't possible to do a switchover.
3. During shutdown the current primary was waiting for itself to be released from the synchronous nodes.

Close #3111
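The fix can be illustrated with a small sketch. It is not the Patroni implementation; `replication_candidates` and the row layout (a dict per `pg_stat_replication` row) are hypothetical, and matching is done on the exact name for simplicity.

```python
from typing import Dict, Iterable, List


def replication_candidates(rows: Iterable[Dict[str, str]],
                           member_names: Iterable[str],
                           my_name: str) -> List[Dict[str, str]]:
    """Keep only rows whose application_name is a cluster member other than us."""
    # Excluding our own name is the step that was missing before the fix.
    others = {name for name in member_names if name != my_name}
    return [row for row in rows if row['application_name'] in others]
```

A connection that reports the primary's own name is then ignored, so it can never be picked as a synchronous node.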


CyberDem0n added 7 commits July 17, 2024 09:41

Refactor update_leader() method (#3107)

c633923

Pass the `Cluster` object instead of `Leader`. This will help to implement a new feature, "Configurable retention of replication slots for cluster members". Besides that, fix a couple of issues with docstrings.

Patroni doesn't force wal_log_hints anymore (#3109)

4456e26

We forgot to update it in #3063

Ignore restapi.allowlist_include_members for POST /failsafe (#3113)

ab9faf9

If only the leader can't access DCS, its member key will expire and `POST /failsafe` requests might be rejected because of that. Close #3096
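A rough sketch of such an access check, under stated assumptions: `is_request_allowed` and its parameters are hypothetical, and the behaviour shown (the static allowlist still applies to `POST /failsafe`, while member-derived addresses are not consulted) is an interpretation of the commit title, not a copy of the Patroni code.

```python
from ipaddress import ip_address, ip_network
from typing import Iterable


def is_request_allowed(client_ip: str,
                       allowlist: Iterable[str],
                       member_ips: Iterable[str],
                       include_members: bool,
                       path: str,
                       method: str) -> bool:
    """Decide whether a request passes the IP-based access check."""
    # For POST /failsafe the member list from DCS is not trusted: the
    # leader's own member key may already have expired.
    check_members = include_members and not (method == 'POST' and path == '/failsafe')

    networks = [ip_network(net, strict=False) for net in allowlist]
    if check_members:
        networks += [ip_network(ip) for ip in member_ips]

    # No static allowlist and no member check: nothing restricts the request.
    if not networks and not check_members:
        return True

    addr = ip_address(client_ip)
    return any(addr in net for net in networks)
```

In this sketch a leader whose address is no longer among the known members can still reach `POST /failsafe`, while other endpoints keep rejecting unknown addresses.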

Merge branch 'master' of github.com:patroni/patroni into feature/quor…

4570b74

…um-commit

bt-admin added the feature/quorum-commit label Jul 31, 2024

bt-admin merged commit d3ef172 into brain-tec:feature/quorum-commit Jul 31, 2024
20 checks passed
