Merge pull request #6394 from EnterpriseDB/DOCS-291-comparison-matrix-for-pglogical2-vs-logical-repli

Docs 291 - Add PGD comparison matrix
djw-m authored Jan 9, 2025
2 parents b302fa5 + b402854 commit f48fc7f
Showing 5 changed files with 594 additions and 115 deletions.
6 changes: 5 additions & 1 deletion advocacy_docs/pg_extensions/pg_tuner/using.mdx
@@ -118,6 +118,7 @@ Before a query is planned and executed, EDB Postgres Tuner will check the hash t
```bash
new_work_mem = ceil(max(1.75 * previous_sort_spill, 5.0 * previous_hash_spill))
```

- This calculation aims to allocate sufficient memory to avoid disk spills for the current execution.
- In-memory sorts and hash aggregates require more memory than when run with disk spills.
- While calculating the exact amount is non-trivial (if possible at all), the values of 1.75 and 5 have worked well in test cases. A worked example of the arithmetic follows this list.
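
For a concrete sense of the formula, here's a minimal worked example run as plain SQL with invented spill values (a 12000 sort spill and a 3000 hash spill, assuming the spills are tracked in kilobytes like `work_mem`). It only illustrates the arithmetic; the extension performs this calculation internally. SQL has no two-argument scalar `max`, so `greatest()` stands in for `max` here:

```sql
-- Invented spill values: previous_sort_spill = 12000, previous_hash_spill = 3000.
-- greatest() plays the role of max in the formula above.
SELECT ceil(greatest(1.75 * 12000, 5.0 * 3000)) AS new_work_mem;
--  new_work_mem
-- --------------
--         21000
-- (1 row)
```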
@@ -133,7 +134,9 @@ new_work_mem = ceil(max(1.75 * previous_sort_spill, 5.0 * previous_hash_spill))

Because `work_mem` is allocated from a pool on a per-query basis, based on the highest disk spill previously seen for the query, memory can be overallocated: each query might use that amount of memory in each of multiple sort or hash aggregate nodes in the execution plan.
!!!

- The following example shows `work_mem` automatically bumped based on a disk spill for the previous query execution:

```sql
SET client_min_messages='debug2';
SET
@@ -220,7 +223,7 @@ rows=2333333 loops=3)
(11 rows)
```

## Logging

For Postgres 14 and higher, you can use EDB Postgres Tuner to log query statistics.

@@ -297,3 +300,4 @@ ERROR: "edb_pg_tuner_global_stats" is not supported in this version
postgres=# SELECT * FROM edb_pg_tuner_query_stats();
ERROR: "edb_pg_tuner_query_stats" is not supported in this version
```
!!!
@@ -0,0 +1,53 @@
---
title: "PGD overview - architecture and performance"
navTitle: Architecture and performance
description: EDB Postgres Distributed architectural options and performance characteristics, including always-on architectures, supported Postgres database servers, and characteristics affecting performance
deepToC: true
redirects:
- bdr
---

## Architectural options and performance

### Always-on architectures

A number of different architectures can be configured, each of which has different performance and scalability characteristics.

The group is the basic building block consisting of 2+ nodes (servers). In a group, each node is in a different availability zone, with a dedicated router and backup, giving immediate switchover and high availability. Each group has a dedicated replication set defined on it. If the group loses a node, you can easily repair or replace it by copying an existing node from the group.

The Always-on architectures are built from either one group in a single location or two groups in two separate locations. Each group provides high availability. When two groups are leveraged in remote locations, they together also provide disaster recovery (DR).

Tables are created across both groups, so any change goes to all nodes, not just to nodes in the local group.

One node in each group is selected as the group's write leader. Proxies then direct application writes and queries to the write leader. The other nodes are replicas of the write leader. If, at any point, the write leader is seen to be unavailable, the remaining nodes in the group select a new write leader, and the proxies direct traffic to that node. Scalability isn't the goal of this architecture.

Since writes go mainly to only one node, the possibility of contention between nodes is reduced to almost zero. As a result, the performance impact is much reduced.

Secondary applications might execute against the shadow nodes, although these workloads might be reduced or interrupted if the main application begins using that node.

In the future, one node will be elected as the main replicator to other groups, limiting CPU overhead of replication as the cluster grows and minimizing the bandwidth to other groups.

### Supported Postgres database servers

PGD is compatible with [PostgreSQL](https://www.postgresql.org/), [EDB Postgres Extended Server](/pge/latest), and [EDB Postgres Advanced Server](/epas/latest) and is deployed as a standard Postgres extension named BDR. See [Compatibility](../#compatibility) for details about supported version combinations.

Some key PGD features depend on certain core capabilities being available in the target Postgres database server. Therefore, PGD users must also adopt the Postgres database server distribution that's best suited to their business needs. For example, if having the PGD feature Commit At Most Once (CAMO) is mission critical to your use case, don't adopt the community PostgreSQL distribution. It doesn't have the core capability required to handle CAMO. See the full feature compatibility matrix in [Choosing a Postgres distribution](../planning/choosing_server/).

PGD offers close-to-native Postgres compatibility. However, some access patterns don't necessarily work as well in a multi-node setup as they do on a single instance. There are also some limitations in what you can safely replicate in a multi-node setting. [Application usage](../appusage) goes into detail about how PGD behaves from an application development perspective.

### Characteristics affecting performance

By default, PGD keeps one copy of each table on each node in the group, and any changes propagate to all nodes in the group.

Since copies of data are everywhere, SELECTs need only ever access the local node. On a read-only cluster, performance on any one node isn't affected by the number of nodes and is immune to replication conflicts on other nodes caused by long-running SELECT queries. Thus, adding nodes increases the total possible SELECT throughput linearly.

If INSERT, UPDATE, or DELETE (DML) operations are performed locally, then the changes propagate to all nodes in the group. The overhead of applying DML is less than that of the original execution, so if you run a pure write workload on multiple nodes concurrently, a multi-node cluster can handle more TPS than a single node.

Conflict handling has a cost that acts to reduce the throughput. The throughput then depends on how much contention the application displays in practice. Applications with very low contention perform better than a single node. Applications with high contention can perform worse than a single node. These results are consistent with any multimaster technology and aren't particular to PGD.

Synchronous replication options can send changes concurrently to multiple nodes so that the replication lag is minimized. Adding more nodes means using more CPU for replication, so peak TPS reduces slightly as each node is added.

If the workload tries to use all CPU resources, then CPU becomes a constraint on replication, which can then affect the replication lag.

In summary, adding more master nodes to a PGD group doesn't result in a significant write throughput increase when most tables are replicated, because all the writes are replayed on all nodes. Because applying replicated PGD writes is generally cheaper than executing the original SQL arriving from Postgres clients, some increase in performance is still possible. Read throughput generally scales linearly with the number of nodes.

64 changes: 64 additions & 0 deletions product_docs/docs/pgd/5.6/overview/basic-architecture.mdx
@@ -0,0 +1,64 @@
---
title: "PGD Overview - PGD's basic architecture"
navTitle: Basic architecture
description: An overview of EDB Postgres Distributed's basic architecture, including groups, multiple masters, mesh topology, logical replication, connection management, and high availability
deepToC: true
redirects:
- bdr
---

EDB Postgres Distributed (PGD) provides multi-master replication and data distribution with advanced conflict management, data-loss protection, and [throughput up to 5X faster than native logical replication](https://www.enterprisedb.com/blog/performance-improvements-edb-postgres-distributed). It also enables distributed Postgres clusters with high availability up to five 9s.

PGD provides loosely coupled, multimaster logical replication using a mesh topology. This means that you can write to any server and the changes are sent directly, row-by-row, to all the other servers that are part of the same PGD group.

By default, PGD uses asynchronous replication, applying changes on the peer nodes only after the local commit. Multiple synchronous replication options are also available.

## Basic architecture

### Multiple groups

A PGD node is a member of at least one *node group*. In the most basic architecture, there's a single node group for the whole PGD cluster.
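
To make the group concept concrete, here's a minimal sketch of how a node group is typically established with SQL, assuming the BDR extension is already installed on each node. The node names, group name, and connection strings are hypothetical, and exact function signatures vary between PGD releases, so check the PGD reference for your version before relying on them:

```sql
-- On the first node: register the local node, then create the top-level group.
SELECT bdr.create_node(
    node_name := 'node-one',                              -- hypothetical name
    local_dsn := 'host=node-one port=5432 dbname=appdb'   -- hypothetical DSN
);
SELECT bdr.create_node_group(node_group_name := 'pgd-group');

-- On each additional node: register the local node, then join the existing
-- group by pointing at a node that's already a member.
SELECT bdr.create_node(
    node_name := 'node-two',
    local_dsn := 'host=node-two port=5432 dbname=appdb'
);
SELECT bdr.join_node_group(
    join_target_dsn := 'host=node-one port=5432 dbname=appdb'
);
```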

### Multiple masters

Each node (database) participating in a PGD group both receives changes from other members and can be written to directly by the user.

This is distinct from hot or warm standby, where only one master server accepts writes and all the other nodes are standbys that replicate either from the master or from another standby.

You don't have to write to all the masters all of the time. A frequent configuration directs writes mostly to just one master called the [write leader](../terminology/#write-leader).

### Asynchronous, by default

Changes made on one PGD node aren't replicated to other nodes until they're committed locally. As a result, the data isn't exactly the same on all nodes at any given time. Some nodes have data that hasn't yet arrived at other nodes. PostgreSQL's block-based replication solutions default to asynchronous replication as well. In PGD, there are multiple masters and, as a result, multiple data streams. So data on different nodes might differ even when `synchronous_commit` and `synchronous_standby_names` are used.

### Mesh topology

PGD is structured around a mesh network where every node connects to every other node, and all nodes exchange data directly with each other. There's no forwarding of data in PGD except in special circumstances, such as adding and removing nodes. Data can arrive from outside the EDB Postgres Distributed cluster or be sent onward using native PostgreSQL logical replication.

### Logical replication

Logical replication is a method of replicating data rows and their changes based on their replication identity (usually a primary key). We use the term logical in contrast to physical replication, which uses exact block addresses and byte-by-byte replication. Index changes aren't replicated, thereby avoiding write amplification and reducing bandwidth.

Logical replication starts by copying a snapshot of the data from the source node. Once that's done, later commits are sent to other nodes as they occur in real time. Changes are replicated without executing SQL again, so the exact data written is replicated quickly and accurately.

Nodes apply data in the order in which commits were made on the source node, so transactional consistency is guaranteed for the changes from any single node. Changes from different nodes are applied independently of other nodes to ensure the rapid replication of changes.

Replicated data is sent in binary form when it's safe to do so.
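
PGD's replication layer is its own implementation, but the underlying idea is the same one used by native PostgreSQL logical replication, which the mesh can also receive data from or send data to (see Mesh topology above). A minimal native sketch, with a hypothetical table and connection string, shows replication identity and the publish/subscribe flow:

```sql
-- On the source node: rows are matched on their replication identity,
-- which defaults to the primary key.
CREATE TABLE accounts (id bigint PRIMARY KEY, balance numeric);
CREATE PUBLICATION accounts_pub FOR TABLE accounts;

-- The replication identity can also be set explicitly to a unique index.
ALTER TABLE accounts REPLICA IDENTITY USING INDEX accounts_pkey;

-- On the receiving node: an initial snapshot is copied, then ongoing row
-- changes are applied in commit order without re-executing the original SQL.
CREATE TABLE accounts (id bigint PRIMARY KEY, balance numeric);
CREATE SUBSCRIPTION accounts_sub
    CONNECTION 'host=source-node dbname=appdb'   -- hypothetical DSN
    PUBLICATION accounts_pub;
```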


### Connection management

[Connection management](../routing) leverages consensus-driven quorum to determine the correct connection endpoint in a semi-exclusive manner to prevent unintended multi-node writes from an application. This approach reduces the potential for data conflicts. The node selected as the correct connection endpoint at any point in time is referred to as the [write leader](../terminology/#write-leader).

[PGD Proxy](../routing/proxy) is the tool for application connection management provided as part of EDB Postgres Distributed.

### High availability

Each master node can be protected by one or more standby nodes, so any node that goes down can be quickly replaced and continue. Each standby node is a logical standby node.
(Postgres physical standbys aren't supported by PGD.)

Replication continues between currently connected nodes even if one or more nodes are currently unavailable. When the node recovers, replication can restart from where it left off without missing any changes.

Nodes can run different release levels, negotiating the required protocols to communicate. As a result, EDB Postgres Distributed clusters can use rolling upgrades, even for [major versions](../upgrades/upgrading_major_rolling/) of database software.

DDL is replicated across nodes by default. DDL execution can be user controlled to allow rolling application upgrades, if desired.
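
As a small, hypothetical sketch of what default DDL replication means in practice (the table is invented):

```sql
-- Run on any one PGD node. With DDL replication enabled (the default),
-- the statement is captured and replayed on every other node in the group,
-- so no per-node action is needed.
CREATE TABLE audit_log (
    id          bigint PRIMARY KEY,
    recorded_at timestamptz NOT NULL DEFAULT now(),
    detail      text
);

-- Shortly afterwards, the same table definition is visible on the other
-- nodes, for example via \d audit_log in psql on each of them.
```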