Docs 824 arch overview #6405

Open · wants to merge 12 commits into base: develop
99 changes: 68 additions & 31 deletions product_docs/docs/pgd/5.6/overview/basic-architecture.mdx
@@ -1,64 +1,101 @@
---
title: "PGD Overview - PGD's basic architecture"
navTitle: Basic architecture
description: An overview of EDB Postgres Distributed's basic architecture, including groups, multiple masters, mesh topology, logical replication, connection management, and high availability.
title: "Architectural overview"
description: An overview of EDB Postgres Distributed architecture
deepToC: true
redirects:
- bdr
---

EDB Postgres Distributed (PGD) provides multi-master replication and data distribution with advanced conflict management, data-loss protection, and [throughput up to 5X faster than native logical replication](https://www.enterprisedb.com/blog/performance-improvements-edb-postgres-distributed). It also enables distributed Postgres clusters with high availability up to five 9s.
<div class="container mt-4 mb-4 border border-primary">
<div class="row bg-light">
<div class="col-12 text-center px-3 py-3">
Read about why PostgreSQL is better when it’s distributed with EDB Postgres Distributed in <a href="https://www.enterprisedb.com/distributed-postgresql-always-on-database-availability">Distributed PostgreSQL: The Key to Always On Database Availability</a>
</div>
</div>
<div class="row bg-light">
<div class="col-6 px-3 py-3 d-flex justify-content-center align-items-center">
<a href="https://www.enterprisedb.com/products/edb-postgres-distributed" class="btn btn-lg btn-primary text-light px-4 text-nowrap text-center" title="Get a free trial of PGD">PGD Free Trial</a>
</div>
<div class="col-6 px-3 py-3 d-flex justify-content-center align-items-center">
<a href="https://www.enterprisedb.com/contact" class="btn btn-lg btn-primary text-light px-4 text-nowrap text-center" title="Contact sales with any questions">Contact Sales</a>
</div>
</div>
</div>

PGD provides loosely coupled, multimaster logical replication using a mesh topology. This means that you can write to any server and the changes are sent directly, row by row, to all the other servers that are part of the same PGD group.

By default, PGD uses asynchronous replication, applying changes on the peer nodes only after the local commit. Multiple synchronous replication options are also available.
EDB Postgres Distributed (PGD) is a distributed database solution that extends PostgreSQL's capabilities, enabling highly available and fault-tolerant database deployments across multiple nodes.
PGD provides data distribution with advanced conflict management, data-loss protection, high availability up to five 9s, and throughput up to 5X faster than native logical replication.

## Basic architecture
PGD is built on a multi-master foundation (Bi-Directional Replication, or BDR), which is then optimized for performance and availability through PGD Proxy.
PGD Proxy reduces contention and conflict by routing writes through a write leader. Each proxy instance exposes a single endpoint that automatically addresses all the data nodes in a group, so clients don't need to round-robin multi-host connection strings.
[Raft](https://en.wikipedia.org/wiki/Raft_(algorithm)) is implemented to help the system make important decisions, such as which node is the Raft election leader and which node is the write leader.

### Multiple groups
## High-level architecture

A PGD node is a member of at least one *node group*. In the most basic architecture, there's a single node group for the whole PGD cluster.
At the highest level, PGD comprises two main components: Bi-Directional Replication (BDR) and PGD Proxy.
BDR is a Postgres extension that enables a multi-master replication mesh between different BDR-enabled Postgres instances or nodes.
[PGD Proxy](../routing) routes requests to the write leader, reducing the risk of conflicts between nodes.

### Multiple masters
![Diagram showing 3 application nodes, 3 proxy instances, and 3 PGD nodes. Traffic is being directed from each of the proxy instances to the write leader node.](./img/always_on_1x3_updated.png)

Each node (database) participating in a PGD group both receives changes from other members and can be written to directly by the user.
Changes are replicated directly, row-by-row between all nodes.
[Logical replication](../terminology/#logical-replication) in PGD is asynchronous by default, so only eventual consistency is guaranteed (usually within seconds).
However, [commit scope](../commit-scopes/commit-scopes) options offer immediate consistency and durability guarantees via [CAMO](/pgd/latest/commit-scopes/camo/), [group](../commit-scopes/group-commit), and [synchronous](../commit-scopes/synchronous_commit) commits.

This is distinct from hot or warm standby, where only one master server accepts writes and all the other nodes are standbys that replicate either from the master or from another standby.
The Raft algorithm provides a mechanism for [electing](../routing/raft/04_raft_elections_in_depth/) leaders (both the Raft leader and the write leader), deciding which nodes to add to or remove from the cluster, and generally ensuring that the distributed system remains consistent and fault-tolerant, even in the face of node failures.
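
As a quick way to observe Raft at work, the following is a minimal sketch that inspects consensus state from any node. It assumes the `bdr.group_raft_details` view exposed by PGD 5 and a user with monitoring privileges:

```sql
-- Inspect Raft consensus state from any node in the cluster.
-- Sketch: assumes the bdr.group_raft_details view (PGD 5); the exact
-- columns returned may vary by PGD version.
SELECT * FROM bdr.group_raft_details;
```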

You don't have to write to all the masters all of the time. A frequent configuration directs writes mostly to just one master called the [write leader](../terminology/#write-leader).
## Architectural elements

### Asynchronous, by default
PGD comprises several key architectural elements that work together to provide its distributed database solution:

Changes made on one PGD node aren't replicated to other nodes until they're committed locally. As a result, the data isn't exactly the same on all nodes at any given time. Some nodes have data that hasn't yet arrived at other nodes. PostgreSQL's block-based replication solutions default to asynchronous replication as well. In PGD, there are multiple masters and, as a result, multiple data streams. So data on different nodes might differ even when `synchronous_commit` and `synchronous_standby_names` are used.
- **PGD nodes**: These are individual Postgres instances that store and manage data. They are the basic building blocks of a PGD cluster.

- **Groups**: PGD nodes are organized into [groups](../node_management/groups_and_subgroups), which enhance manageability and high availability. Each group can contain multiple nodes, allowing for redundancy and failover within the group. Groups facilitate organized replication and data consistency among nodes within the same group and across different groups. Each group has its own write leader.

### Mesh topology
- **Replication mechanisms**: PGD's replication mechanisms include Bi-Directional Replication (BDR) for efficient replication across nodes, enabling multi-master replication. BDR supports asynchronous replication by default, but can be configured for varying levels of synchronicity, such as [Group Commit](../commit-scopes/group-commit) or [Synchronous Commit](../commit-scopes/synchronous_commit), to enhance data durability.

PGD is structured around a mesh network where every node connects to every other node, and all nodes exchange data directly with each other. There's no forwarding of data in PGD except in special circumstances, such as adding and removing nodes. Data can arrive from outside the EDB Postgres Distributed cluster or be sent onward using native PostgreSQL logical replication.
- **Monitoring tools**: To monitor performance, health, and usage with PGD, you can use its [built-in command-line interface](../cli) (CLI), which offers several useful commands. For instance, the `pgd show-nodes` command provides a summary of all nodes in the cluster, including their state and status. The `pgd check-health` command checks the health of the cluster, reporting on node accessibility, replication slot health, and other critical metrics. The `pgd show-events` command lists significant events like background worker errors and node membership changes, which helps in tracking the operational status and issues within the cluster. Furthermore, the BDR extension lets you monitor your cluster using SQL via the [`bdr.monitor`](../security/pgd-predefined-roles/#bdr_monitor) role, as in the sketch after this list.
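
For example, a minimal SQL monitoring sketch, assuming a user granted the `bdr_monitor` role and PGD 5's built-in monitoring functions of these names:

```sql
-- Summary of every node in the cluster and its state.
SELECT * FROM bdr.node_summary;

-- Built-in health checks: each returns a status and message per check.
SELECT * FROM bdr.monitor_group_versions();
SELECT * FROM bdr.monitor_group_raft();
SELECT * FROM bdr.monitor_local_replslots();
```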

### Logical replication
### Node types

Logical replication is a method of replicating data rows and their changes based on their replication identity (usually a primary key). We use the term *logical* in contrast to *physical* replication, which uses exact block addresses and byte-by-byte replication. Index changes aren't replicated, thereby avoiding write amplification and reducing bandwidth.
All nodes in PGD are effectively data nodes. They vary only in their purpose in the cluster.

Logical replication starts by copying a snapshot of the data from the source node. Once that's done, later commits are sent to other nodes as they occur in real time. Changes are replicated without executing SQL again, so the exact data written is replicated quickly and accurately.
- **[Data nodes](../nodes/#data-nodes)**: Store and manage data, handle read and write operations, and participate in replication.

Check warning on line 63 in product_docs/docs/pgd/5.6/overview/basic-architecture.mdx (GitHub Actions / check-links, slugCheck): cannot find slug for #data-nodes in product_docs/docs/pgd/5.6/nodes/index.mdx
Nodes apply data in the order in which commits were made on the source node, ensuring transactional consistency is guaranteed for the changes from any single node. Changes from different nodes are applied independently of other nodes to ensure the rapid replication of changes.
There are then three types of node that, although built like data nodes, serve a specific purpose (see the query sketch after this list). These are:

Replicated data is sent in binary form when it's safe to do so.
- **[Subscriber-only nodes](../nodes/subscriber_only/#subscriber-only-nodes)**: Subscribe to changes from data nodes for read-only purposes, used in reporting or analytics.

Check warning on line 67 in product_docs/docs/pgd/5.6/overview/basic-architecture.mdx (GitHub Actions / check-links, slugCheck): cannot find slug for #subscriber-only-nodes in product_docs/docs/pgd/5.6/nodes/subscriber_only/index.mdx

- **[Witness nodes](../nodes/witness_nodes/)**: Participate in the consensus process without storing data, aiding in achieving quorum and maintaining high availability.

### Connection management
- **[Logical standby nodes](../nodes/logical_standby_nodes/)**: Act as standby nodes that can be promoted to data nodes if needed, ensuring high availability and disaster recovery.
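
To check what kind each node is, a minimal query sketch, assuming `bdr.node_summary` exposes a `node_kind_name` column as in PGD 5:

```sql
-- List each node, its group, and its kind (data, witness,
-- subscriber-only, or standby). Sketch: assumes the node_kind_name
-- and node_group_name columns in bdr.node_summary.
SELECT node_name, node_group_name, node_kind_name
FROM bdr.node_summary;
```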

[Connection management](../routing) leverages consensus-driven quorum to determine the correct connection endpoint in a semi-exclusive manner to prevent unintended multi-node writes from an application. This approach reduces the potential for data conflicts. The node selected as the correct connection endpoint at any point in time is referred to as the [write leader](../terminology/#write-leader).
### Node roles

[PGD Proxy](../routing/proxy) is the tool for application connection management provided as part of EDB Postgres Distributed.
Data nodes in a group can also take on specific roles to enable particular features.
These roles are transient and can be transferred to any other capable node in the group if needed.
They include:

### High availability
- **Raft leader**: Arbitrates and manages consensus among a group's nodes.

Each master node can be protected by one or more standby nodes, so any node that goes down can be quickly replaced and continue. Each standby node is a logical standby node.
(Postgres physical standbys aren't supported by PGD.)
- **[Write leader](../terminology/#write-leader)**: Receives all write operations from PGD Proxy.

Replication continues between currently connected nodes even if one or more nodes are currently unavailable. When the node recovers, replication can restart from where it left off without missing any changes.
## Architectural flexibility

Nodes can run different release levels, negotiating the required protocols to communicate. As a result, EDB Postgres Distributed clusters can use rolling upgrades, even for [major versions](../upgrades/upgrading_major_rolling/) of database software.
EDB Postgres Distributed (PGD) offers flexibility in how its architecture can be deployed, maintained, and scaled to meet various performance, availability, and compliance needs.

DDL is replicated across nodes by default. If you want, you can control DDL execution to allow rolling application upgrades.
PGD supports rolling maintenance, including blue/green deployments for both Postgres upgrades and other system- or application-level changes. This ensures that the database remains available during routine tasks such as minor or major version upgrades, schema changes, and vacuuming operations. The system seamlessly switches between active database versions, achieving zero downtime.

PGD provides automatic failover to ensure high availability. If a node in the cluster becomes unavailable, another node automatically takes over its responsibilities, minimizing downtime. Additionally, PGD includes self-healing capabilities, where nodes that have failed or disconnected can automatically reconnect to the cluster and resume normal operations once the issue is resolved.

PGD allows for selective replication, enabling users to replicate only a subset of data to specific nodes. This feature can be used to optimize performance by reducing unnecessary data traffic between nodes or to meet regulatory requirements, such as geographical data restrictions. For instance, a healthcare application might only replicate patient data within a specific region to comply with local data privacy laws.
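
As an illustration, a hedged sketch of selective replication using replication sets. The names (`eu_only`, `patients`, `eu_node_1`) are hypothetical, and it assumes PGD's replication set functions:

```sql
-- Create a replication set that only selected nodes will subscribe to.
-- Sketch: names are illustrative; assumes the PGD replication set API.
SELECT bdr.create_replication_set(set_name := 'eu_only');

-- Publish the sensitive table through this set.
SELECT bdr.replication_set_add_table(
    relation := 'patients',
    set_name := 'eu_only'
);

-- Subscribe a node to the set in addition to the default set
-- (run on the node that should receive the data).
SELECT bdr.alter_node_replication_sets(
    node_name := 'eu_node_1',
    set_names := ARRAY['default', 'eu_only']
);
```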

With commit scopes, PGD also provides configurable durability. Durability can be raised from the default asynchronous behavior and tuned using several kinds of commit scope (a definition sketch follows the list):

- **[Synchronous Commit](../commit-scopes/synchronous_commit)**: Works much like PostgreSQL's `synchronous_commit` option in its underlying operation: it requires writing to at least one other node at COMMIT time but can be tuned to require all nodes.

- **[CAMO](../commit-scopes/camo)** (Commit At Most Once): Works by tracking each transaction with a unique ID and using a pair of nodes to confirm the transaction's outcome, ensuring the application knows whether or not to retry the transaction.

- **[Group Commit](../commit-scopes/group-commit)**: An experimental commit scope whose goal is to protect against data loss in case of single-node failures or temporary outages by requiring more than one PGD node to successfully confirm a transaction at COMMIT time.

- **[Lag Control](../commit-scopes/lag-control)**: If replication runs outside of set limits (replication to another node takes too long), a delay is injected into the node that originally received the transaction, slowing incoming writes until the other nodes catch up.
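
For instance, a minimal sketch of defining and using a Group Commit scope, assuming a node group named `dc1`, PGD 5's `bdr.add_commit_scope` function, and the `bdr.commit_scope` setting:

```sql
-- Define a commit scope requiring confirmation from any two nodes in
-- the (assumed) node group 'dc1' before COMMIT returns.
SELECT bdr.add_commit_scope(
    commit_scope_name := 'two_in_dc1',
    origin_node_group := 'dc1',
    rule := 'ANY 2 (dc1) GROUP COMMIT',
    wait_for_ready := true
);

-- Use the scope for a single transaction (the table is illustrative).
BEGIN;
SET LOCAL bdr.commit_scope = 'two_in_dc1';
INSERT INTO orders (id, total) VALUES (1, 42.00);
COMMIT;  -- returns once two dc1 nodes confirm the transaction
```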
2 changes: 1 addition & 1 deletion product_docs/docs/pgd/5.6/overview/index.mdx
@@ -12,7 +12,7 @@ navigation:

EDB Postgres Distributed (PGD) provides multi-master replication and data distribution with advanced conflict management, data-loss protection, and [throughput up to 5X faster than native logical replication](https://www.enterprisedb.com/blog/performance-improvements-edb-postgres-distributed). It also enables distributed Postgres clusters with high availability up to five 9s.

* [Basic architecture](basic-architecture)
* [Architectural overview](basic-architecture)
* [Architectural options and performance](architecture-and-performance)
* [Comparison with other replication solutions](compared)
