Everyone knows that stateful workloads like PostgreSQL cannot run in Kubernetes. Why do you say the contrary?
An independent research survey commissioned by the Data on Kubernetes Community in September 2021 revealed that half of the respondents run most of their production workloads on Kubernetes. 90% of them believe that Kubernetes is ready for stateful workloads, and 70% of them run databases in production. Databases like Postgres. However, according to them, significant challenges remain, such as the knowledge gap (Kubernetes and Cloud Native, in general, have a steep learning curve) and the quality of Kubernetes operators. The latter is the reason why we believe that an operator like CloudNativePG highly contributes to the success of your project.
For database fanatics like us, a real game-changer has been the introduction of the support for local persistent volumes in Kubernetes 1.14 in April 2019.
CloudNativePG is built on immutable application containers. What does it mean?
According to the microservice architectural pattern, a container is designed to run a single application or process. As a result, such container images are built to run the main application as the single entry point (the so-called PID 1 process).
In Kubernetes terms, the application is referred to as workload. Workloads can be stateless like a web application server or stateful like a database. Mapping this concept to PostgreSQL, an immutable application container is a single "postgres" process that is running and tied to a single and specific version - the one in the immutable container image.
No other processes such as SSH or systemd, or syslog are allowed.
Immutable Application Containers are in contrast with Mutable System Containers, which are still a very common way to interpret and use containers.
Immutable means that a container won't be modified during its life: no updates, no patches, no configuration changes. If you must update the application code or apply a patch, you build a new image and redeploy it. Immutability makes deployments safer and more repeatable.
For more information, please refer to "Why EDB chose immutable application containers".
What does Cloud Native mean?
The Cloud Native Computing Foundation defines the term "Cloud Native". However, since the start of the Cloud Native PostgreSQL/CloudNativePG operator at 2ndQuadrant, the development team has been interpreting Cloud Native as three main concepts:
- An existing, healthy, genuine, and prosperous DevOps culture, founded on people, as well as principles and processes, which enables teams and organizations (as teams of teams) to continuously change so to innovate and accelerate the delivery of outcomes and produce value for the business in safer, more efficient, and more engaging ways
- A microservice architecture that is based on Immutable Application Containers
- A way to manage and orchestrate these containers, such as Kubernetes
Currently, the standard de facto for container orchestration is Kubernetes, which automates the deployment, administration and scalability of Cloud Native Applications.
Another definition of Cloud Native that resonates with us is the one defined by Ibryam and Huß in "Kubernetes Patterns", published by O'Reilly:
Principles, Patterns, Tools to automate containerized microservices at scale
Why should I use an operator instead of running PostgreSQL as a container?
The most basic approach to running PostgreSQL in Kubernetes is to have a pod, which is the smallest unit of deployment in Kubernetes, running a Postgres container with no replica. The volume hosting the Postgres data directory is mounted on the pod, and it usually resides on network storage. In this case, Kubernetes restarts the pod in case of a problem or moves it to another Kubernetes node.
The most sophisticated approach is to run PostgreSQL using an operator. An operator is an extension of the Kubernetes controller and defines how a complex application works in business continuity contexts. The operator pattern is currently state of the art in Kubernetes for this purpose. An operator simulates the work of a human operator in an automated and programmatic way.
Postgres is a complex application, and an operator not only needs to deploy a cluster (the first step), but also properly react after unexpected events. The typical example is that of a failover.
An operator relies on Kubernetes for capabilities like self-healing, scalability, replication, high availability, backup, recovery, updates, access, resource control, storage management, and so on. It also facilitates the integration of a PostgreSQL cluster in the log management and monitoring infrastructure.
CloudNativePG enables the definition of the desired state of a PostgreSQL cluster via declarative configuration. Kubernetes continuously makes sure that the current state of the infrastructure matches the desired one through reconciliation loops initiated by the Kubernetes controller. If the desired state and the actual state don't match, reconciliation loops trigger self-healing procedures. That's where an operator like CloudNativePG comes into play.
You say that CloudNativePG is a fully declarative operator. What do you mean by that?
The easiest way is to explain declarative configuration through an example that highlights the differences with imperative configuration. In an imperative context, the state is defined as a series of tasks to be executed in sequence. So, we can get a three-node PostgreSQL cluster by creating the first instance, configuring the replication, cloning a second instance, and the third one.
In a declarative approach, the state of a system is defined using configuration, namely: there's a PostgreSQL 13 cluster with two replicas. This approach highly simplifies change management operations, and when these are stored in source control systems like Git, it enables the Infrastructure as Code capability. And Kubernetes takes it farther than deployment, as it makes sure that our request is fulfilled at any time.
What are the required skills to run PostgreSQL on Kubernetes?
Running PostgreSQL on Kubernetes requires both PostgreSQL and Kubernetes skills in your DevOps team. The best experience is when database administrators familiarize themselves with Kubernetes core concepts and are able to interact with Kubernetes administrators.
Our advice is for everyone that wants to fully exploit Cloud Native PostgreSQL to acquire the "Certified Kubernetes Administrator (CKA)" status from the CNCF certification program.
Why isn't CloudNativePG using StatefulSets?
CloudNativePG does not rely on StatefulSet
resources, and
instead manages the underlying PVCs directly by leveraging the selected
storage class for dynamic provisioning. Please refer to the
"Custom Pod Controller" section for details and reasons behind
this decision.
What happens to the PostgreSQL clusters when the operator pod dies or it is not available for a certain amount of time?
The CloudNativePG operator, among other things, is responsible for self-healing capabilities. As such, they might not be available during an outage of the operator.
However, assuming that the outage does not affect the nodes where PostgreSQL clusters are running, the database will continue to serve normal operations, through the relevant Kubernetes services. Moreover, the instance manager, which runs inside each PostgreSQL pod will still work, making sure that the database server is up, including accessory services like logging, export of metrics, continuous archiving of WAL files, etc.
To summarize:
an outage of the operator does not necessarily imply a PostgreSQL database outage; it's like running a database without a DBA or system administrator.
Why should I use PostgreSQL?
We believe that PostgreSQL is the equivalent in the database area of what Linux represents in the operating system space. The current latest major version of Postgres is version 14, which ships out of the box:
- native streaming replication, both physical and logical
- continuous hot backup and point in time recovery
- declarative partitioning for horizontal table partitioning, which is a very well-known technique in the database area to improve vertical scalability on a single instance
- extensibility, with extensions like PostGIS for geographical databases
- parallel queries for vertical scalability
- JSON support, unleashing the multi-model hybrid database for both structured and unstructured data queried via standard SQL
And so on ...
How many databases should be hosted in a single PostgreSQL instance?
Our recommendation is to dedicate a single PostgreSQL cluster (intended as primary and multiple standby servers) to a single database, entirely managed by a single microservice application. However, by leveraging the "postgres" superuser, it is possible to create as many users and databases as desired (subject to the available resources).
The reason for this recommendation lies in the Cloud Native concept, based on microservices. In a pure microservice architecture, the microservice itself should own the data it manages exclusively. These could be flat files, queues, key-value stores, or, in our case, a PostgreSQL relational database containing both structured and unstructured data. The general idea is that only the microservice can access the database, including schema management and migrations.
CloudNativePG has been designed to work this way out of the box, by default creating an application user and an application database owned by the aforementioned application user.
Reserving a PostgreSQL instance to a single microservice owned database, enhances:
- resource management: in PostgreSQL, CPU, and memory constrained resources are generally handled at the instance level, not the database level, making it easier to integrate it with Kubernetes resource management policies at the pod level
- physical continuous backup and Point-In-Time-Recovery (PITR): given that PostgreSQL handles continuous backup and recovery at the instance level, having one database per instance simplifies PITR operations, differentiates retention policy management, and increases data protection of backups
- application updates: enable each application to decide their update policies without impacting other databases owned by different applications
- database updates: each application can decide which PostgreSQL version to use, and independently, when to upgrade to a different major version of PostgreSQL and at what conditions (e.g., cutover time)
Is there an upper limit in database size for not considering Kubernetes?
No, as Kubernetes is no different from virtual machines and bare metal as far as this is regarded. Practically, however, it depends on the available resources of your Kubernetes cluster. Our advice with very large databases (VLDB) is to consider a shared nothing architecture, where a Kubernetes worker node is dedicated to a single Postgres instance, with dedicated storage. We proved that this extreme architectural pattern works when we benchmarked running PostgreSQL on bare metal Kubernetes with local persistent volumes. A current limitation of CloudNativePG, which will be overcome in future releases, is the lack of support for tablespaces so that horizontal partitioning can be easily implemented.
How can I specify a time zone in the PostgreSQL cluster?
PostgreSQL has an extensive support for time zones, as explained in the official documentation:
Although time zones can even be used at session, transaction and even as part
of a query in PostgreSQL, a very common way is to set them up globally. With
CloudNativePG you can configure the cluster level time zone in the
spec.postgresql.parameters
section as in the following example:
apiVersion: postgresql.cnpg.io/v1
kind: Cluster
metadata:
name: pg-italy
spec:
instances: 1
postgresql:
parameters:
timezone: "Europe/Rome"
storage:
size: 1Gi
The time zone can be verified with:
$ kubectl exec -ti pg-italy-1 -c postgres -- psql -x -c "SHOW timezone"
-[ RECORD 1 ]---------
TimeZone | Europe/Rome