Many modern applications use PostgreSQL to store their data efficiently and securely. PostgreSQL is open source, which makes it flexible and customizable, unlike proprietary database systems such as Oracle and Db2.
However, to make PostgreSQL work really well and ensure it's always available, even in emergencies, you need to add some extra tools. Think of these tools as helping hands. In this article, we'll show you how to set up "PostgreSQL High Availability" using an open-source tool called Patroni.
What is High Availability and why is it required?
High Availability, often abbreviated as HA, plays a critical role in ensuring uninterrupted operation of your applications relying on PostgreSQL. Picture this scenario: your application is up and running smoothly, but suddenly, the database server crashes or crucial database files are accidentally deleted. The result? Complete downtime for your application until the database is fully recovered.
To avoid such scenarios in production, we need a setup with two or more in-sync copies of the database and a mechanism that automatically switches from the failed database to a running one. This is referred to as an HA setup. The ultimate aim is to avoid a single point of failure (SPOF) for the whole system.
Why use Patroni?
There are different solutions available for setting up PostgreSQL with HA (a full list of replication solutions is available). Some examples:
- PostgreSQL Automatic Failover (PAF)
- Replication Manager (Repmgr)
You can compare these solutions on factors such as failure detection, failure recovery, automatic failover, alert management, and ease of setup and maintenance. Considering all these points, Patroni is a strong candidate for an HA and DR setup. If you know a little Python, you can also read the code and change it according to your needs. Patroni also provides REST APIs, so you can build automation on top of the existing functionality.
Patroni is an open-source HA template for PostgreSQL, written in Python, which can be deployed easily on Kubernetes or VMs. It can be integrated with ETCD, Consul, or Zookeeper as the consensus store.
It is developed and maintained by Zalando; you can find the source code on GitHub.
The figure below shows the architecture of the complete solution with all components. You can tweak this according to your needs and hardware limitations. Here we have used 8 VMs to avoid a SPOF and achieve High Availability on Postgres.
Software & Hardware used:
Patroni relies on the following components and software tools:
- Distributed Consensus Store (DCS): Patroni requires a DCS, such as ETCD, Consul, or Zookeeper, to store vital configuration data and real-time status information about the nodes. The DCS should run on an odd number (>1) of servers; here we use 3 nodes with minimum configuration.
- Load Balancer (e.g., HAProxy): The load balancer routes incoming connections so that all write traffic goes only to the current master (leader) node. We use two machines with minimum configuration. You can also run a single HAProxy server, but it then becomes a single point of failure.
- PostgreSQL Version 9.5 and Above: Patroni seamlessly integrates with PostgreSQL versions 9.5 and higher, providing advanced features and reliability enhancements. This compatibility ensures that you can leverage the latest capabilities of PostgreSQL while maintaining high availability. Hardware configuration for these nodes is dependent on the database size. For setting up you can start with 2 cores and 8GB RAM.
Deploying three PostgreSQL servers instead of two adds an extra layer of protection, safeguarding against multi-node failures and bolstering system reliability.
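The odd-number recommendation for the DCS follows from majority quorum: a cluster of n members needs floor(n/2) + 1 of them to agree, so going from 3 to 4 members adds a machine without adding fault tolerance. A minimal sketch of that arithmetic (the function names are illustrative and are not part of etcd or Patroni):

```shell
#!/usr/bin/env bash
# Majority quorum arithmetic for a consensus store such as etcd.
# quorum and tolerated_failures are illustrative helper names.

quorum() {
  # Smallest majority of $1 members.
  echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {
  # Members that can be lost while a majority is still reachable.
  echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 1 2 3 4 5; do
  printf 'members=%d quorum=%d tolerated_failures=%d\n' \
    "$n" "$(quorum "$n")" "$(tolerated_failures "$n")"
done
```

Both a 3-member and a 4-member cluster tolerate one failed member, which is why 3 or 5 nodes are the usual choices.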
Hardware used in the Solution:
- ETCD (DCS): 3 VMs. For the DCS you need to choose an odd number of machines.
- HAProxy: 2 VMs
- Postgres + Patroni: 3 VMs
The VM configuration above is given as a reference; you can tweak it according to your needs.
Let's dive in and install a production-grade system. Below is the /etc/hosts file used in the setup.
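As a rough sketch of what such a hosts file could contain: the etcd and pgdb addresses below are taken from the command outputs later in this article, while the two HAProxy addresses are assumptions and must be replaced with your own. The script writes to a scratch directory so nothing on the system is changed until you review the result.

```shell
#!/usr/bin/env bash
# Sketch: write a hosts fragment to a scratch directory for review.
# etcd* and pgdb* addresses match the command outputs later in this article;
# the haproxy* addresses are ASSUMED placeholders.
set -e

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
HOSTS_FRAGMENT="$OUT_DIR/hosts.fragment"

cat > "$HOSTS_FRAGMENT" <<'EOF'
192.168.56.201 etcd1
192.168.56.202 etcd2
192.168.56.203 etcd3
192.168.56.204 pgdb1
192.168.56.205 pgdb2
192.168.56.206 pgdb3
# Assumed addresses: replace with your HAProxy hosts
192.168.56.207 haproxy1
192.168.56.208 haproxy2
EOF

echo "Wrote $HOSTS_FRAGMENT; review it, then append it to /etc/hosts on each VM."
```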
ETCD Installation and Configuration (etcd1, etcd2, etcd3)
- Download installation tar for your architecture from https://github.com/etcd-io/etcd/releases/
- Once downloaded, extract the archive and copy the binaries to /usr/bin
    tar -xvf etcd-v3.5.0-linux-amd64.tar.gz
    cd etcd-v3.5.0-linux-amd64/
    cp etcd etcdctl etcdutl /usr/bin
- Make sure /usr/bin is in your PATH variable. You can run the command below to check that etcd is installed correctly.
    [root@etcd1 ~]# etcd --version
    etcd Version: 3.5.0
    Git SHA: 946a5a6f2
    Go Version: go1.16.3
    Go OS/Arch: linux/amd64
- Repeat the same steps on all three ETCD hosts
Configure and run 3 node ETCD Cluster:
- Create the etcd user and group that the etcd binaries will run as
    groupadd --system etcd
    useradd -s /bin/bash --system -g etcd etcd
- Create two directories (data and configuration)
    sudo mkdir -p /var/lib/etcd/
    sudo mkdir /etc/etcd
    sudo chown -R etcd:etcd /var/lib/etcd/ /etc/etcd
- Log in as the etcd user and create a .bash_profile file with the content below
    export ETCD_NAME=`hostname -s`
    export ETCD_HOST_IP=`hostname -i`
- Create the etcd service in /etc/systemd/system/etcd.service, replacing the IP addresses with those of your machines
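A minimal sketch of such a unit file, rendered by a small script so the member name and IP can be substituted per host. The etcd flags are standard ones; the default name and IP, the cluster token, and the restart settings are assumptions for illustration. The file is written to a scratch directory for review before you copy it to /etc/systemd/system/etcd.service.

```shell
#!/usr/bin/env bash
# Sketch: render an etcd systemd unit for one member into a scratch directory.
# ETCD_NAME / ETCD_HOST_IP mirror the .bash_profile variables; their defaults
# and the cluster token are ASSUMED example values.
set -e

ETCD_NAME="${ETCD_NAME:-etcd1}"
ETCD_HOST_IP="${ETCD_HOST_IP:-192.168.56.201}"
INITIAL_CLUSTER="etcd1=http://192.168.56.201:2380,etcd2=http://192.168.56.202:2380,etcd3=http://192.168.56.203:2380"

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
UNIT_FILE="$OUT_DIR/etcd.service"

# Unquoted heredoc: variables are expanded, "\\" becomes a single backslash.
cat > "$UNIT_FILE" <<EOF
[Unit]
Description=etcd key-value store
After=network-online.target
Wants=network-online.target

[Service]
Type=notify
User=etcd
Group=etcd
ExecStart=/usr/bin/etcd \\
  --name ${ETCD_NAME} \\
  --data-dir /var/lib/etcd \\
  --listen-peer-urls http://${ETCD_HOST_IP}:2380 \\
  --initial-advertise-peer-urls http://${ETCD_HOST_IP}:2380 \\
  --listen-client-urls http://${ETCD_HOST_IP}:2379,http://127.0.0.1:2379 \\
  --advertise-client-urls http://${ETCD_HOST_IP}:2379 \\
  --initial-cluster-token etcd-cluster-1 \\
  --initial-cluster ${INITIAL_CLUSTER} \\
  --initial-cluster-state new
Restart=always
RestartSec=10s

[Install]
WantedBy=multi-user.target
EOF

echo "Wrote $UNIT_FILE"
```

Run it once per host with that host's values, for example `ETCD_NAME=etcd2 ETCD_HOST_IP=192.168.56.202`.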
- Once the service is created, enable and start it on all three servers
    sudo systemctl daemon-reload
    sudo systemctl enable etcd
    sudo systemctl start etcd
- You can check that the cluster is working with the following command:
    [root@etcd1 ~]# etcdctl member list --write-out=table
    +------------------+---------+-------+----------------------------+----------------------------+------------+
    |        ID        | STATUS  | NAME  |         PEER ADDRS         |        CLIENT ADDRS        | IS LEARNER |
    +------------------+---------+-------+----------------------------+----------------------------+------------+
    | a79bc423e9248ab8 | started | etcd2 | http://192.168.56.202:2380 | http://192.168.56.202:2379 |      false |
    | b6820e5fa6807659 | started | etcd3 | http://192.168.56.203:2380 | http://192.168.56.203:2379 |      false |
    | b82998c4f9249433 | started | etcd1 | http://192.168.56.201:2380 | http://192.168.56.201:2379 |      false |
    +------------------+---------+-------+----------------------------+----------------------------+------------+
- To see which member is the leader, check the endpoint status:
etcdctl endpoint status --write-out=table --endpoints=etcd1:2379,etcd2:2379,etcd3:2379
Note: By default, etcd does not enable the v2 API. If Patroni fails to start with an API error, add the --enable-v2 flag to the etcd service.
Patroni and Postgres Installation (pgdb1, pgdb2, pgdb3):
- The script below can be used to install Postgres on a Debian-based system (such as Ubuntu); it installs the latest PostgreSQL version available for that release
    sudo sh -c 'echo "deb https://apt.postgresql.org/pub/repos/apt $(lsb_release -cs)-pgdg main" > /etc/apt/sources.list.d/pgdg.list'
    wget --quiet -O - https://www.postgresql.org/media/keys/ACCC4CF8.asc | sudo apt-key add -
    sudo apt-get update
    sudo apt-get -y install postgresql
- Install dependencies required for Patroni to work
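- Install the Patroni package (the command below is for RHEL-family systems; on Debian or Ubuntu, install the patroni package with apt instead)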
dnf install patroni
- Install the extra Python package required for connecting to etcd
pip3 install python-etcd
- Enable Patroni service
systemctl enable patroni
- Create the configuration file and required directories for Patroni:
    mkdir -p /etc/patroni/logs/   # directory to store logs
    chmod 777 /etc/patroni/logs
    touch /etc/patroni/patroni.yml
- Create config file for patroni as below (/etc/patroni/patroni.yml)
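A minimal sketch of what such a patroni.yml might look like, rendered per node by a small script. The keys are standard Patroni settings, and the scope name, REST API port 8008, and log directory match what this article uses elsewhere; the node defaults, passwords, network range, data directory, and bin_dir are assumptions that you must adapt. The file is written to a scratch directory for review before you copy it to /etc/patroni/patroni.yml.

```shell
#!/usr/bin/env bash
# Sketch: render a patroni.yml for one node into a scratch directory.
# NODE_NAME / NODE_IP defaults, all passwords, the pg_hba network range and
# the PostgreSQL paths are ASSUMED placeholders.
set -e

NODE_NAME="${NODE_NAME:-pgdb1}"
NODE_IP="${NODE_IP:-192.168.56.204}"
OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
PATRONI_YML="$OUT_DIR/patroni.yml"

cat > "$PATRONI_YML" <<EOF
scope: bootvar_cluster
name: ${NODE_NAME}

log:
  dir: /etc/patroni/logs
  level: INFO
  traceback_level: ERROR

restapi:
  listen: 0.0.0.0:8008
  connect_address: ${NODE_IP}:8008

etcd:
  hosts: etcd1:2379,etcd2:2379,etcd3:2379

bootstrap:
  dcs:
    ttl: 30
    loop_wait: 10
    retry_timeout: 10
    maximum_lag_on_failover: 1048576
    postgresql:
      use_pg_rewind: true
  initdb:
    - encoding: UTF8
    - data-checksums

postgresql:
  listen: 0.0.0.0:5432
  connect_address: ${NODE_IP}:5432
  data_dir: /var/lib/postgresql/data    # assumed path
  bin_dir: /usr/lib/postgresql/16/bin   # assumed; depends on installed version
  pg_hba:
    - host replication replicator 192.168.56.0/24 md5
    - host all all 0.0.0.0/0 md5
  authentication:
    superuser:
      username: postgres
      password: CHANGE_ME_SUPERUSER
    replication:
      username: replicator
      password: CHANGE_ME_REPLICATION
EOF

echo "Wrote $PATRONI_YML"
```

Run it once per node, for example `NODE_NAME=pgdb2 NODE_IP=192.168.56.205`. Patroni also offers an `etcd3` section as an alternative to `etcd` for clusters where the v2 API is not enabled; whether that fits depends on your Patroni version.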
- Start Patroni
service patroni start
- Repeat the same procedure on all three nodes. If you run into issues, set log.level and log.traceback_level to DEBUG
- Once all nodes are up and running, you can check the status of the Patroni cluster using the patronictl utility.
    patronictl -c /etc/patroni/patroni.yml list
    + Cluster: bootvar_cluster (6974438395379920074) --+-----------+
    | Member | Host           | Role    | State   | TL | Lag in MB |
    +--------+----------------+---------+---------+----+-----------+
    | pgdb1  | 192.168.56.204 | Leader  | running | 10 |           |
    | pgdb2  | 192.168.56.205 | Replica | running | 10 |         0 |
    | pgdb3  | 192.168.56.206 | Replica | running | 10 |         0 |
    +--------+----------------+---------+---------+----+-----------+
If you are not familiar with patronictl, check our guide on patroni commands
The Patroni cluster is now ready to use. You can start experimenting with replication and failover tests.
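For failover tests it helps to record which member is the leader before and after each step. The helper below is a small sketch that extracts the leader from the table printed by patronictl; the function name is illustrative, and it assumes the default table layout in which Role is the third column.

```shell
#!/usr/bin/env bash
# Sketch: extract the current leader from `patronictl list` table output.
# current_leader is an illustrative helper name; it ASSUMES the default
# pretty-table layout where Role is the third column.

current_leader() {
  # Reads table text on stdin, prints the member whose role is Leader.
  awk -F'|' '$4 ~ /Leader/ { gsub(/ /, "", $2); print $2 }'
}

# Demo on rows copied from the cluster status output; on a real node, pipe in:
#   patronictl -c /etc/patroni/patroni.yml list | current_leader
sample='| Member | Host           | Role    | State   | TL | Lag in MB |
| pgdb1  | 192.168.56.204 | Leader  | running | 10 |           |
| pgdb2  | 192.168.56.205 | Replica | running | 10 |         0 |'

printf '%s\n' "$sample" | current_leader
```

A simple test is to note the leader, stop the Patroni service on that node, and run the helper again from another node to confirm that a replica was promoted.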
Next, we need to set up a load balancer that points to the active (Leader) Postgres database. For this you need two HAProxy servers, or, if you are running in the cloud, you can use the load balancers offered by your cloud provider.
Install load balancer (HAProxy - haproxy1, haproxy2):
- Install HAProxy on both servers:
apt install haproxy
- Configure the haproxy.cfg file to redirect all traffic to the active Postgres leader.
Note: HAProxy checks port 8008 (the Patroni REST API port configured in Patroni) on each pgdb server. Only the node that returns HTTP status 200, which is the leader, receives traffic.
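A minimal sketch of such a haproxy.cfg, modeled on the commonly used Patroni pattern of an HTTP health check against port 8008. The stats port 7000 and the PostgreSQL port 5432 match the ones used in this article; the timeouts and maxconn values are assumptions to tune for your workload. The file is written to a scratch directory for review before you copy it to /etc/haproxy/haproxy.cfg.

```shell
#!/usr/bin/env bash
# Sketch: write a haproxy.cfg that routes port 5432 to the Patroni leader.
# Timeouts and maxconn values are ASSUMED starting points.
set -e

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
HAPROXY_CFG="$OUT_DIR/haproxy.cfg"

cat > "$HAPROXY_CFG" <<'EOF'
global
    maxconn 100

defaults
    log global
    mode tcp
    retries 2
    timeout client 30m
    timeout connect 4s
    timeout server 30m
    timeout check 5s

# Statistics page, reachable at http://<haproxy-host>:7000
listen stats
    mode http
    bind *:7000
    stats enable
    stats uri /

# Only the node whose Patroni REST API answers 200 (the leader) is marked up
listen postgres
    bind *:5432
    option httpchk
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pgdb1 pgdb1:5432 maxconn 100 check port 8008
    server pgdb2 pgdb2:5432 maxconn 100 check port 8008
    server pgdb3 pgdb3:5432 maxconn 100 check port 8008
EOF

echo "Wrote $HAPROXY_CFG"
```

Before restarting the service, you can validate the copied file with `haproxy -c -f /etc/haproxy/haproxy.cfg`.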
- Start haproxy on both nodes
service haproxy start
- Once HAProxy is started, you can check its status by opening the URL http://haproxy1:7000
There you can see that all connections to haproxy:5432 are redirected to pgdb1:5432; you can verify whether pgdb1 is in fact the leader.
Now try connecting to the cluster through the HAProxy host; the connection should be redirected to the leader.
You can now run some failover tests and then hand the setup over to the application team.
Application side Configuration:
Since we have two HAProxy servers, the application should be configured to point to both of them and submit each request to whichever server is available. If the application does not support this, you need to set up a virtual IP that points to the available HAProxy server.
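For clients built on libpq (PostgreSQL 10 or later client libraries), one way to point at both HAProxy servers is a multi-host connection URI, where the hosts are tried in order until one accepts the connection. The sketch below only builds and prints such a URI; the user and database names are assumed placeholders, and drivers that are not based on libpq may use a different syntax.

```shell
#!/usr/bin/env bash
# Sketch: build a libpq multi-host connection URI listing both HAProxy nodes.
# DB_USER and DB_NAME are ASSUMED placeholders.

DB_USER="${DB_USER:-app_user}"
DB_NAME="${DB_NAME:-app_db}"
HAPROXY_HOSTS="haproxy1:5432,haproxy2:5432"

# connect_timeout bounds the wait per host before the next one is tried.
CONN_URI="postgresql://${DB_USER}@${HAPROXY_HOSTS}/${DB_NAME}?connect_timeout=5"
echo "$CONN_URI"

# Example use with psql:
#   psql "$CONN_URI" -c 'SELECT pg_is_in_recovery();'
```

When the connection lands on the leader, `pg_is_in_recovery()` returns false.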
You can hand over the details below to the application team:
Read-only Postgres applications, such as reporting, analysis, and dashboards, can use the standby Postgres nodes. To support this type of application, you need to create an HAProxy listener on a different port.
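A sketch of such a read-only listener, using the Patroni REST API `/replica` endpoint, which answers 200 only on a running replica. The listener name and port 5001 are assumptions; the fragment is written to a scratch directory so you can review it and then append it to haproxy.cfg on both HAProxy servers.

```shell
#!/usr/bin/env bash
# Sketch: write an HAProxy listener that balances read-only traffic across
# replicas. The listener name and port 5001 are ASSUMED values.
set -e

OUT_DIR="${OUT_DIR:-$(mktemp -d)}"
RO_FRAGMENT="$OUT_DIR/haproxy-readonly.cfg"

cat > "$RO_FRAGMENT" <<'EOF'
# Only nodes whose Patroni REST API answers 200 on /replica are marked up
listen postgres_readonly
    bind *:5001
    mode tcp
    balance roundrobin
    option httpchk GET /replica
    http-check expect status 200
    default-server inter 3s fall 3 rise 2 on-marked-down shutdown-sessions
    server pgdb1 pgdb1:5432 maxconn 100 check port 8008
    server pgdb2 pgdb2:5432 maxconn 100 check port 8008
    server pgdb3 pgdb3:5432 maxconn 100 check port 8008
EOF

echo "Wrote $RO_FRAGMENT"
```

All three nodes are listed on purpose: after a failover the old leader can rejoin as a replica and will start passing the `/replica` check without any configuration change.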
Things to consider when using Patroni:
- You need to administer PostgreSQL through Patroni, for example when changing database parameters or starting and stopping the database.
- Sometimes you need to write your own automation for functionality that Patroni does not provide out of the box.
- Patroni is an open-source project and does not come with enterprise support, so you depend on the open-source community for any unforeseen issues or bugs. There are, however, managed services available for Patroni.