5 min read

Introduction to PostgreSQL Logical Replication: A Beginner's Guide

Learn everything you need to know about PostgreSQL logical replication, including how it works, its benefits, and when to use it. A perfect guide for beginners to get started with database replication.

PostgreSQL logical replication is a powerful feature that allows you to replicate data at a finer level of control compared to physical replication. Unlike physical replication, which copies the entire database cluster, logical replication enables replication of specific tables, offering flexibility and efficiency for various use cases like real-time analytics and data warehousing.

In this guide, we’ll cover the fundamentals of logical replication, how it differs from physical replication, and how to set it up step-by-step.

Table of contents

What is Logical Replication?

Logical replication is a method of streaming data changes between PostgreSQL databases at the table level. It uses a publish-subscribe model:

  • Publisher: The source database that sends the changes.
  • Subscriber: The destination database that applies the changes.

With logical replication, you can replicate only the required data instead of the entire database, making it ideal for heterogeneous replication or consolidating data from multiple sources.

Logical Replication vs Physical Replication

FeatureLogical ReplicationPhysical Replication
Replication LevelTable-specific replicationEntire database cluster
Use CasePartial data replication, analyticsHigh Availability, disaster recovery
FlexibilityHigh (supports custom setups)Limited
Cross-VersionSupportedNot supported

When to Use Logical Replication

  1. Data Analytics: Replicate specific tables for real-time reporting.
  2. Data Distribution: Share data with multiple subscribers or replicas.
  3. Heterogeneous Replication: Sync data between different PostgreSQL versions.
  4. Selective Replication: Target specific tables for replication.
Step-by-Step Guide: Configuring PostgreSQL HA with Patroni
Learn to configure PostgreSQL HA with Patroni, ETCD, and HAProxy - detailed step-by-step guide.

Before Setting up Replication

Before enabling logical replication in PostgreSQL, it’s essential to prepare your environment to handle the changes effectively. Logical replication requires certain configurations that can impact system performance and storage. This section outlines the key considerations and steps you should take before setting up replication.

  1. Monitor and Plan for Increased WAL Usage

Logical replication relies on Write-Ahead Logging (WAL) to track and propagate changes. When you set wal_level to logical, the volume of WAL logs increases, potentially consuming significant disk space.

  1. Evaluate Network Bandwidth

Logical replication streams data changes over the network between the publisher and subscriber. Insufficient bandwidth may cause delays in replication and increase latency.

  1. Verify Resource Availability

Logical replication may increase CPU and memory usage on both the publisher and subscriber due to the additional processes handling replication.

Securing Your PostgreSQL Database with SSL: A Step-by-Step Guide
SSL is required for a secure connection between the client and the server so confidential data cannot be intercepted by any middle-man

Setting Up Logical Replication

This section provides a step-by-step guide to configure logical replication, including commands and parameter settings to ensure a seamless setup.

Step 1: Enable Logical Replication

Logical replication requires specific parameters to be configured in the postgresql.conf file.

  1. Open the PostgreSQL configuration file:
sudo nano /var/lib/pgsql/data/postgresql.conf
  1. Modify or add the following parameters:
wal_level = logical
max_replication_slots = 4
max_wal_senders = 4
  1. Restart PostgreSQL to apply changes:
sudo systemctl restart postgresql

Step 2: Configure the Publisher

  1. Allow replication connections in pg_hba.conf:
    Open the pg_hba.conf file:
sudo nano /var/lib/pgsql/data/pg_hba.conf

Add an entry for the subscriber:

host replication replicator 192.168.1.10/32 md5
  1. Create a replication role:
CREATE ROLE replicator WITH REPLICATION PASSWORD 'password' LOGIN;
  1. Create a publication:
CREATE PUBLICATION my_publication FOR TABLE employees, orders;

To set replication for all tables:

CREATE PUBLICATION my_publication FOR ALL TABLES;

Step 3: Export Database Structure

We can use pg_dump to export the structure of the tables included in the publication:

pg_dump -h <publisher_host> -U <publisher_user> -d <publisher_db> -t employees -t orders --schema-only -f table_structure.sql
  • Replace <publisher_host>, <publisher_user>, and <publisher_db> with the publisher's connection details.
  • -t employees -t orders: Specify the tables to export.
  • --schema-only: Exports only the table structure, not the data.
  • -f table_structure.sql: Outputs the structure to a file.

Step 4: Import structure in subscriber

Use pg_restore or psql to import the table structure into the subscriber database:

psql -h <subscriber_host> -U <subscriber_user> -d <subscriber_db> -f table_structure.sql

Replace <subscriber_host>, <subscriber_user>, and <subscriber_db> with the subscriber's connection details.

Step 5: Configure the Subscriber

  1. Create table structure for all the tables needed, preferred method for creating these table is to have pg_dump and pg_restore for the structure
  2. Create a subscription on the subscriber database:
CREATE SUBSCRIPTION my_subscription
CONNECTION 'host=192.168.1.20 port=5432 dbname=mydb user=replicator password=password'
PUBLICATION my_publication;
  • my_subscription: Name of the subscription.
  • CONNECTION: Connection string to the publisher.
  • PUBLICATION my_publication: Name of the publication to subscribe to.
  1. Verify the subscription:
SELECT * FROM pg_stat_subscription;

Step 6: Verify Logical Replication

Insert sample data into the employees table on the publisher:

INSERT INTO employees (id, name, department) VALUES (1, 'Alice', 'HR');

Check the data on the subscriber database:

SELECT * FROM employees;

The data should appear in the subscriber database.

Important Queries for Managing Logical Replication

Below queries are important when you want to manage or check current status of the replication in the database.

  1. List Publications:
SELECT * FROM pg_publication;
  1. List Subscriptions:
SELECT * FROM pg_subscription;
  1. Monitor Replication Slots:
SELECT * FROM pg_replication_slots;
E-Book: Patroni Administration & Commands
Master Patroni Commands and administration using this completely free ebook!

Patroni E-Book

Cleanup Replication

After you are done with the replication testing or if you do not want to keep streaming the replication you can use below commands to cleanup it

  1. Drop a Subscription:
DROP SUBSCRIPTION my_subscription;
  1. Drop a Publication:
DROP PUBLICATION my_publication;

Best Practices for Logical Replication

  1. Monitor replication lag and adjust max_wal_senders as needed.
  2. Ensure subscriber tables are initialized before creating a subscription.

Conclusion

Logical replication in PostgreSQL is a versatile tool that gives you granular control over data replication. Whether you're distributing data across regions or setting up real-time analytics, it provides a flexible and efficient solution. With this guide, you're now ready to set up and manage logical replication in your PostgreSQL environment.

For more tutorials and insights into PostgreSQL, stay tuned to our blog!

References:

19.6. Replication
19.6.&nbsp;Replication # 19.6.1. Sending Servers 19.6.2. Primary Server 19.6.3. Standby Servers 19.6.4. Subscribers These settings control the behavior of the …
Logical Replication in PostgreSQL Explained
This post discusses logical replication in PostgreSQL. It reviews the differences between physical or binary replication and logical or transactional replication.