Understanding Streaming Replication in PostgreSQL

Introduction:
PostgreSQL, an advanced open-source database management system, offers various replication features to ensure data availability and redundancy. One of its key replication techniques is streaming replication, which replicates data from a primary server to one or more standby servers in either synchronous or asynchronous mode. In this blog post, we will delve into the inner workings of streaming replication in PostgreSQL, exploring topics such as its startup process, data transfer mechanism, management of multiple standbys, and failure detection. Let’s dive in!

Streaming Replication: An Overview
Streaming replication is PostgreSQL’s native replication feature. It was introduced in version 9.0 as an asynchronous mechanism, and synchronous replication support was added in version 9.1. It operates on a single-master, multi-standby architecture, where the primary server (also known as the master) continuously sends Write-Ahead Log (WAL) data to the standby servers (also known as the slaves). The standbys then replay the received data, maintaining synchronization with the primary.
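
As a concrete starting point, here is a minimal sketch of setting up a streaming-replication standby. The host name primary_host, the user repl_user, and the data directory path are hypothetical placeholders, and the -R option (which writes the connection settings automatically) assumes PostgreSQL 12 or later, where standby.signal replaced recovery.conf:

   $ pg_basebackup -h primary_host -U repl_user -D /var/lib/postgresql/standby -R -P
   $ # -R created standby.signal and appended the connection string to postgresql.auto.conf:
   $ #   primary_conninfo = 'host=primary_host user=repl_user'
   $ pg_ctl -D /var/lib/postgresql/standby start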

Synchronous vs. Asynchronous Replication

Prior to version 9.1, PostgreSQL supported only asynchronous replication. Version 9.1 added synchronous replication on top of streaming replication, ensuring that a data change is confirmed by a standby server before the transaction commit completes. Asynchronous replication remains available and is in fact the default mode; synchronous mode trades some commit latency for a stronger durability guarantee.
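
For illustration, a minimal primary-side configuration for synchronous replication might look like the following sketch; the standby name standby1 is a hypothetical placeholder that must match the application_name set in the standby’s primary_conninfo:

   # postgresql.conf on the primary (a sketch)
   synchronous_standby_names = 'standby1'   # standby(s) that must confirm commits
   synchronous_commit = on                  # wait for the standby to flush WAL (default)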

Starting the Streaming Replication
To understand how streaming replication works, let’s explore the processes involved and the startup sequence that establishes the connection between primary and standby servers.

Processes Involved
In streaming replication, three processes collaborate to facilitate data transfer: the walsender process on the primary server, and the walreceiver and startup processes on the standby server. The walsender and walreceiver communicate using a single TCP connection.
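
If you want to observe these processes yourself, pg_stat_activity exposes them through its backend_type column (available since PostgreSQL 10); a sketch, run on each server:

-- On the primary: the walsender serving each standby
testdb=# SELECT pid, backend_type FROM pg_stat_activity
         WHERE backend_type = 'walsender';

-- On a standby: the walreceiver and the startup (recovery) process
testdb=# SELECT pid, backend_type FROM pg_stat_activity
         WHERE backend_type IN ('walreceiver', 'startup');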

Startup Sequence
The startup sequence of streaming replication unfolds as follows:

Fig. 11.1: SR startup sequence

  1. Start primary and standby servers.
  2. The standby server initiates the startup process.
  3. The standby server starts the walreceiver process.
  4. The walreceiver sends a connection request to the primary server, periodically retrying if the primary server is not running.
  5. Upon receiving the connection request, the primary server starts the walsender process, establishing a TCP connection with the walreceiver.
  6. The walreceiver sends the latest Log Sequence Number (LSN) of the standby’s database cluster, initiating a handshaking phase.
  7. If the standby’s LSN is behind the primary’s LSN, the walsender sends the corresponding WAL data from the primary’s WAL segment directory (pg_xlog in version 9.x, renamed to pg_wal in version 10) to the standby. This catch-up phase ensures synchronization between the primary and the standby.
  8. Streaming replication begins, and the walsender process keeps a state corresponding to the working phase: start-up, catch-up, or streaming (you can observe this state with the query shown after this list).
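
To see which phase a given walsender is in, you can query pg_stat_replication on the primary; a minimal sketch (the state column reports values such as startup, catchup, and streaming):

testdb=# SELECT pid, application_name, state FROM pg_stat_replication;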

Communication Between a Primary and a Synchronous Standby
In streaming replication, communication between a primary server and a synchronous standby server plays a vital role. Let’s examine the process of data transmission during a transaction commit:

Fig. 11.2: Streaming Replication’s communication sequence diagram

  1. The backend process on the primary server writes and flushes WAL data to a WAL segment file.
  2. The walsender process sends the written WAL data to the walreceiver process.
  3. The backend process waits for an ACK response from the standby server, acquiring a latch using SyncRepWaitForLSN().
  4. The walreceiver writes the received WAL data into the standby’s WAL segment and sends an ACK response to the walsender.
  5. The walreceiver flushes the WAL data to the segment, sends another ACK response, and notifies the startup process of the updated WAL data.
  6. The startup process replays the written WAL data.
  7. Upon receiving the ACK response that confirms the flush, the walsender releases the latch of the backend process, completing the commit or abort action.

During this process, the primary server receives ACK responses from the walreceiver, which contain information such as the latest written, flushed, and replayed LSNs, along with the timestamp of the response. These responses help the primary server monitor the status of all connected standby servers.
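
You can inspect these reported positions on the primary as well; a sketch using the LSN columns of pg_stat_replication (these column names apply to PostgreSQL 10 and later, where older names such as write_location were renamed):

testdb=# SELECT application_name, sent_lsn, write_lsn, flush_lsn, replay_lsn
         FROM pg_stat_replication;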

Ensuring Standby Server Recovery
What happens if a standby server restarts after a prolonged period in a stopped condition? The behavior depends on the PostgreSQL version:

Version 9.3 or Earlier
In older versions, if the primary’s WAL segments required by the standby have already been recycled, the standby cannot catch up with the primary server. To mitigate this issue, one could set a large value for the wal_keep_segments configuration parameter, reducing the likelihood of this occurring. However, this is merely a stopgap measure.
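
On those older versions, the mitigation is a sketch like the following in postgresql.conf; the value 256 is an arbitrary example (in PostgreSQL 13 this parameter was replaced by wal_keep_size):

   # postgresql.conf on the primary (version 9.3 or earlier; a sketch)
   wal_keep_segments = 256   # retain at least 256 WAL segments (16 MB each)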

Version 9.4 or Later
Starting from version 9.4, PostgreSQL introduced replication slots to address this problem. A replication slot records how far its consumer has progressed, so WAL segment files that have not yet been sent are retained rather than recycled. Replication slots also enhance the flexibility of WAL data sending, particularly for logical replication. For more details, refer to the official documentation.
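
A minimal sketch of putting a physical replication slot to use; the slot name standby1_slot is a hypothetical placeholder, and note that primary_slot_name belongs in postgresql.conf from version 12 onward (earlier versions set it in recovery.conf):

-- On the primary: create a physical replication slot
testdb=# SELECT pg_create_physical_replication_slot('standby1_slot');

   # postgresql.conf on the standby (version 12 or later; a sketch)
   primary_slot_name = 'standby1_slot'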

Handling Failures and Managing Multiple Standby Servers

In this section, we will explore how the primary server behaves in the event of a failure in the synchronous standby server and how multiple standby servers are managed in streaming replication.

Dealing with Synchronous Standby Failure
When a synchronous standby server fails and is unable to return an ACK response, the primary server continues to wait for responses indefinitely. As a result, running transactions cannot commit, and query processing comes to a halt. In other words, all primary server operations practically stop. Streaming replication does not provide an automatic fallback to asynchronous mode through timeout.

To avoid this situation, there are two approaches. One is to use multiple standby servers to enhance system availability. The other is to manually switch from synchronous to asynchronous mode by following these steps:

  1. Set an empty string for the parameter synchronous_standby_names.
   synchronous_standby_names = ''
  2. Execute the pg_ctl command with the reload option.
   postgres> pg_ctl -D $PGDATA reload

The above procedure does not impact connected clients. The primary server continues transaction processing, and all sessions between clients and the respective backend processes are maintained.
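
Alternatively, the same switch can be performed from a psql session; a sketch using ALTER SYSTEM (available since version 9.4) followed by a configuration reload:

testdb=# ALTER SYSTEM SET synchronous_standby_names = '';
testdb=# SELECT pg_reload_conf();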

Managing Multiple Standby Servers
In streaming replication, the primary server assigns sync_priority and sync_state values to all managed standby servers and treats each standby according to these values.

  • sync_priority: It indicates the priority of the standby server in synchronous mode and is a fixed value. Smaller values indicate higher priority, with 0 representing “asynchronous mode.” The priorities of standby servers are assigned in the order listed in the primary’s configuration parameter synchronous_standby_names. For example, in the configuration synchronous_standby_names = 'standby1, standby2', the priorities of standby1 and standby2 are 1 and 2, respectively. Standby servers not listed in this parameter operate in asynchronous mode with a priority of 0.
  • sync_state: It represents the state of the standby server. The state varies depending on the running status of all standby servers and their individual priority. The possible states are as follows:
    • sync: The state of the synchronous standby server with the highest priority among all working standbys (excluding asynchronous servers).
    • potential: The state of a spare synchronous standby server with the second or lower priority among all working standbys (excluding asynchronous servers). If the synchronous standby fails, it will be replaced by the highest-priority standby among the potential ones.
    • async: The state of an asynchronous standby server, which remains fixed. The primary server treats asynchronous standbys similarly to potential standbys, except that their sync_state can never be “sync” or “potential.”

To view the priority and state of the standby servers, you can execute the following query:

testdb=# SELECT application_name AS host, sync_priority, sync_state FROM pg_stat_replication;
   host   | sync_priority | sync_state
----------+---------------+------------
 standby1 |             1 | sync
 standby2 |             2 | potential
(2 rows)

In the above example, standby1 has a sync_priority of 1 and is in the “sync” state, while standby2 has a sync_priority of 2 and is in the “potential” state.

Primary Server’s Management of Multiple Standbys
The primary server only waits for ACK responses from the synchronous standby server. It confirms the writing and flushing of WAL data by the synchronous standby alone. Thus, streaming replication ensures that only the synchronous standby remains in a consistent and synchronous state with the primary.
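
As an aside, exactly how far the primary waits is governed by the synchronous_commit parameter; the following sketch lists the main levels and the guarantee each one provides:

   # postgresql.conf on the primary (a sketch)
   #   remote_write : wait until the standby has written the WAL data
   #   on           : wait until the standby has flushed the WAL data (default)
   #   remote_apply : wait until the standby has replayed the WAL data (9.6 or later)
   synchronous_commit = on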

Figure 11.3 illustrates a scenario where the ACK response from the potential standby is received earlier than that from the synchronous standby. In this case, the primary’s backend process continues to wait for an ACK response from the synchronous standby. Once that response is received, the backend process releases the latch and completes the current transaction processing.

Fig. 11.3: Managing multiple standby servers

In the opposite scenario, where the synchronous standby’s ACK response is received earlier than the potential standby’s, the primary server immediately completes the commit action of the current transaction without confirming whether the potential standby has written and flushed the WAL data.

Handling Standby Server Failures and Failure Detection

In this section, we will examine the behavior of the primary server when a standby server fails and explore the methods used to detect failures in streaming replication.

Behavior When a Standby Server Fails
The primary server’s behavior differs based on the type of standby server failure:

  • Potential or Asynchronous Standby Failure:
    If a potential or asynchronous standby server fails, the primary server terminates the walsender process connected to the failed standby and continues all processing. In other words, the transaction processing of the primary server remains unaffected by the failure of these standby servers.
  • Synchronous Standby Failure:
    When a synchronous standby server fails, the primary server terminates the walsender process connected to the failed standby and replaces it with the highest priority potential standby. Refer to Figure 11.4 for visualization. Unlike the previous scenario, query processing on the primary server will be paused from the point of failure until the replacement of the synchronous standby. Failure detection of standby servers plays a crucial role in increasing the availability of the replication system.

Fig. 11.4: Replacing synchronous standby server

In any case, if one or more standby servers are configured to run in synchronous mode, the primary server described here maintains only one synchronous standby server at a time (PostgreSQL 9.6 and later can also be configured to require multiple synchronous standbys). This synchronous standby server remains in a consistent and synchronous state with the primary at all times.

Detecting Standby Server Failures
Streaming replication employs two common procedures for detecting failures without requiring specialized hardware:

  • Failure Detection of Standby Server Process:
    When a connection drop between the walsender and the walreceiver is detected, the primary server immediately identifies the failure of the standby server or of its walreceiver process. Specifically, when a low-level network function returns an error while writing to or reading from the walreceiver’s socket, the primary server promptly concludes that the standby has failed.
  • Failure Detection of Hardware and Networks:
    If the walreceiver returns no response within the duration set by the wal_sender_timeout parameter (default: 60 seconds), the primary server considers the standby server faulty. In contrast to the previous scenario, confirming the failure can take up to wal_sender_timeout seconds, because a standby suffering a hardware or network failure may be unable to send any response at all (see the configuration sketch after this list).
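
A sketch of tuning this timeout on the primary; the 30-second value is an arbitrary example, and shorter values detect failures faster at the cost of false positives on slow or congested networks:

   # postgresql.conf on the primary (a sketch)
   wal_sender_timeout = 30s   # declare a silent standby faulty after 30 seconds (default: 60s)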

Depending on the nature of the failures, some may be immediately detected while others may experience a time lag between the occurrence of the failure and its detection. It’s important to note that if a synchronous standby server encounters the latter type of failure, all transaction processing on the primary server will be halted until the failure is detected, even if multiple potential standby servers are operational.

Conclusion
In this blog post, we explored the inner workings of streaming replication in PostgreSQL: how the replication processes start up, how WAL data flows between the primary and its standbys, how multiple standby servers are managed, and how the primary behaves when a standby fails, along with the methods used to detect those failures. By understanding these aspects, you can effectively handle standby server failures and ensure the availability and reliability of your PostgreSQL replication setup. In the next blog post, we will delve into more advanced topics related to streaming replication. Stay tuned!
