PostgreSQL Point-in-time Recovery (Incremental Backup)

PostgreSQL “Point-in-time Recovery” (PITR) also called as incremental database backup , online backup or may be archive backup. At all times, PostgreSQL maintains a write ahead log (WAL) in the pg_wal/ subdirectory of the cluster’s data directory. The log records every change made to the database’s data files. This log exists primarily for crash-safety purposes: if the system crashes, the database can be restored to consistency by “replaying” the log entries made since the last checkpoint. However, the existence of the log makes it possible to use a third strategy for backing up databases: we can combine a file-system-level backup with backup of the WAL files. If recovery is needed, we restore the file system backup and then replay from the backed-up WAL files to bring the system to a current state.postgresqlPITR

Prerequisites

This assumes that hot-backups are enabled in PostgreSQL. Ensure the following are setup in postgresql.conf on the box where the backups are located

# create directory for archive logs
mkdir /pg_log_archive
chown -R postgres:postgres /pg_log_archive/

We’ll create a simple directory and make sure it’s owned by ‘postgres’ user. Once this is done, we can start editing the PostgreSQL configuration.

Setting up WAL archiving

PostgreSQL stores all modifications in form of Write-Ahead Logs (WAL). These files can be used for replication, but they can also be used for PITR.

wal_level = replica
archive_mode = on
archive_command = 'test ! -f /pg_log_archive/%f && cp %p /pg_log_archive/%f'
max_wal_senders = 5

The settings above have to be configured – you want to enable archiving (archive_mode = on), you want also to set wal_level to replica. The wal_level determines how much information is written to the WAL. The default value is replica, which writes enough data to support WAL archiving and replication, including running read-only queries on a standby server.In releases prior to 9.6, this parameter also allowed the values archive and hot_standby. These are still accepted but mapped to replica.

Once we are done with those changes, we need to restart PostgreSQL.

At this point, you should see WAL files appearing in your target directory. Please keep in mind that the process is triggered after a WAL file rotation – if you have very low traffic, it may take a while before WAL archiving will be started.

Create a database with some data.

pic1

For force archiving the logs use below command

psql -c "select pg_switch_wal();     # pg_switch_xlog(); for versions < 10

pic2

pg_wal is a PostgreSQL log file folder that use to store all data history records.Every log file can handle around 16M data, when it excess this limit, it will automatically create a new log file, filename is follow 0-9 and A-Z

Summary of PostgreSQL Backup Steps

  1. Modify postgresql.conf to support archive log
  2. Make a base backup (full database backup)
  3. Backup base backup to remote storage.
  4. Backup WAL (archive log files) to remote storage (continuous process)

Summary of PostgreSQL Point-in-time Recovery Steps

  1. Extract files from base backup
  2. Copy files from pg_wal folder
  3. Create recovery.conf file
  4. Start Recover

Create dummy tables and records

Table testPITR2 created at 19:17:16

pic3.JPG

Create a full Database backup – base backup

The next step will be to take a base backup that we’ll use as a base for our point-in-time restore. You can take it using pg_basebackup:

# pg_basebackup  -l "Backup created at $(hostname) on $(date)" --progress  -Ft  -D /postgresbackup/

For more information on pg_basebackup click here

pic4

Refer the backup_label to get information on base backup

pic5.JPG

-bash-4.2$ pwd
/var/lib/pgsql/10/data/pg_wal
-bash-4.2$ ls
000000010000000000000018                  00000001000000000000001D  000000010000000000000023  000000010000000000000029  00000001000000000000002F
000000010000000000000019                  00000001000000000000001E  000000010000000000000024  00000001000000000000002A  000000010000000000000030
00000001000000000000001A                  00000001000000000000001F  000000010000000000000025  00000001000000000000002B  000000010000000000000031
00000001000000000000001B                  000000010000000000000020  000000010000000000000026  00000001000000000000002C  000000010000000000000032
00000001000000000000001C                  000000010000000000000021  000000010000000000000027  00000001000000000000002D  archive_status
00000001000000000000001C.00000028.backup  000000010000000000000022  000000010000000000000028  00000001000000000000002E
-bash-4.2$ cat 00000001000000000000001C.00000028.backup
START WAL LOCATION: 0/1C000028 (file 00000001000000000000001C)
STOP WAL LOCATION: 0/1C000130 (file 00000001000000000000001C)
CHECKPOINT LOCATION: 0/1C000060
BACKUP METHOD: streamed
BACKUP FROM: master
START TIME: 2018-11-26 19:29:08 UTC
LABEL: Backup created at ip-172-31-32-96.ec2.internal on Mon Nov 26 19:29:07 UTC 2018
STOP TIME: 2018-11-26 19:29:13 UTC

 

 Table testPITR3 & testPITR4 created at 19:43:51  & 19:44:49 

pic7

Disaster Stimulation

pic8.JPG

Recovery Process

Following are the backups and Wal file data we require.

a) Base backup file – pgdatabk20081125.tar
b) Log files hanv’t archive yet – all files under Pg_wal folder
c) WALs – all archive files under wals folder (may be a remote storage in real environment)

  1. Rename /var/lib/pgsql/10/data to /var/lib/pgsql/10/data.old .Assume database file in data folder was damaged due to disaster we created just now, we need to create a fresh database later.
  2. Unzip / extract files from base.tar and pg_wal.tar, create a new data directory under /var/lib/pgsql/10 . Move all extracted files from from base.tar and pg_wal.tar /var/lib/pgsql/10/data and /var/lib/pgsql/10/data/pg_wal respectively
# pwd
/var/lib/pgsql/10
# mkdir data
# chown -R postgres:postgres data
# chmod 700 data/
# tar xvf /postgresbackup/base.tar -C /var/lib/pgsql/10/data/
# tar xvf /postgresbackup/pg_wal.tar -C /var/lib/pgsql/10/data/pg_wal/

Start the database Instance

Table testPITR2 created at 19:17:16 is restored

pic9

3. Copy log files from pg_wal folder. Some log files still located in /var/lib/pgsql/10/data.old pg_wal folder (those log files haven’t archive yet) during disaster happening, we need to copy the log file back and recover it as much as possible.

# cp /var/lib/pgsql/10/data.old/pg_wal/0* /var/lib/pgsql/10/data/pg_wal/

First of all, you need to figure out the point at which you should restore your data. Ideally, you’ll be able to give an exact time. Other options are: a named restore point (pg_create_restore_point()), transaction ID or WAL location LSN. You can try to use the pg_waldump utility to parse WAL files and print them in human-readable format, but it’s actually far from being human-readable so it can still be tricky to pinpoint the exact location.

4.  Create a recovery.conf file and put it under /var/lib/pgsql/10/data

Now create a recovery command file recovery.conf in the data directory. You might also want to temporarily modify pg_hba.conf to prevent ordinary users from connecting until you are sure the recovery was successful.

Sample recovery.conf file can be found at /usr/pgsql-10/share/recovery.conf.sample

# systemctl stop postgresql-10.service
#  vi /var/lib/pgsql/10/data/recovery.conf

Create following content in recovery.conf

restore_command = 'cp /pg_log_archive/%f %p'
recovery_target_time = '2018-11-26 19:44:49'

recovery_target_time is the time we need to recover to. Omit this setting will make PostgreSQL recover as much as it can, it may recover all changes.

Make sure that this config file permissions are properly set

#  chown postgres:postgres /var/lib/pgsql/10/data/recovery.conf

Also Make sure that all files under /var/lib/pgsql/10/data/pg_wal/ have correct permissions/ownership of postgres user and group.

In the example above, we defined a restore command (simple cp from our /pg_log_archive directory into PostgreSQL pg_wal directory). We also should decide where to stop – we decided to use recovery_target_time. We can also use recovery_target_lsn , recovery_target_name & recovery_target_xid .

Above recovery.conf file will make PostgreSQL take the archive log files from /pg_log_archive/ folder and recover the data changes until 2018-11-26 19:44:49 (table testPITR4created).

5.  Start database and verify the log file and data

pic11

If server is in recovery mode. You can switch it to normal operations by running:

# SELECT pg_wal_replay_resume();

You can also define in recovery.conf how PostgreSQL should behave after the recovery using:

recovery_target_action

There are three options:

  • ‘pause’, the default, keeps the server in recovery mode. It is useful if you want to verify that everything is indeed in place before you allow normal traffic to hit the server and modify data.
  • ‘promote’ will end the recovery mode as soon as the recovery itself completes.
  • ‘shutdown’ will stop the PostgreSQL server after the recovery.

This is one time process, after recovery process started and finished, we cant make any recovery changes (like roll forward to another time).If we want to roll forward to another restore time, we need to start whole recovery process again, like extract files from base backup and copy log files.

After recovery process finished, recovery.conf will rename to recovery.done by PostgreSQL to avoid start the recovery process again.

pic10.jpg

After recovery is complete, recovery.conf file will be renamed to recovery.done. Please keep in mind that this won’t happen with:

recovery_target_action = ‘shutdown’

In this case you need to remove or rename the file manually, otherwise the server will keep restarting over and over again.

If restore is successful and everything looks good, allow your users to connect by restoring pg_hba.conf to normal.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google+ photo

You are commenting using your Google+ account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s

Powered by WordPress.com.

Up ↑

%d bloggers like this: