aidanmelen

Reputation: 6584

Barman recover command not copying all WAL files from the Barman server to the standby database server

What is wrong with the recover command that isn't transferring these WAL files?

barman recover --target-time "2017-05-16 16:39:02.235780+00:00" \
--remote-ssh-command "ssh [email protected]" \
main-db latest /var/lib/postgresql/9.4/main

Here is my process...

The main database is shut down (simulated failure), but there is a recent backup, and WAL files have been shipped from the main-db server to the Barman server.

barman@ip-172-30-2-77:~/main-db$ barman check main-db
Server main-db:
    PostgreSQL: FAILED
    directories: OK
    retention policy settings: OK
    backup maximum age: OK (interval provided: 1 day, latest backup age: 1 hour, 3 minutes, 46 seconds)
    compression settings: OK
    failed backups: OK (there are 0 failed backups)
    minimum redundancy requirements: OK (have 1 backups, expected at least 0)
    ssh: OK (PostgreSQL server)
    not in recovery: OK
    archiver errors: OK

On the Barman server, we can see that 7 WAL files (E3 through E9) have been archived since the last barman backup main-db run.

barman@ip-172-30-2-77:~/main-db$ ls -lah
total 22M
drwxrwxr-x 2 barman barman 4.0K May 16 17:08 .
drwxrwxr-x 3 barman barman 4.0K May 16 17:08 ..
-rw------- 1 barman barman  28K May 16 16:39 0000000100000001000000E2
-rw------- 1 barman barman  204 May 16 16:39 0000000100000001000000E2.00000090.backup
-rw------- 1 barman barman  84K May 16 16:44 0000000100000001000000E3
-rw------- 1 barman barman  37K May 16 16:49 0000000100000001000000E4
-rw------- 1 barman barman  30K May 16 16:54 0000000100000001000000E5
-rw------- 1 barman barman 8.9M May 16 16:58 0000000100000001000000E6
-rw------- 1 barman barman 9.1M May 16 16:59 0000000100000001000000E7
-rw------- 1 barman barman 2.6M May 16 17:04 0000000100000001000000E8
-rw------- 1 barman barman 543K May 16 17:07 0000000100000001000000E9
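
As a cross-check on that count, here is a minimal sketch that lists the WAL segments in a directory whose names sort after a given segment. The function name wal_after is hypothetical, and comparing names as plain strings assumes all segments are on the same timeline, as they are here.

```shell
#!/bin/sh
# wal_after DIR END_WAL
# Print the WAL segment names in DIR that sort after END_WAL.
# Hypothetical helper: assumes a single timeline, so string order matches WAL order.
wal_after() {
    dir=$1
    end_wal=$2
    for f in "$dir"/*; do
        name=${f##*/}
        # Keep only plain 24-hex-digit segment names (this skips .backup labels).
        case $name in
            *[!0-9A-F]*) continue ;;
        esac
        [ "${#name}" -eq 24 ] || continue
        [ "$name" != "$end_wal" ] || continue
        # The later of the two names is the last line after a byte-wise sort.
        later=$(printf '%s\n%s\n' "$end_wal" "$name" | LC_ALL=C sort | tail -n 1)
        if [ "$later" = "$name" ]; then
            printf '%s\n' "$name"
        fi
    done
}

# Usage, run from the directory listed above:
#   wal_after . 0000000100000001000000E2 | wc -l
```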

Now I will run the recover command to restore the standby database server by using the latest backup + WAL files, as follows:

barman@ip-172-30-2-77:~/main-db$ barman list-server
main-db - Main DB Server
standby-db - Standby DB Server

barman@ip-172-30-2-77:~/main-db$ barman list-backup main-db
main-db 20170516T163617 - Tue May 16 16:39:02 2017 - Size: 4.0 GiB - WAL Size: 21.1 MiB

barman@ip-172-30-2-77:~/main-db$ barman show-backup main-db 20170516T163617
Backup 20170516T163617:
  Server Name            : main-db
  Status                 : DONE
  PostgreSQL Version     : 90411
  PGDATA directory       : /var/lib/postgresql/9.4/main

  Base backup information:
    Disk usage           : 4.0 GiB (4.0 GiB with WALs)
    Incremental size     : 4.0 GiB (-0.00%)
    Timeline             : 1
    Begin WAL            : 0000000100000001000000E2
    End WAL              : 0000000100000001000000E2
    WAL number           : 1
    WAL compression ratio: 99.83%
    Begin time           : 2017-05-16 16:36:17.369993+00:00
    End time             : 2017-05-16 16:39:02.235780+00:00
    Begin Offset         : 144
    End Offset           : 4912
    Begin XLOG           : 1/E2000090
    End XLOG             : 1/E2001330

  WAL information:
    No of files          : 7
    Disk usage           : 21.1 MiB
    WAL rate             : 16.91/hour
    Compression ratio    : 81.21%
    Last available       : 0000000100000001000000E9

  Catalog information:
    Retention Policy     : VALID
    Previous Backup      : - (this is the oldest base backup)
    Next Backup          : - (this is the latest base backup)

barman@ip-172-30-2-77:~/main-db$ barman recover --target-time "2017-05-16 16:39:02.235780+00:00" \
--remote-ssh-command "ssh [email protected]" \
main-db latest /var/lib/postgresql/9.4/main

Starting remote restore for server main-db using backup 20170516T163617
Destination directory: /var/lib/postgresql/9.4/main/
Doing PITR. Recovery target time: '2017-05-16 16:39:02.235780+00:00'
Copying the base backup.
Copying required WAL segments.
Generating recovery.conf
Your PostgreSQL server has been successfully prepared for recovery!

Now let's look at the PostgreSQL data directory (/var/lib/postgresql/9.4/main) on the standby database server.

postgres@ip-172-30-0-66:~/9.4/main$ pwd
/var/lib/postgresql/9.4/main
postgres@ip-172-30-0-66:~/9.4/main$ ls
backup_label  pg_hba.conf    pg_replslot   pg_tblspc            postgresql.conf
barman_xlog   pg_ident.conf  pg_serial     pg_twophase          postgresql.conf.origin
base          pg_log         pg_snapshots  PG_VERSION           recovery.conf
global        pg_logical     pg_stat       pg_xlog
pg_clog       pg_multixact   pg_stat_tmp   postgresql.auto.conf
pg_dynshmem   pg_notify      pg_subtrans   postgresql.auto.conf.origin
postgres@ip-172-30-0-66:~/9.4/main$ ls barman_xlog/
0000000100000001000000E2  0000000100000001000000E2.00000090.backup

We can see that only E2 was transferred by the recover command; none of the later WAL files (E3 through E9) were copied.

However, I can pull them with barman-cli's barman-wal-restore command, which tells me that they are definitely available on the Barman server. Here is the recovery.conf file I used to restore the WAL files.

root@ip-172-30-0-66:/var/lib/postgresql/9.4/main# cat recovery.conf
# The 'barman-wal-restore' command is provided in the 'barman-cli' package
standby_mode = 'on'
trigger_file = '/var/lib/postgresql/9.4/trigger'
restore_command = 'barman-wal-restore 52.51.36.41 main-db %f %p'

Now we can see that all the WAL files were pulled from the barman server.

root@ip-172-30-0-66:/var/lib/postgresql/9.4/main# ls -lah pg_xlog/
total 129M
drwx------  3 postgres postgres 4.0K May 16 17:59 .
drwx------ 19 postgres postgres 4.0K May 16 17:58 ..
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E2
-rw-rw-r--  1 postgres postgres  324 May 16 17:54 0000000100000001000000E2.00000090.backup
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E3
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E4
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E5
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E6
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E7
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E8
-rw-------  1 postgres postgres  16M May 16 17:56 0000000100000001000000E9
drwxrwxr-x  2 postgres postgres 4.0K May 16 17:56 archive_status
-rw-------  1 postgres postgres    0 May 16 17:59 RECOVERYXLOG

Upvotes: 2

Views: 1800

Answers (1)

Chirag Sharma

Reputation: 31

Your --target-time should be "2017-05-16 17:08" (at or after the time the last WAL file was received, per your ls output), not the end time of the latest backup. With the backup's end time as the target, Barman copies only the WAL needed to reach that point, which is why only E2 was transferred. To recover the WAL files archived after the backup was made, the target time must be later than the backup's end time. If you want to recover to a point between a backup's start and end time, use the previous backup instead. Barman recovers only up to the time you specify in the recover command.
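
As a rough sketch, this is the recover command from the question with only the target time changed. The seconds and the +00:00 offset are assumptions (the ls timestamps are taken to be UTC), and the SSH address is copied as it appears in the question. The run wrapper and the DRY_RUN variable are hypothetical helpers, not part of Barman; by default they only print the command so it can be reviewed first.

```shell
#!/bin/sh
# Assumption: 17:08 from the ls output, interpreted as UTC.
TARGET_TIME="2017-05-16 17:08:00+00:00"
# Copied as-is from the question (the address is redacted there).
SSH_CMD="ssh [email protected]"

# Hypothetical helper: print the command unless DRY_RUN=0 is set.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '%s\n' "$*"
    else
        "$@"
    fi
}

# Same arguments as in the question, with the later target time.
run barman recover \
    --target-time "$TARGET_TIME" \
    --remote-ssh-command "$SSH_CMD" \
    main-db latest /var/lib/postgresql/9.4/main
```

Set DRY_RUN=0 to actually execute it on the Barman server. Afterwards, barman_xlog on the standby should contain the segments up to the target time rather than only E2.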

Upvotes: 3
