Reputation: 6584
What is wrong with the recover command that isn't transferring these WAL files?
barman recover --target-time "2017-05-16 16:39:02.235780+00:00" \
--remote-ssh-command "ssh [email protected]" \
main-db latest /var/lib/postgresql/9.4/main
Here is my process...
The main database is shut down (simulated failure), but there has been a recent backup and WAL files have been shipped from the main-db server to the Barman server.
barman@ip-172-30-2-77:~/main-db$ barman check main-db
Server main-db:
PostgreSQL: FAILED
directories: OK
retention policy settings: OK
backup maximum age: OK (interval provided: 1 day, latest backup age: 1 hour, 3 minutes, 46 seconds)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: OK (have 1 backups, expected at least 0)
ssh: OK (PostgreSQL server)
not in recovery: OK
archiver errors: OK
On the Barman server, we can see that 7 WAL files (E3 through E9) have been archived since the last "barman backup main-db" run.
barman@ip-172-30-2-77:~/main-db$ ls -lah
total 22M
drwxrwxr-x 2 barman barman 4.0K May 16 17:08 .
drwxrwxr-x 3 barman barman 4.0K May 16 17:08 ..
-rw------- 1 barman barman 28K May 16 16:39 0000000100000001000000E2
-rw------- 1 barman barman 204 May 16 16:39 0000000100000001000000E2.00000090.backup
-rw------- 1 barman barman 84K May 16 16:44 0000000100000001000000E3
-rw------- 1 barman barman 37K May 16 16:49 0000000100000001000000E4
-rw------- 1 barman barman 30K May 16 16:54 0000000100000001000000E5
-rw------- 1 barman barman 8.9M May 16 16:58 0000000100000001000000E6
-rw------- 1 barman barman 9.1M May 16 16:59 0000000100000001000000E7
-rw------- 1 barman barman 2.6M May 16 17:04 0000000100000001000000E8
-rw------- 1 barman barman 543K May 16 17:07 0000000100000001000000E9
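When choosing a recovery target that covers every archived segment, it helps to know which WAL file is newest and when it arrived. The following is a minimal sketch, assuming GNU coreutils and a directory laid out like the listing above; `newest_wal` and the `WAL_DIR` variable are hypothetical names, not part of Barman:

```shell
#!/bin/sh
# newest_wal DIR: print the name of the most recently modified WAL segment in DIR.
# WAL segment names are exactly 24 uppercase hex characters, so the
# .backup label files are filtered out by the pattern.
newest_wal() {
    ls -t "$1" | grep -E '^[0-9A-F]{24}$' | head -n 1
}

# Example: report the newest segment's modification time in UTC.
# WAL_DIR is an assumed variable; it defaults to the current directory.
WAL_DIR=${WAL_DIR:-.}
wal=$(newest_wal "$WAL_DIR")
if [ -n "$wal" ]; then
    date -u -r "$WAL_DIR/$wal" '+%Y-%m-%d %H:%M:%S+00:00'
fi
```

The file modification time is only an approximation of the time Barman records for a segment, so a target slightly later than the printed value is the safer choice.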
Now I will run the recover command to restore the standby database server by using the latest backup + WAL files, as follows:
barman@ip-172-30-2-77:~/main-db$ barman list-server
main-db - Main DB Server
standby-db - Standby DB Server
barman@ip-172-30-2-77:~/main-db$ barman list-backup main-db
main-db 20170516T163617 - Tue May 16 16:39:02 2017 - Size: 4.0 GiB - WAL Size: 21.1 MiB
barman@ip-172-30-2-77:~/main-db$ barman show-backup main-db 20170516T163617
Backup 20170516T163617:
Server Name : main-db
Status : DONE
PostgreSQL Version : 90411
PGDATA directory : /var/lib/postgresql/9.4/main
Base backup information:
Disk usage : 4.0 GiB (4.0 GiB with WALs)
Incremental size : 4.0 GiB (-0.00%)
Timeline : 1
Begin WAL : 0000000100000001000000E2
End WAL : 0000000100000001000000E2
WAL number : 1
WAL compression ratio: 99.83%
Begin time : 2017-05-16 16:36:17.369993+00:00
End time : 2017-05-16 16:39:02.235780+00:00
Begin Offset : 144
End Offset : 4912
Begin XLOG : 1/E2000090
End XLOG : 1/E2001330
WAL information:
No of files : 7
Disk usage : 21.1 MiB
WAL rate : 16.91/hour
Compression ratio : 81.21%
Last available : 0000000100000001000000E9
Catalog information:
Retention Policy : VALID
Previous Backup : - (this is the oldest base backup)
Next Backup : - (this is the latest base backup)
barman@ip-172-30-2-77:~/main-db$ barman recover --target-time "2017-05-16 16:39:02.235780+00:00" \
--remote-ssh-command "ssh [email protected]" \
main-db latest /var/lib/postgresql/9.4/main
Starting remote restore for server main-db using backup 20170516T163617
Destination directory: /var/lib/postgresql/9.4/main/
Doing PITR. Recovery target time: '2017-05-16 16:39:02.235780+00:00'
Copying the base backup.
Copying required WAL segments.
Generating recovery.conf
Your PostgreSQL server has been successfully prepared for recovery!
Now let's look at the PostgreSQL data directory (/var/lib/postgresql/9.4/main) on the standby database server.
postgres@ip-172-30-0-66:~/9.4/main$ pwd
/var/lib/postgresql/9.4/main
postgres@ip-172-30-0-66:~/9.4/main$ ls
backup_label pg_hba.conf pg_replslot pg_tblspc postgresql.conf
barman_xlog pg_ident.conf pg_serial pg_twophase postgresql.conf.origin
base pg_log pg_snapshots PG_VERSION recovery.conf
global pg_logical pg_stat pg_xlog
pg_clog pg_multixact pg_stat_tmp postgresql.auto.conf
pg_dynshmem pg_notify pg_subtrans postgresql.auto.conf.origin
postgres@ip-172-30-0-66:~/9.4/main$ ls barman_xlog/
0000000100000001000000E2 0000000100000001000000E2.00000090.backup
We can see that none of the later WAL files (0000000100000001000000E3 through 0000000100000001000000E9) were transferred by the recover command.
However, I can pull them with the barman-wal-restore command from the barman-cli package, which tells me that they are definitely available on the Barman server. Here is the recovery.conf file I used to restore the WAL files.
root@ip-172-30-0-66:/var/lib/postgresql/9.4/main# cat recovery.conf
# The 'barman-wal-restore' command is provided in the 'barman-cli' package
standby_mode = 'on'
trigger_file = '/var/lib/postgresql/9.4/trigger'
restore_command = 'barman-wal-restore 52.51.36.41 main-db %f %p'
Now we can see that all the WAL files were pulled from the barman server.
root@ip-172-30-0-66:/var/lib/postgresql/9.4/main# ls -lah pg_xlog/
total 129M
drwx------ 3 postgres postgres 4.0K May 16 17:59 .
drwx------ 19 postgres postgres 4.0K May 16 17:58 ..
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E2
-rw-rw-r-- 1 postgres postgres 324 May 16 17:54 0000000100000001000000E2.00000090.backup
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E3
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E4
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E5
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E6
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E7
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E8
-rw------- 1 postgres postgres 16M May 16 17:56 0000000100000001000000E9
drwxrwxr-x 2 postgres postgres 4.0K May 16 17:56 archive_status
-rw------- 1 postgres postgres 0 May 16 17:59 RECOVERYXLOG
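The difference between the two restores can be confirmed mechanically by comparing a listing taken on the Barman server with one taken on the standby. This is a minimal sketch, assuming each listing has been saved to a text file with one file name per line; `missing_wals` and the listing file names are hypothetical:

```shell
#!/bin/sh
# missing_wals ARCHIVED RESTORED: print names listed in ARCHIVED but absent
# from RESTORED. grep flags: -F fixed strings, -x whole-line match,
# -v invert the match, -f read the patterns from a file.
missing_wals() {
    grep -Fxv -f "$2" "$1"
}

# Usage (listing file names are placeholders):
#   ls > archived.txt              # in the WAL directory on the Barman server
#   ls barman_xlog > restored.txt  # in the data directory on the standby
#   missing_wals archived.txt restored.txt
```

Run against the listings above, this would be expected to print the segments E3 through E9 for the barman_xlog directory and nothing for pg_xlog.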
Upvotes: 2
Views: 1800
Reputation: 31
Your --target-time should be "2017-05-16 17:08" (the time the last WAL file was received, according to your ls output), not the end time of the latest backup. Barman only recovers up to the time you specify in the recover command, so to replay the WAL files archived after the backup, the target time must be later than the backup's end time. If you want to recover to a point between a backup's start and end time, you should use the previous backup.
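As a sketch of that change, the recover command from the question with a later target would look roughly like the output of the script below. The SSH destination is redacted in the question, so `postgres@standby-host` is a placeholder, and `recover_cmd` is a hypothetical helper that only prints the command instead of running it:

```shell
#!/bin/sh
# Target taken from the suggestion above: just after the last WAL file arrived.
TARGET_TIME="2017-05-16 17:08:00+00:00"
# Placeholder: the real SSH destination is redacted in the question.
STANDBY_SSH="ssh postgres@standby-host"

# recover_cmd: print the barman recover invocation without executing it.
recover_cmd() {
    printf 'barman recover --target-time "%s" --remote-ssh-command "%s" %s %s %s\n' \
        "$TARGET_TIME" "$STANDBY_SSH" main-db latest /var/lib/postgresql/9.4/main
}

recover_cmd
```

As far as I know, omitting --target-time entirely makes barman recover copy all WAL files archived after the backup, so the server replays to the end of the archive.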
Upvotes: 3