For one of our PostgreSQL clusters, the "barman check serverx" command kept reporting an "archiver errors: FAILED (duplicates: xxx)" warning. We also saw the following warnings in the barman log file:
2021-05-18 09:57:02,365 [13875] barman.wal_archiver INFO: Error: 00000001000000630000005B is already present in server serverx. File moved to errors directory.
We realized that our server admin had rebooted serverx a week earlier without notifying us, so the PostgreSQL cluster had not been shut down properly.
While investigating the problem we noticed that every time the serverx PostgreSQL cluster switched a WAL file, it tried to send every WAL file under the pg_wal directory. As a result, duplicate WAL files were sent to barman, which moved them to the "errors" directory. The same thing happened when we switched the WAL file manually with "SELECT pg_switch_wal();".
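To get a feel for how many segments were being rejected, the files in the errors directory can be counted. This is a minimal sketch, not a Barman feature: count_duplicate_wals is a hypothetical helper, and the ".duplicate" suffix is an assumption about how rejected files are named, so adjust the pattern if your files look different.

```shell
# Count files in a Barman errors directory whose name ends in ".duplicate".
# count_duplicate_wals is a hypothetical helper, not a Barman command.
count_duplicate_wals() {
    errors_dir=$1
    # Assumption: rejected duplicate WAL files carry a ".duplicate" suffix.
    find "$errors_dir" -maxdepth 1 -type f -name '*.duplicate' | wc -l | tr -d ' '
}

# Example (path taken from the directory layout used in this post):
# count_duplicate_wals /barman/barman/backup/serverx/errors
```

If the number grows after every "SELECT pg_switch_wal();", the cluster is most likely still resending segments that barman already has.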
Our BARMAN version:
$ barman --version
2.12
Barman by 2ndQuadrant (www.2ndQuadrant.com)
Our PostgreSQL cluster version:
postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)
To fix this issue we had to delete every backup of the serverx PostgreSQL cluster on barman:
barman@barman:/barman/barman/backup$ rm -rf serverx/
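Note that removing the directory also removes every existing backup and archived WAL file for that server. If disk space allows, a more cautious variant is to move the directory aside and delete it only after a new backup has succeeded. The sketch below shows that idea under our own assumptions; archive_server_dir and the ".old-<timestamp>" naming are hypothetical, not something barman provides.

```shell
# Move a server's backup directory aside instead of deleting it.
# archive_server_dir is a hypothetical helper; it prints the new path.
archive_server_dir() {
    server_dir=$1
    stamp=$(date +%Y%m%d%H%M%S)
    # Strip a trailing slash, then append an ".old-<timestamp>" suffix.
    target="${server_dir%/}.old-$stamp"
    mv -- "$server_dir" "$target" && printf '%s\n' "$target"
}

# Example (run from /barman/barman/backup as the barman user):
# archive_server_dir serverx/
```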
Then we set up the serverx backup directory again:
barman@barman:/barman/barman/backup$ mkdir serverx
We manually switched the WAL file on serverx:
SELECT pg_switch_wal();
Then we saw that barman had created its internal backup folders:
barman@barman:/barman/barman/backup/serverx$ ls
base errors incoming streaming wals
After some time, we ran the barman check command:
barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
ssh: OK (PostgreSQL server)
not in recovery: OK
systemid coherence: OK (no system Id stored on disk)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
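In this output "archiver errors" is OK again, and the only failing checks are the two that complain about missing backups, which is expected right after the backups were deleted. To pick out the failing checks from a long report, the output can be filtered. A minimal sketch follows; failed_checks is a hypothetical helper that simply matches the word FAILED and is not part of barman.

```shell
# Print only the failing checks from "barman check" output read on stdin.
# failed_checks is a hypothetical helper, not part of Barman.
failed_checks() {
    # Keep lines containing FAILED and strip any leading indentation.
    grep 'FAILED' | sed 's/^[[:space:]]*//'
}

# Example:
# barman check serverx | failed_checks
```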
Then we took a manual backup:
barman@barman:/barman/barman/backup/serverx$ barman backup serverx
After that everything was normal:
barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
ssh: OK (PostgreSQL server)
not in recovery: OK
systemid coherence: OK (no system Id stored on disk)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK