BARMAN archiver errors: FAILED (duplicates: xxx)

The Barman backup of one of our PostgreSQL clusters kept reporting the warning "archiver errors: FAILED (duplicates: xxx)" from the "barman check serverx" command. We also saw the following messages in the Barman log file:

2021-05-18 09:57:02,365 [13875] barman.wal_archiver INFO: Error: 00000001000000630000005B is already present in server serverx. File moved to errors directory.
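To see how many files have piled up, the errors directory can be counted directly. This is a minimal sketch, not an official Barman tool: the function name count_duplicates is made up for illustration, and it assumes that Barman gives rejected files a ".duplicate" suffix, which is how Barman 2.x appeared to name them on our side.

```shell
#!/bin/sh
# Count WAL files that Barman parked in a server's errors directory
# as duplicates.
# Assumption: duplicate files end with the ".duplicate" suffix.
count_duplicates() {
    errors_dir=$1
    find "$errors_dir" -maxdepth 1 -type f -name '*.duplicate' \
        | wc -l | tr -d ' '
}

# Example, using the server directory that appears later in this post:
# count_duplicates /barman/barman/backup/serverx/errors
```

If the number keeps growing after every WAL switch, the cluster is still resending segments that Barman already has.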

We found out that our server admin had rebooted serverx a week earlier without notifying us, so the PostgreSQL cluster had not been shut down properly.

While investigating the problem, we noticed that every time the PostgreSQL cluster on serverx switched a WAL file, it tried to send every WAL file under the pg_wal directory again. Barman therefore received WAL files it already had and moved them to its "errors" directory. The same thing happened when we switched the WAL file manually with "SELECT pg_switch_wal();".
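One way to check what the cluster itself believes is still unarchived is to look at pg_wal/archive_status: PostgreSQL keeps a "<segment>.ready" file there for each segment waiting for archive_command and renames it to ".done" once the command succeeds. The sketch below only lists those pending segments. The function name pending_wals and the data directory path in the example are hypothetical, and whether leftover .ready files were the actual cause in our case is an assumption, not something we verified.

```shell
#!/bin/sh
# List WAL segments that PostgreSQL still marks as pending for archiving.
# A segment is pending while pg_wal/archive_status/<segment>.ready exists.
pending_wals() {
    pgdata=$1   # PostgreSQL data directory (path is site specific)
    for f in "$pgdata"/pg_wal/archive_status/*.ready; do
        [ -e "$f" ] || continue   # the glob matched nothing
        basename "$f" .ready
    done
}

# Example (hypothetical data directory), run as the postgres OS user:
# pending_wals /var/lib/pgsql/10/data
```

Segment names printed here that are already present in Barman's wals directory would be consistent with the duplicate messages in the log.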

Our BARMAN version:

$ barman --version
2.12
Barman by 2ndQuadrant (www.2ndQuadrant.com)

Our PostgreSQL cluster version:

postgres=# select version();
version
---------------------------------------------------------------------------------------------------------
PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

To fix this issue we had to delete every Barman backup of the serverx PostgreSQL cluster. Note that this removes all existing base backups and archived WAL files for that server:

barman@barman:/barman/barman/backup$ rm -rf serverx/

Then we set up the serverx backup again:

barman@barman:/barman/barman/backup$ mkdir serverx

We manually switched the WAL file on serverx:

SELECT pg_switch_wal();

Then we saw that Barman had created its internal backup folders:

barman@barman:/barman/barman/backup/serverx$ ls
base errors incoming streaming wals

After some time, we ran the barman check command:

barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
ssh: OK (PostgreSQL server)
not in recovery: OK
systemid coherence: OK (no system Id stored on disk)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK
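With this many lines it is easy to miss a failing check, so a small filter can help. This is a rough sketch that assumes the "name: FAILED (...)" output format shown above; failed_checks is a made-up helper name, not a Barman command.

```shell
#!/bin/sh
# Print only the checks that "barman check" reports as FAILED.
# Reads the check output on stdin. Matching on ": FAILED" is an
# assumption based on the output format shown in this post.
failed_checks() {
    grep ': FAILED' | sed 's/^[[:space:]]*//'
}

# Example:
# barman check serverx | failed_checks
```

Run against the output above, it would print only the "backup maximum age" and "minimum redundancy requirements" lines. Both are expected at this point, because no base backup exists yet after recreating the server directory.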

Then we took a manual backup:

barman@barman:/barman/barman/backup/serverx$ barman backup serverx

After that, everything was back to normal:

barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
PostgreSQL: OK
superuser or standard user with backup privileges: OK
wal_level: OK
directories: OK
retention policy settings: OK
backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
compression settings: OK
failed backups: OK (there are 0 failed backups)
minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
ssh: OK (PostgreSQL server)
not in recovery: OK
systemid coherence: OK (no system Id stored on disk)
archive_mode: OK
archive_command: OK
continuous archiving: OK
archiver errors: OK

Bünyamin Balaban - Oracle Blog

Tuesday, 18 May 2021
