Tuesday, 18 May 2021

BARMAN archiver errors: FAILED (duplicates: xxx)

 The Barman backup of one of our PostgreSQL clusters kept reporting "archiver errors: FAILED (duplicates: xxx)" in the output of the "barman check serverx" command. We also saw the following messages in the Barman log file:

2021-05-18 09:57:02,365 [13875] barman.wal_archiver INFO:       Error: 00000001000000630000005B is already present in server serverx. File moved to errors directory.
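
To see how many WAL files are affected, the log can be scanned for this message. The following is a minimal sketch, not a Barman tool: the function name find_duplicate_wals is hypothetical, and the pattern assumes the message wording shown in the log line above, which may differ in other Barman versions.

```python
import re
from collections import Counter

# Assumption: the message has the same wording as the log line quoted above.
DUPLICATE_RE = re.compile(
    r"Error: (?P<wal>[0-9A-F]{24}) is already present in server (?P<server>\S+)\."
)


def find_duplicate_wals(log_lines):
    """Count duplicate-WAL messages per (server, WAL file name)."""
    counts = Counter()
    for line in log_lines:
        match = DUPLICATE_RE.search(line)
        if match:
            counts[(match.group("server"), match.group("wal"))] += 1
    return counts


sample_log = [
    "2021-05-18 09:57:02,365 [13875] barman.wal_archiver INFO:       "
    "Error: 00000001000000630000005B is already present in server serverx. "
    "File moved to errors directory.",
]

for (server, wal), count in find_duplicate_wals(sample_log).items():
    print(server, wal, count)
```

A WAL name that shows up again after every WAL switch matches the repeated resending described in this post.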

 We found out that our server admin had rebooted serverx a week earlier without notifying us, so the PostgreSQL cluster had not been shut down properly.

While investigating the problem we noticed that every time the PostgreSQL cluster on serverx switched to a new WAL file, it tried to send every WAL file under the pg_wal directory. As a result, duplicate WAL files were sent to Barman, which moved them to its "errors" directory. The same thing happened when we switched the WAL file manually with "SELECT pg_switch_wal();".
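
PostgreSQL marks each WAL segment that is still waiting to be archived with a ".ready" file under pg_wal/archive_status, and renames it to ".done" once archive_command succeeds. The sketch below lists the segments still marked as pending. The helper name pending_wal_segments and the example path are hypothetical, and whether stale ".ready" files were the cause in our case is an assumption, not something we verified.

```python
from pathlib import Path


def pending_wal_segments(pg_wal_dir):
    """Return the WAL segment names that have a .ready marker file."""
    status_dir = Path(pg_wal_dir) / "archive_status"
    # Path.stem drops the trailing ".ready" suffix from each marker file name.
    return sorted(marker.stem for marker in status_dir.glob("*.ready"))


if __name__ == "__main__":
    # Hypothetical data directory; adjust to the real location of pg_wal.
    for segment in pending_wal_segments("/var/lib/pgsql/10/data/pg_wal"):
        print(segment)
```

If the same segment names stay in this list after each WAL switch, the archiver is not marking them as done.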



Our BARMAN version:

$ barman --version

2.12

Barman by 2ndQuadrant (www.2ndQuadrant.com)

Our PostgreSQL cluster version:

postgres=# select version();
                                                 version
---------------------------------------------------------------------------------------------------------
 PostgreSQL 10.3 on x86_64-pc-linux-gnu, compiled by gcc (GCC) 4.8.5 20150623 (Red Hat 4.8.5-16), 64-bit
(1 row)

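To fix this issue we had to delete every backup of the serverx PostgreSQL cluster on the Barman server. Note that this removes the whole serverx directory, including its archived WAL files: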

barman@barman:/barman/barman/backup$ rm -rf serverx/


Then we set up the serverx backup directory again:

barman@barman:/barman/barman/backup$ mkdir serverx


We manually switched the WAL file on serverx:

SELECT pg_switch_wal(); 


Then we saw that Barman had created its internal backup folders:

barman@barman:/barman/barman/backup/serverx$ ls

base  errors  incoming  streaming  wals 


After some time, we ran the barman check command:

barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        wal_level: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
        ssh: OK (PostgreSQL server)
        not in recovery: OK
        systemid coherence: OK (no system Id stored on disk)
        archive_mode: OK
        archive_command: OK
        continuous archiving: OK
        archiver errors: OK
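
At this point the only failing items are the two that depend on having at least one backup. To pick out failing items from a long check output, a small parser helps. This is a minimal sketch that assumes the "name: STATUS" line layout shown above; the function name failed_checks is hypothetical and not part of Barman.

```python
def failed_checks(check_output):
    """Map each failing check name to its status text."""
    failures = {}
    for line in check_output.splitlines():
        # Split on the first ": " so details inside parentheses stay intact.
        name, sep, status = line.strip().partition(": ")
        if sep and status.startswith("FAILED"):
            failures[name] = status
    return failures


sample_output = """Server serverx:
        PostgreSQL: OK
        backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
        minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
        archiver errors: OK
"""

for name, status in failed_checks(sample_output).items():
    print(name, "->", status)
```

The sample text is a shortened copy of the output above; in practice the text could come from running "barman check serverx" and capturing its standard output.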


Then we took a manual backup:

barman@barman:/barman/barman/backup/serverx$ barman backup serverx

After that everything was normal:

barman@barman:/barman/barman/backup/serverx$ barman check serverx
Server serverx:
        PostgreSQL: OK
        superuser or standard user with backup privileges: OK
        wal_level: OK
        directories: OK
        retention policy settings: OK
        backup maximum age: FAILED (interval provided: 8 days, latest backup age: No available backups)
        compression settings: OK
        failed backups: OK (there are 0 failed backups)
        minimum redundancy requirements: FAILED (have 0 backups, expected at least 1)
        ssh: OK (PostgreSQL server)
        not in recovery: OK
        systemid coherence: OK (no system Id stored on disk)
        archive_mode: OK
        archive_command: OK
        continuous archiving: OK
        archiver errors: OK
