pg_checksums is a renaming of the tool called pg_verify_checksums which has been introduced in Postgres 11. Version 12 is introducing new options and possibilities which explain the renaming, as the tool has become much more multi-purpose.
First, it is now possible to enable and disable checksums for an offline cluster:
commit: ed308d78379008b2cebca30a986f97f992ee6122 author: Michael Paquier <firstname.lastname@example.org> date: Sat, 23 Mar 2019 08:12:55 +0900 Add options to enable and disable checksums in pg_checksums An offline cluster can now work with more modes in pg_checksums: - --enable enables checksums in a cluster, updating all blocks with a correct checksum, and updating the control file at the end. - --disable disables checksums in a cluster, updating only the control file. - --check is an extra option able to verify checksums for a cluster, and the default used if no mode is specified. When running --enable or --disable, the data folder gets fsync'd for durability, and then it is followed by a control file update and flush to keep the operation consistent should the tool be interrupted, killed or the host unplugged. If no mode is specified in the options, then --check is used for compatibility with older versions of pg_checksums (named pg_verify_checksums in v11 where it was introduced). Author: Michael Banck, Michael Paquier Reviewed-by: Fabien Coelho, Magnus Hagander, Sergei Kornilov Discussion: https://postgr.es/m/20181221201616.GD4974@nighthawk.caipicrew.dd-dns.de
Here is how it works. The tool is able to do three modes now in total:
- --check, the default if nothing is specified and what pg_verify_checksums was already able to do. This mode scans all the relation file blocks, reporting any mismatch.
- --enable, which enables data checksums. This rewrites all the relation file blocks, and finishes the operation by updating the control file. Note that this can be take time depending on the size of the instance, and that the tool has no parallel mode.
- --disables which disables data checksums by only updating the control file.
Hence, taking a cluster which has data checksums disabled, here is how to enable them. First the instance to switch needs to be cleanly shut down:
$ pg_controldata -D /my/data/folder/ | grep state Database cluster state: shut down
And then enabling data checksums is a matter of running this command, where the change gets reflected to the control file:
$ pg_checksums --enable -D /my/data/folder/ Checksum operation completed Files scanned: 1144 Blocks scanned: 3487 pg_checksums: syncing data directory pg_checksums: updating control file Checksums enabled in cluster $ pg_controldata -D /my/data/folder/ | grep checksum Data page checksum version: 1
Repeating the same operation results in a failure (disabling data checksums where they are already disabled share the same fate):
$ pg_checksums --enable -D /my/data/folder/ pg_checksums: error: data checksums are already enabled in cluster
Then disabling checksums can be done like that:
$ pg_checksums --disable -D /my/data/folder/ pg_checksums: syncing data directory pg_checksums: updating control file Checksums disabled in cluster $ pg_checksums --disable -D /my/data/folder/ pg_checksums: error: data checksums are already disabled in cluster $ pg_controldata | grep checksum Data page checksum version: 0
Finally, note that the tool is able to handle failures or interruptions in-between gracefully. For example, if the host in the process enabling data checksums is plugged off, then the data folder will remain in a state where they are disabled as the update of the control file happens last. Hence, the operation can be retried from scratch.
pg_verify_checksums is already a rather powerful tool when it comes to backup validation, but enabling checksums after an upgrade was still a barrier. Using Postgres 10, it is possible to use logical replication with a new instance initialized to have data checksums enabled when using initdb, still this takes time and resources as the initial data copy could take long. Note that if you have a cluster which relies on backup tools doing physical copy of relation blocks, pg_rewind being such an tool, it is possible to finish with a cluster which has checksums enabled still some pages could be broken if these pages come from a cluster having checksums disabled. Hence if switching checksums in a set of Postgres nodes, you should be careful to enable checksums consistently on all nodes at the same time.
Now, as data checksums are only compiled when a session flushes a page to disk or at shared buffer eviction, and because WAL do not need to compile checksums even if a full-page write is taken, enabling checksums with minimum downtime becomes much easier by relying on physical replication (WAL streaming). For example, assuming that no physical copy of relation blocks are done across multiple nodes, one could do the following with a set of two nodes, a primary and a standby:
- Both primary and standby have data checksums disabled, and the goal is to enable data checksums.
- First, stop cleanly the standby, and enable checksums on it with --enable.
- Start the standby, and make it catch up with the primary.
- Stop cleanly the primary.
- Promote the standby and do a failover to it.
- Enable checksums on the previous primary.
- Plug it back to the promoted standby, both instances have now checksums enabled.
On top of that, an option to output the progress of any operation run has been added:
commit: 280e5f14056bf34a0f52320f659fb93acfda0876 author: Michael Paquier <email@example.com> date: Tue, 2 Apr 2019 10:58:07 +0900 Add progress reporting to pg_checksums This adds a new option to pg_checksums called -P/--progress, showing every second some information about the computation state of an operation for --check and --enable (--disable only updates the control file and is quick). This requires a pre-scan of the data folder so as the total size of checksummable items can be calculated, and then it gets compared to the amount processed. Similarly to what is done for pg_rewind and pg_basebackup, the information printed in the progress report consists of the current amount of data computed and the total amount of data to compute. This could be extended later on. Author: Michael Banck, Bernd Helmle Reviewed-by: Fabien Coelho, Michael Paquier Discussion: https://firstname.lastname@example.org
Note that this is valid only for --check and --enable. Reports are every second, as follows (this is a fresh instance):
$ pg_checksums --enable --progress 27/27 MB (100%) computed
This requires an extra scan of the data folder so as it is possible to know the total size of all elements having checksums beforehand, costing some extra resources, but it can be useful for reporting when the operation takes a long time on a large cluster.