Michael Paquier - PostgreSQL committer

  • Home
  • Manuals
  • Presentations
  • Projects
  • Resume
  • Tags
  • About

pg_reorg, reorganize a table without locks

You liked this post or you have a comment? Contact information is available here and sponsoring is here.
21 Sep 2012
Tags: pg_reorg, postgres, postgresql, reorganize, vacuum

pg_reorg is a postgresql module developed and maintained by NTT that allows to redistribute a table without taking locks on it. The code is hosted by pg_foundry here. However, pgfoundry uses CVS :(, so I am also maintaining a fork in github in sync with pgfoundry here.

What pg_reorg can do for you is to reorganize a whole table in the same fashion way as a CLUSTER or a VACUUM FULL, while allowing write operations on the table being reorganized at the same time. No locks are needed.

Once you have downloaded the code, you just need to install it on your server.

cd $CODE_FOLDER
make install

Then install the EXTENSION module (for version upper than 9.1) after connecting to the postgres server.

CREATE EXTENSION pg_reorg;

Then, it is possible to perform several types of operations. CLUSTER reorganization on the table $TABLE.

pg_reorg --dbname $DATABASE -t $TABLE

VACUUM FULL reorganization on the table $TABLE.

pg_reorg --dbname $DATABASE -t $TABLE -n

Reorganization of an entire database.

pg_reorg --dbname $DATABASE

The main limitation of this utility is that table being redistributed needs to have a primary key or a non-null unique key.

Then, a little bit more about the technique it uses to reorganize the table. Basically, a temporary copy of the table to be redistributed is created using a CREATE TABLE AS query. The CTAS query definition is changed depending on the distribution user wants. For example, if user wants a redistribution using a different column (option -o), the CTAS is completed with an ORDER BY clause on the wanted column. The indexes of the new table depend on what the user wants.

Then the following operations are done.

  • creation of triggers to register all the DMLs that occur on the former table to an intermediate log table
  • creation of indexes on the temporary table based on what the user wants (new column index, VACUUM FULL…)
  • Apply the logs registered during the index creation and wait for old transactions to finish
  • Swap the names between the freshly-created table and old table
  • Drop the useless objects: the old table, the old triggers and remaining objects

This functionality is particularly handy when you wish to reorganize a huge table. Performing a VACUUM/CLUSTER on it might take time, and your application might need this table to be accessible in write for a maximum amount of time. So pretty useful, uh?

Search

Social

Github

RSS Feeds

Main

Postgres

Sponsor

Unless otherwise specified, the contents of this website are (C)Copyright Michael Paquier 2010-2025 and are licensed for use under CC BY-NC-ND 4.0.