Postgres building an index concurrently

Question

I'm reading this from the Postgres docs:

Building Indexes Concurrently

... PostgreSQL supports building indexes without locking out writes. This method is invoked by specifying the CONCURRENTLY option of CREATE INDEX. When this option is used, PostgreSQL must perform two scans of the table, and in addition it must wait for all existing transactions that could potentially modify or use the index to terminate. Thus this method requires more total work than a standard index build and takes significantly longer to complete. However, since it allows normal operations to continue while the index is built, this method is useful for adding new indexes in a production environment....

In a concurrent index build, the index is actually entered into the system catalogs in one transaction, then two table scans occur in two more transactions. Before each table scan, the index build must wait for existing transactions that have modified the table to terminate. After the second scan, the index build must wait for any transactions that have a snapshot (see Chapter 13) predating the second scan to terminate. Then finally the index can be marked ready for use, and the CREATE INDEX command terminates. Even then, however, the index may not be immediately usable for queries: in the worst case, it cannot be used as long as transactions exist that predate the start of the index build.

What is a system catalog? What is a table scan? So it sounds like the index is build first, then it must wait for existing transactions (ones that occurred during the index build?) to terminate, and then wait for any transactions that have a snapshot predating the second scan to terminate (what does this mean? How does it differ from the first statement). What are these scans?

Vao Tsun · Accepted Answer

What is a system catalog? - https://www.postgresql.org/docs/current/static/catalogs.html
What is a table scan? - It reads table to get values of a column you build index on.
ones that occurred during the index build? - no, ones that could change data after first table scan
what does this mean? It means it waits transactions to end.
What are these scans? First scan reads table before starting to build index concurrently. Allowing changes to table to avoid lock. When build is done it scans roughly saying a difference, apply a short lock and mark index as usable. It is different from create index in a way the last locks table, not permitting any changes to data, while concurrently scans twice, but alows changing data while building index

Postgres building an index concurrently

Answers (1)

Related Questions