Theo F

Reputation: 1355

PostgreSQL: UPDATE large table

I have a large PostgreSQL table of 29 million rows. Its size, according to the stats tab in pgAdmin, is almost 9 GB. The table is PostGIS-enabled, with an empty geometry column.

I want to UPDATE the geometry column using ST_GeomFromText, reading from the X and Y coordinate columns (SRID: 27700) stored in the same table. However, running this query on the whole table at once results in 'out of disk space' and, less frequently, 'connection to server lost' errors.

To overcome this, should I UPDATE the 29 million rows in batches/stages? How can I update the first 1 million rows, then the next 1 million, and so on until I reach 29 million? (One possible pattern is sketched after the query below.)

Or are there other more efficient ways of updating large tables like this?

I should add that the table is hosted on AWS.

My UPDATE query is:

UPDATE schema.table
SET geom = ST_GeomFromText('POINT(' || eastingcolumn || ' ' || northingcolumn || ')',27700);
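
A minimal sketch of one such batching pattern, assuming unprocessed rows can be identified by geom IS NULL and using ctid so no primary key is needed (the 1,000,000 batch size is arbitrary):

-- Update at most 1,000,000 not-yet-populated rows per pass
UPDATE schema.table t
SET geom = ST_GeomFromText('POINT(' || eastingcolumn || ' ' || northingcolumn || ')', 27700)
WHERE t.ctid IN (
    SELECT ctid
    FROM schema.table
    WHERE geom IS NULL
    LIMIT 1000000
);
-- Re-run the statement, one transaction per pass, until it reports UPDATE 0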

Upvotes: 5

Views: 12202

Answers (2)

Theo F

Reputation: 1355

From my original question:

However, running this query on the whole table at once results in 'out of disk space' and, less frequently, 'connection to server lost' errors.

It turns out our AWS database instance was running out of disk space, which stopped my original ST_GeomFromText query from completing. Freeing up space fixed it.
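
For reference, the table size can also be checked from SQL rather than pgAdmin's stats tab:

-- Total on-disk size of the table, including indexes and TOAST
SELECT pg_size_pretty(pg_total_relation_size('schema.tablename'));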

On an important note, as suggested by @mlinth, ST_Point ran my query far quicker than ST_GeomFromText (24 minutes vs. 2 hours).

My final query was:

UPDATE schema.tablename
SET geom = ST_SetSRID(ST_Point(eastingcolumn,northingcolumn),27700);
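
A quick spot check that the column was populated with the expected SRID (a minimal example against the same table):

-- Inspect a few converted rows and their SRID
SELECT ST_AsText(geom), ST_SRID(geom)
FROM schema.tablename
LIMIT 5;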

Upvotes: 4

filiprem

Reputation: 7124

You did not give any server specs; writing 9 GB can be pretty fast on recent hardware. Bear in mind, though, that because of MVCC an UPDATE writes a new version of every affected row, so a full-table update temporarily needs roughly another 9 GB (plus WAL) until VACUUM reclaims the old row versions, which would explain the 'out of disk space' errors.

You should be OK with one long UPDATE, unless you have concurrent writes to this table.

A common trick to overcome this problem (a very long transaction that locks writes to the table) is to split the UPDATE into ranges based on the primary key, run in separate transactions.

/* Use the PK or any attribute with a known distribution pattern; */
/* each statement runs in its own transaction (SET clause filled in from the question) */
UPDATE schema.table SET geom = ST_SetSRID(ST_Point(eastingcolumn, northingcolumn), 27700) WHERE id BETWEEN 0 AND 1000000;
UPDATE schema.table SET geom = ST_SetSRID(ST_Point(eastingcolumn, northingcolumn), 27700) WHERE id BETWEEN 1000001 AND 2000000;
-- ...continue through the remaining ranges up to 29 million

For a high level of concurrent writes, people use more subtle tricks (SELECT ... FOR UPDATE with NOWAIT, lightweight locks, retry logic, etc.).
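
As a sketch of that idea, assuming PostgreSQL 9.5+ (for SKIP LOCKED, a variant of NOWAIT that skips contested rows instead of raising an error) and the id key and geom IS NULL guard used above:

-- Claim up to 10,000 unlocked, unprocessed rows and convert them;
-- rows locked by concurrent writers are skipped, not waited on
WITH batch AS (
    SELECT id
    FROM schema.table
    WHERE geom IS NULL
    ORDER BY id
    LIMIT 10000
    FOR UPDATE SKIP LOCKED
)
UPDATE schema.table t
SET geom = ST_SetSRID(ST_Point(t.eastingcolumn, t.northingcolumn), 27700)
FROM batch
WHERE t.id = batch.id;
-- Repeat (with retry logic around it) until the update reports UPDATE 0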

Upvotes: 9
