Reputation: 65550
I am working in Postgres 9.6 with PostGIS. I have a very large table (about 30GB):
Table "public.parcels"
Column | Type | Modifiers
--------------+-------------------------+------------------------------------------------------------------
ogc_fid | integer | not null default nextval('parcels_ogc_fid_seq'::regclass)
wkb_geometry | geometry(Polygon,4326) |
county | character varying |
I would like to create a new table that contains one polygon per county, with the polygon being the union of that county's polygons. I have defined my new table as follows:
gid | integer | not null default nextval('newtable_gid_seq'::regclass)
the_geom | geometry(Polygon,4326) |
I know how to insert one county at a time from the first table into the second table:
INSERT INTO by_county(the_geom) values
((SELECT ST_Union(wkb_geometry) FROM parcels where county='Warwick'));
But how can I do this programmatically for all counties? My current approach is to write a Python script with psycopg2 that first retrieves the DISTINCT county names and then runs the above command for each county (roughly as sketched below), but I'm wondering if there's a less manual way.
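For reference, the per-county loop would look something like this (the connection string is a placeholder; the table and column names are the ones shown above):
import psycopg2

# Placeholder connection string; adjust to your environment.
conn = psycopg2.connect("dbname=mydb")

with conn, conn.cursor() as cur:
    # First retrieve the distinct county names.
    cur.execute("SELECT DISTINCT county FROM parcels")
    counties = [row[0] for row in cur.fetchall()]

    # Then run one ST_Union insert per county.
    for county in counties:
        cur.execute(
            "INSERT INTO by_county (the_geom) "
            "SELECT ST_Union(wkb_geometry) FROM parcels WHERE county = %s",
            (county,),
        )

conn.close()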
As noted this is a large dataset, and I'm working on a machine with 16GB of RAM. So it may be that my one-command-at-a-time approach is the best after all?
Upvotes: 1
Views: 472
Reputation: 1435
You can try using the COPY command. Give this a shot:
psql -d <DATABASE> -c "copy (SELECT county, ST_Union(wkb_geometry) FROM parcels GROUP BY county) to STDOUT" | psql -d <DATABASE> -c "copy by_county from STDIN"
You will have to make sure the columns you select are in the same order as the columns of the target table.
Upvotes: 0
Reputation: 53754
Since ST_Union is an aggregate function, you can use a GROUP BY in your SELECT:
INSERT INTO by_county(county, the_geom)
SELECT county, ST_Union(wkb_geometry) FROM parcels GROUP BY county;
As noted this is a large dataset, and I'm working on a machine with 16GB of RAM. So it may be that my one-command-at-a-time approach is the best after all?
Probably not, because PostgreSQL is very good at handling large datasets. As a rule of thumb, one big query will be faster than a collection of small queries.
Upvotes: 1