Reputation: 3088
I have written an sql for a table that have about 50.000.000 users. The query takes too much time than I expected that it did not finished about 23 hours.
UPDATE users
SET building_id = B.id
FROM (
SELECT *
FROM buildings B
) AS B
WHERE B.city = address_city
AND B.town = address_town
AND B.neighbourhood = address_neighbourhood
AND B.street = address_street
AND B.no = address_building_no
The idea of this sql is to that removing the building/address info from users and instead referencing it to the buildings table.
EXPLAIN
Update on users (cost=22226900.43..22548054.14 rows=15212 width=166)
-> Merge Join (cost=22226900.43..22548054.14 rows=15212 width=166)
Merge Cond: (((users.address_city)::text = (b.city)::text) AND ((users.address_town)::text = (b.town)::text) AND ((users.address_neighbourhood)::text = (b.neighbourhood)::text) AND ((users.address_street)::text = (b.street)::text) AND ((users.address_building_no)::text = (b.no)::text))
-> Sort (cost=21352886.76..21401078.96 rows=96384398 width=156)
Sort Key: users.address_city, users.address_town, users.address_neighbourhood, users.address_street, users.address_building_no
-> Seq Scan on users (cost=0.00..2559921.19 rows=96384398 width=156)
-> Materialize (cost=874013.68..883606.86 rows=9593179 width=63)
-> Sort (cost=874013.68..878810.27 rows=9593179 width=63)
Sort Key: b.city, b.town, b.neighbourhood, b.street, b.no
-> Seq Scan on buildings b (cost=0.00..136253.54 rows=9593179 width=63) (10 rows)
I don't know whether this sql uses the inner SELECT sql for each of the users or caches for transaction. Also, if it caches, does it use the indexes for the cached temp table?
I couldn't write the sql like this:
FROM (
SELECT *
FROM buildings B
WHERE B.city = users.address_city
AND B.town = users.address_town
AND B.neighbourhood = users.address_neighbourhood
AND B.street = users.address_street
AND B.no = users.address_building_no
)
it says that users
cannot be accessed from the inner select. Do you have any suggestion how to access buildings in the inner sql statement.
Upvotes: 1
Views: 129
Reputation: 51519
I presume
create table t as select column_list from a join b on column=column;
alter table t rename to users;
will be faster, and will produce microsecond lock only... Of course if a table is not editable at the moment and there is enough space in temp_tablespace
Upvotes: 1
Reputation: 2473
Not sure but wouldn't this be (at least slightly, if not considerably) faster?
UPDATE users
SET building_id = B.id
FROM buildings B
WHERE B.city = address_city
AND B.town = address_town
AND B.neighbourhood = address_neighbourhood
AND B.street = address_street
AND B.no = address_building_no
If nothing else, it wouldn't require the Materialize
stage as given in EXPLAIN
above.
Upvotes: 0