Gnudiff
Gnudiff

Reputation: 4305

UPDATE table with FIRST matching row from another table without subquery?

I am trying to adapt route management.

We have route codes that are tied to zipcode ranges. They are used to select correct route for transport to follow, when delivering orders.

A route will have zipcode range it is active for and may have a client -- if that route is specific for that client only.

So, routes table looks like this:

       route_no        | zipfrom | zipto |  client
-----------------------+---------+-------+----------
 ROUTE-CLIENTA-SPECIAL | 12330   | 12350 | CLIENT-A
 ROUTE-CLB1            | 33331   | 44445 | CLIENT-B
 ROUTE-SCENIC          | 11111   | 99999 |
(3 rows)

So, according to this table:

So, for today's orders:

 order_id |  client  | route_no |  zip
----------+----------+----------+-------
 123      | CLIENT-A |          | 12345
 124      | CLIENT-A |          | 33333
 125      | CLIENT-A |          |
 N988     | CLIENT-B |          | 77777
 N987     | CLIENT-B |          | 33335
 N0981    | CLIENT-B |          | 44444
 N0983    | CLIENT-B |          | 12345

To get the possible routes for order N987 of CLIENT-B (zipcode 33333) I can do:

SELECT client,route_no FROM routes 
WHERE (client='CLIENT-B' OR client IS NULL )  
   AND '33333' BETWEEN zipfrom AND zipto ORDER BY client;

  client  |   route_no
----------+--------------
 CLIENT-B | ROUTE-CLB1
          | ROUTE-SCENIC
(2 rows)

Out of this ROUTE-CLB1 is the correct route (it is more specific to client than the default route).

I can select always the correct route by doing ORDER BY and LIMIT 1.

But I need to update the orders table with correct routes. I can do it with subquery:

UPDATE orders AS O 
   SET route_no=(SELECT R.route_no FROM routes R
              WHERE (O.client=R.client OR R.client IS NULL )
                    AND O.zip BETWEEN R.zipfrom AND R.zipto 
              ORDER BY R.client LIMIT 1);

This gives correct answers:

  id   |  client  |         route_no      |  zip
-------+----------+-----------------------+-------
 123   | CLIENT-A | ROUTE-CLIENTA-SPECIAL | 12345
 124   | CLIENT-A | ROUTE-SCENIC          | 33333
 125   | CLIENT-A |                       |
 N988  | CLIENT-B | ROUTE-SCENIC          | 77777
 N987  | CLIENT-B | ROUTE-CLB1            | 33335
 N0981 | CLIENT-B | ROUTE-CLB1            | 44444
 N0983 | CLIENT-B | ROUTE-SCENIC          | 12345
(7 rows)

But this tends to be very slow on larger order lists, as the subquery runs for every row.

How can I improve it? I tried to use DISTINCT ON as described in Select first row in each GROUP BY group? but that didn't seem to work.

Upvotes: 0

Views: 65

Answers (1)

Gordon Linoff
Gordon Linoff

Reputation: 1269803

You might find that this is faster, with the right indexes:

UPDATE orders o
   SET route_no = COALESCE( (SELECT R.route_no
                             FROM routes R
                             WHERE O.client = R.client AND
                                   O.zip BETWEEN R.zipfrom AND R.zipto 
                             LIMIT 1
                            ), 
                            (SELECT R.route_no
                             FROM routes R
                             WHERE R.client IS NULL AND
                                   O.zip BETWEEN R.zipfrom AND R.zipto 
                             LIMIT 1
                            )
                           );

Then you want an index on routes(client, zipfrom, zipto). The index will at least quickly find the candidate rows in routes without scanning the table. Postgres might figure that out with or, but this is worth a try.

Upvotes: 1

Related Questions