Blazemonger

Reputation: 92993

MySQL UPDATE needs to be ORDERED BY data in other tables

Here's an SQL statement (actually two statements) that works -- it's taking a series of matching rows and adding a delivery_number which increments for each row:

SELECT @i:=0;
UPDATE pipeline_deliveries AS d
SET d.delivery_number = @i:=@i+1
WHERE d.pipelineID = 11
ORDER BY d.setup_time;

But now the client no longer wants them ordered by setup_time. They need to be ordered by departure time, which is a field in another table. I can't figure out how to do this.

The MySQL docs, as well as this answer, suggest that in version 4.0 and up (we're running MySQL 5.0) I should be able to do this:

SELECT @i:=0;
UPDATE pipeline_deliveries AS d RIGHT JOIN pipeline_routesXdeliveryID AS rXd
    ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
LEFT JOIN pipeline_routes AS r
    ON rXd.pipeline_routeID = r.pipeline_routeID
SET d.delivery_number = @i:=@i+1
WHERE d.pipelineID = 11
ORDER BY r.departure_time,d.pipeline_deliveryID;

but I get the error #1221 - Incorrect usage of UPDATE and ORDER BY.

So what's the correct usage?

Upvotes: 1

Views: 2648

Answers (3)

Dallas Clarke

Reputation: 269

The hard way:-


    ALTER TABLE eav_attribute_option 
        ADD temp_value TEXT NOT NULL 
        AFTER sort_order;
    UPDATE eav_attribute_option o
        JOIN eav_attribute_option_value ov ON o.option_id=ov.option_id 
        SET o.temp_value = ov.value 
        WHERE o.attribute_id=90;
    SET @x = 0;
    UPDATE eav_attribute_option 
        SET sort_order = (@x:=@x+1) 
        WHERE attribute_id=90 
        ORDER BY temp_value ASC;
    ALTER TABLE eav_attribute_option
        DROP temp_value;

Upvotes: 0

ypercubeᵀᴹ

Reputation: 115640

You can't mix UPDATE joining 2 (or more) tables and ORDER BY.

You can bypass the limitation with something like this:

UPDATE 
    pipeline_deliveries AS upd
  JOIN
    ( SELECT t.pipeline_deliveryID, 
             @i := @i+1 AS row_number 
      FROM 
          ( SELECT @i:=0 ) AS dummy
        CROSS JOIN 
          ( SELECT d.pipeline_deliveryID
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
            ORDER BY 
                r.departure_time, d.pipeline_deliveryID
          ) AS t
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET 
    upd.delivery_number = tmp.row_number ;

The above relies on two MySQL features: user-defined variables and ordering inside a derived table. Because the latter is not standard SQL, it may well break in a future release of MySQL (once the optimizer is clever enough to figure out that ordering inside a derived table is useless unless there is a LIMIT clause). In fact, the query does exactly that in the latest versions of MariaDB (5.3 and 5.5): it runs as if the ORDER BY were not there, and the results are not the expected ones. See a related question at the MariaDB site: GROUP BY trick has been optimized away.

The same may well happen in any future release of mainstream MySQL that improves the optimizer code (maybe in 5.6; anyone care to test this?).

So, it's better to write this in standard SQL. Window functions would be the best option, but MySQL hasn't implemented them yet. You could also use a self-join, which will not be too inefficient as long as the update affects only a small subset of rows.

UPDATE 
    pipeline_deliveries AS upd
  JOIN
    ( SELECT t1.pipeline_deliveryID
           , COUNT(*) AS row_number
      FROM
          ( SELECT d.pipeline_deliveryID
                 , r.departure_time
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
          ) AS t1
        JOIN
          ( SELECT d.pipeline_deliveryID
                 , r.departure_time
            FROM 
                pipeline_deliveries AS d 
              JOIN 
                pipeline_routesXdeliveryID AS rXd
                  ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
              LEFT JOIN 
                pipeline_routes AS r
                  ON rXd.pipeline_routeID = r.pipeline_routeID
            WHERE 
                d.pipelineID = 11
          ) AS t2
          ON t2.departure_time < t1.departure_time
          OR t2.departure_time = t1.departure_time 
             AND t2.pipeline_deliveryID <= t1.pipeline_deliveryID
          OR t1.departure_time IS NULL
             AND ( t2.departure_time IS NOT NULL
                OR t2.departure_time IS NULL
                   AND t2.pipeline_deliveryID <= t1.pipeline_deliveryID
                 )
      GROUP BY
          t1.pipeline_deliveryID  
    ) AS tmp
      ON tmp.pipeline_deliveryID = upd.pipeline_deliveryID
SET 
    upd.delivery_number = tmp.row_number ;
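
To see why counting the joined rows produces a row number, here is a minimal Python sketch of the comparison the self-join is meant to express: for each row t1, count the rows t2 that sort at or before it. The sample data and the function names are made up for illustration, and departure times are plain integers here rather than real time values.

```python
from typing import Optional

# (pipeline_deliveryID, departure_time); None stands in for SQL NULL.
Row = tuple[int, Optional[int]]


def precedes_or_equal(t2: Row, t1: Row) -> bool:
    """Return True when t2 sorts at or before t1 (the intended ON condition)."""
    id2, dep2 = t2
    id1, dep1 = t1
    if dep1 is None:
        # A NULL departure time sorts after every non-NULL one;
        # ties among NULLs are broken by the delivery ID.
        return dep2 is not None or id2 <= id1
    if dep2 is None:
        return False
    return dep2 < dep1 or (dep2 == dep1 and id2 <= id1)


def row_numbers(rows: list[Row]) -> dict[int, int]:
    """Equivalent of COUNT(*) ... GROUP BY t1.pipeline_deliveryID."""
    return {
        t1[0]: sum(precedes_or_equal(t2, t1) for t2 in rows)
        for t1 in rows
    }


# Hypothetical sample data, not taken from the question.
sample = [(1, 900), (2, 800), (3, None), (4, 800), (5, None)]
numbers = row_numbers(sample)
print(numbers)
```

Note that this comparison places NULL departure times last, whereas MySQL's ascending ORDER BY sorts NULL first, so the two queries in this answer can number such rows differently.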

Upvotes: 2

Woot4Moo

Reputation: 24336

Based on this documentation:

For the multiple-table syntax, UPDATE updates rows in each table named in table_references that satisfy the conditions. In this case, ORDER BY and LIMIT cannot be used.

Without knowing too much about MySQL: you could open a cursor and process this row by row, or pass the rows back to the client code (PHP, Java, etc.) that you maintain and handle the processing there.
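
A rough sketch of the client-side route, in Python: select the IDs in the desired order, then issue one single-table UPDATE per row. Python's built-in sqlite3 module is used here only as a stand-in so the example runs on its own; with MySQL you would use a MySQL driver, which would typically have a different placeholder style. The function name and the sample rows are assumptions; the table and column names come from the question.

```python
import sqlite3


def renumber_deliveries(conn, pipeline_id):
    """Assign delivery_number 1..n in departure_time order, one UPDATE per row."""
    cur = conn.cursor()
    # The SELECT may join and sort freely; only UPDATE has the restriction.
    cur.execute(
        """
        SELECT d.pipeline_deliveryID
        FROM pipeline_deliveries AS d
        JOIN pipeline_routesXdeliveryID AS rXd
          ON d.pipeline_deliveryID = rXd.pipeline_deliveryID
        LEFT JOIN pipeline_routes AS r
          ON rXd.pipeline_routeID = r.pipeline_routeID
        WHERE d.pipelineID = ?
        ORDER BY r.departure_time, d.pipeline_deliveryID
        """,
        (pipeline_id,),
    )
    ids = [row[0] for row in cur.fetchall()]
    # enumerate yields (delivery_number, pipeline_deliveryID) pairs.
    cur.executemany(
        "UPDATE pipeline_deliveries SET delivery_number = ? "
        "WHERE pipeline_deliveryID = ?",
        list(enumerate(ids, start=1)),
    )
    conn.commit()
    return ids


# Hypothetical sample data in an in-memory database.
conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE pipeline_deliveries (
        pipeline_deliveryID INTEGER PRIMARY KEY,
        pipelineID INTEGER,
        delivery_number INTEGER);
    CREATE TABLE pipeline_routes (
        pipeline_routeID INTEGER PRIMARY KEY,
        departure_time TEXT);
    CREATE TABLE pipeline_routesXdeliveryID (
        pipeline_routeID INTEGER,
        pipeline_deliveryID INTEGER);
    INSERT INTO pipeline_deliveries VALUES
        (1, 11, NULL), (2, 11, NULL), (3, 11, NULL), (4, 12, NULL);
    INSERT INTO pipeline_routes VALUES
        (10, '09:00'), (20, '08:00'), (30, '10:00');
    INSERT INTO pipeline_routesXdeliveryID VALUES
        (10, 1), (20, 2), (30, 3), (10, 4);
    """
)
order = renumber_deliveries(conn, 11)
result = conn.execute(
    "SELECT pipeline_deliveryID, delivery_number "
    "FROM pipeline_deliveries ORDER BY pipeline_deliveryID"
).fetchall()
print(order, result)
```

This trades one statement for n + 1 round trips, so it is only reasonable when the number of affected rows is small.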

After more digging:

To eliminate the badly optimized subquery, you need to rewrite the subquery as a join, but how can you do that and retain the LIMIT and ORDER BY? One way is to find the rows to be updated in a subquery in the FROM clause, so the LIMIT and ORDER BY can be nested inside the subquery. In this way work_to_do is joined against the ten highest-priority unclaimed rows of itself. Normally you can’t self-join the update target in a multi-table UPDATE, but since it’s within a subquery in the FROM clause, it works in this case.

update work_to_do as target
   inner join (
      select w.client, work_unit
      from work_to_do as w
         inner join eligible_client as e on e.client = w.client
      where processor = 0
      order by priority desc
      limit 10
   ) as source on source.client = target.client
      and source.work_unit = target.work_unit
   set processor = @process_id;

There is one downside: the rows are not locked in primary key order. This may help explain the occasional deadlock we get on this table.

Upvotes: 1
