Peter
Peter

Reputation: 37

Update statement using IN clause issue

I have an update statement that goes thus:

update tableA 
    set val1='X', val2='Y' 

where id in (
    select id from tableA A 

    LEFT JOIN tableB B ON A.col1=B.col1 and A.col2=B.col2 

    where A.col3='xx' and B.col3= 'YY')

Now, the inner SELECT statement runs in 10 minutes returning 1000 records (both tableA and tableB have about 10mil records each)

The whole update statement runs for 3 hours ( have not waited yet for completion)

Any ideas?

Upvotes: 3

Views: 19580

Answers (6)

Willian Carvalho
Willian Carvalho

Reputation: 436

I know this thread is a bit old, but Id like to share what happened to me (specifically today hehe).

We were facing the same problem. Update was taking forever to run! So we found out that the table we were trying to update was locked. However no error message was thrown by DB2.

hope that Ive had helped somehow! cheers. ~chamb~

Upvotes: 2

Ian Bjorhovde
Ian Bjorhovde

Reputation: 11042

Others have suggested using EXPLAIN to help identify where the query is slow, which is certainly a good idea.

One other suggestion, although uou don't say what version of DB2 you're using, or what platform it is on:

You might want to try using the MERGE statement, which is the ANSI SQL statement for an update join (this has been available in DB2 for Linux/Unix/Windows since V8.2):

The merge statement would look something like (not tested):

merge into tableA a
   using (select id, col1, col2 from tableB where col3 = 'YY') as b
   on (a.col1 = b.col1 and a.col2 = b.col2 and a.col3 = 'xx')
   when matched 
   then update set val1 = 'X', val2 = 'Y' 

Upvotes: 0

Aaron Digulla
Aaron Digulla

Reputation: 328624

There is probably no index on the column tableA.id or the type of tableA.id doesn't match the type returned by the select.

[EDIT] Alternatively, you can try this weird syntax:

update (
    select val1, val2 from tableA A 

    LEFT JOIN tableB B ON A.col1=B.col1 and A.col2=B.col2 

    where A.col3='xx' and B.col3= 'YY'
) tmp
set val1='X', val2='Y' 

This creates a temporary table which is still linked to the original table, so you can update the values which the select returns and they will show up in the original table.

[EDIT2] I missed the fact that you're selecting and updating the same table (i.e. id is the same column). In this case, the type obviously doesn't matter and you shouldn't even need an index (since the select already returns the correct rows).

Try EXPLAIN PLAN to see whether something else is going on.

Also, you might get in conflict with another process which also updates the same table (i.e. you have a lock somewhere). AQT has a Monitor which can show these things. If you can, get AQT and use that. It has excellent support for DB2 and is better than anything I've seen out there so far.

Upvotes: 2

a'r
a'r

Reputation: 36999

You should try an update join as this should improve the explain plan.

update tableA A, tableB B
set A.val1='X', A.val2='Y' 
where A.col1 = B.col1 and A.col2 = B.col2
  and A.col3 = 'xx' and B.col3 = 'YY'

Indexes will clearly help with this join as others have already stated. Please copy the EXPLAIN results if you want us to check that indexes are in use.

Upvotes: 0

mwigdahl
mwigdahl

Reputation: 16578

Do you have indexes on any of the columns? If you don't have indexes on tableA.col1, tableB.col1, tableA.col2, tableB.col2, tableA.col3, and tableB.col3 this could explain the long seek times. If you don't have indexes for those comparisons, you will be forced to scan over the entire table, which will be slow.

Also, even if you have indexes you should check to see if your statistics are up to date; if they're not, that can cause problems even if the physical structure of the indexes is otherwise fine.

You don't mention the database you're using, but if you have the means to get it to tell you the execution plan of your query, you can use that to determine what it's doing and take appropriate action to add the indexes you'll need.

Upvotes: 0

Frank Farmer
Frank Farmer

Reputation: 39356

I've had success in the past with breaking the subquery out, and storing its output to a temporary table. e.g.

CREATE TEMPORARY TABLE my_ids (id INT);

INSERT INTO my_ids select id from tableA A 

    LEFT JOIN tableB B ON A.col1=B.col1 and A.col2=B.col2 

    where A.col3='xx' and B.col3= 'YY';

update tableA 
    set val1='X', val2='Y' 
WHERE id IN (SELECT id FROM my_ids);

If your specific SQL flavor has some sort of UPDATE JOIN-esque syntax (e.g. UPDATE tableA USING my_ids) that may also be worth trying

Upvotes: 0

Related Questions