SmallFry
SmallFry

Reputation: 153

Slow Oracle Update Compared to Select

We're experiencing very slow performance updating a table in Oracle 9i when a Select statement to return the same rows takes less than 20 seconds. When I say very slow, we have left it running overnight and it still did not finish.

I've simplified the tables and columns to illustrate the problem and to hide the names as they are tables used in a third-party app (however, the data is normally maintained in-house and support calls are expensive). Even in this simple example, the update didn't finish after running for over 4 hours.

CREATE TABLE lookup 
(
  lookup_key       NUMBER(10,0) PRIMARY KEY, 
  lookup_name      VARCHAR2(30)
);

There are c. 34,000 rows in this table

CREATE TABLE products
(
  product_key     NUMBER(10,0) PRIMARY KEY,
  product_val     NUMBER(10,0)
);

There are c. 14.5 million rows in this table

The Select statement is:

SELECT * FROM lookup
WHERE lookup_name <> 'Redundant'
AND NOT EXISTS (SELECT 1 FROM products
where product_val = lookup_key);

The Update statement is:

UPDATE lookup 
SET lookup_name = 'Redundant'
WHERE lookup_name <> 'Redundant'
AND NOT EXISTS (SELECT 1 FROM products
WHERE product_val = lookup_key);

The execution plan is the same between the 2 both in this trivial example and the real world one. All statistics are up to date.

I guess we could do it in PL/SQL but surely it should be quick enough to update all the rows in lookup, that aren't in the product table, to 'Redundant' in SQL? Any ideas or advice gratefully received thanks.

Upvotes: 0

Views: 3810

Answers (3)

Marmite Bomber
Marmite Bomber

Reputation: 21075

Here is your sample setup

CREATE TABLE lookup 
as select 
rownum lookup_key, 'xxxxx'||rownum as lookup_name
from dual connect by level <= 34000;

CREATE TABLE products 
as 
with a as
(select rownum i 
from dual connect by level <= 1450000),
b as 
(select /*+ MATERIALIZE */ rownum j 
from dual connect by level <= 10)
select 
i*10 + j + 34000 product_key,  i*10 + j + 34000 as product_val
from a cross join b;

The update gives this execution plan - note the HASH JOIN ANTI this is a effectiv plan for such a query. If you see something else (e..g FILTER) this is a problem, you perform the FULL TABLE SCAN once for each updatetd row.

-------------------------------------------------------------------------------
| Id  | Operation          | Name     | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------
|   0 | SELECT STATEMENT   |          | 32957 |  1609K|   554   (3)| 00:00:10 |
|*  1 |  HASH JOIN ANTI    |          | 32957 |  1609K|   554   (3)| 00:00:10 |
|*  2 |   TABLE ACCESS FULL| LOOKUP   | 32957 |  1190K|    19   (0)| 00:00:01 |
|   3 |   TABLE ACCESS FULL| PRODUCTS |  1320K|    16M|   530   (2)| 00:00:10 |
-------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   1 - access("PRODUCT_VAL"="LOOKUP_KEY")
   2 - filter("LOOKUP_NAME"<>'Redundant')

And here the update taking less than 5 seconds to update all rows of the lookup table

SQL> UPDATE lookup
  2  SET lookup_name = 'Redundant'
  3  WHERE lookup_name <> 'Redundant'
  4  AND NOT EXISTS (SELECT 1 FROM products
  5  WHERE product_val = lookup_key);

34000 rows updated.

Elapsed: 00:00:04.97

Note that this is a defensive strategy if you expect lots of updates. The total cost of the update is comparable to a one FULL TABLE SCANon the big table.

In case you expects a small number of updates you may profit from the index proposed in other answer.

 create index products_idx on products(PRODUCT_VAL);

The execution plan changes to NESTED LOOPS ANTI:

-----------------------------------------------------------------------------------
| Id  | Operation           | Name         | Rows  | Bytes | Cost (%CPU)| Time     |
------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT    |              |     1 |    32 |    13   (0)| 00:00:01 |
|   1 |  UPDATE             | LOOKUP       |       |       |            |          |
|   2 |   NESTED LOOPS ANTI |              |     1 |    32 |    13   (0)| 00:00:01 |
|*  3 |    TABLE ACCESS FULL| LOOKUP       |     1 |    19 |    11   (0)| 00:00:01 |
|*  4 |    INDEX RANGE SCAN | PRODUCTS_IDX |    14M|   178M|     2   (0)| 00:00:01 |
------------------------------------------------------------------------------------

 Predicate Information (identified by operation id):
---------------------------------------------------

   3 - filter("LOOKUP_NAME"<>'Redundant')
   4 - access("PRODUCT_VAL"="LOOKUP_KEY")

The update of one row goes instantly ...

SQL> UPDATE lookup
  2  SET lookup_name = 'Redundant'
  3  WHERE lookup_name <> 'Redundant'
  4  AND NOT EXISTS (SELECT 1 FROM products
  5  WHERE product_val = lookup_key);

1 row updated.

Elapsed: 00:00:00.13

... but the elapsed time will lineary increase with the numer of updated rows (and for few thousends the cost will increase the cost of the FTS in the previous execution plan.

Upvotes: 2

BobC
BobC

Reputation: 4424

Add an index to PRODUCT_VAL to avoid scanning the PRODUCTS table for each LOOKUP

Upvotes: 0

Kacper
Kacper

Reputation: 4818

For test purpose I would do:

create table lookup2 as select * from lookup

And run your update.

UPDATE lookup2 
SET lookup_name = 'Redundant'
WHERE lookup_name <> 'Redundant'
AND NOT EXISTS (SELECT 1 FROM products
WHERE product_val = lookup_key);

If that goes fast it means slow down is due to index rebuild on lookup_name, triggers, chained rows or other problems tablespace. If that goes slow it means problem is query itself or there are problems with physical disk performance (what is hard to believe with 34k rows).

I also wonder how this corellated not exists works here. maybe try:

create table tmp as
SELECT lookup_key FROM lookup
WHERE lookup_name <> 'Redundant'
AND NOT EXISTS (SELECT 1 FROM products
where product_val = lookup_key);

merge into lookup l
using (select * from tmp) t
on (l.lookup_key = t.lookup_key)
when matched then update set lookup_name = 'Redundant'
WHERE lookup_name <> 'Redundant';

drop table tmp;

Or the select was fast only for first 100 rows not all of them?

Upvotes: 1

Related Questions