Cuishan Liu
Cuishan Liu

Reputation: 31

oracle update with correlated query

What is the correct answer? Choose two.

Examine this SQL statement:

UPDATE orders o
SET customer_name = (
    SELECT cust_last_name FROM customers  WHERE customer_id=o.customer_id
);

Which two are true?

I know B is correct, but all other selection I believe is incorrect.

Upvotes: 3

Views: 2073

Answers (2)

Marmite Bomber
Marmite Bomber

Reputation: 21095

The only second true answer is

F. this is a wrong desing denormalizing the CUSTOMER_NAME in the orders table and conflicting therefor with the normal form.

The answer C could be right somewhere in the times of Oracle 8 (i.e. 20 years ago) but now it is definitively wrong!.

Oracle introduces the scalar subquery caching event for the reason to limit the number of executions of the subqueries.

Here a Simple Demonstration

This setup in Oracle 19.2 has 10K orders and 1K customers.

create table customers as
select rownum customer_id, 'cust_'||rownum customer_name from dual connect by level <= 1000;

create index customers_idx1 on customers (customer_id);

create table orders as
select rownum order_id, trunc(rownum/10)+1 customer_id, cast (null as varchar2(100)) customer_name
from dual connect by level <= 10000;

The update is performed on 100K rows as expected

UPDATE /*+ gather_plan_statistics */ orders o
SET customer_name = (
    SELECT customer_name FROM customers  WHERE customer_id=o.customer_id
);

The hint gather_plan_statistics collects teh execution statistics which we will examine.

SQL_ID  8r610vz9fknr6, child number 0
-------------------------------------
UPDATE /*+ gather_plan_statistics */ orders o SET customer_name = (     
SELECT customer_name FROM customers  WHERE customer_id=o.customer_id )

Plan hash value: 3416863305

--------------------------------------------------------------------------------------------------------------------------
| Id  | Operation                            | Name           | Starts | E-Rows | A-Rows |   A-Time   | Buffers | Reads  |
--------------------------------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT                     |                |      1 |        |      0 |00:00:00.18 |   60863 |     21 |
|   1 |  UPDATE                              | ORDERS         |      1 |        |      0 |00:00:00.18 |   60863 |     21 |
|   2 |   TABLE ACCESS FULL                  | ORDERS         |      1 |  10000 |  10000 |00:00:00.01 |      21 |     18 |
|   3 |   TABLE ACCESS BY INDEX ROWID BATCHED| CUSTOMERS      |   1001 |      1 |   1000 |00:00:00.01 |    2020 |      3 |
|*  4 |    INDEX RANGE SCAN                  | CUSTOMERS_IDX1 |   1001 |      1 |   1000 |00:00:00.01 |    1020 |      3 |
--------------------------------------------------------------------------------------------------------------------------

Predicate Information (identified by operation id):
---------------------------------------------------

   4 - access("CUSTOMER_ID"=:B1)

The importatnt information is in the column Start, we see that the table customers were accessed only 1001 time, i.e. nearly only once per customer and not once per order.

Upvotes: 0

GMB
GMB

Reputation: 222592

Consider option C:

C. The subquery is executed for every updated row in the ORDERS table.

You said:

NOT for every updated row, it is for every row that the outer query returns.

Yes. The subquery is indeed executed for every row in the outer query (let apart possible optimizations applied by the database). And every row in the outer query is updated - as you spotted, since you already, and correctly, selected option B: all existing rows in the ORDERS table are updated.

Note: your arguments against options A, D and 3 are valid.

Upvotes: 1

Related Questions