Twixt3245

Reputation: 31

Simple Oracle UPDATE Statement unusually bad performance

Every month I run a simple UPDATE statement on my Oracle database, but since Monday it has been taking very long. The table grows by about 5 percent every month; there are now 8 million records stored.

The Statement:

update /*+ parallel(destination_tab, 4) */ destination_tab dest    
   set (full_name, state) =   
       (select /*+ parallel(source_tab, 4) */ dest.name, src.state   
        from source_tab src   
        where src.city = dest.city);

In reality there are 20 fields to update, not only two, but this way it is easier to describe the problem.

explain plan:

-----------------------------------------------------------------------------------------------------
| Id  | Operation                    | Name                 | Rows  | Bytes | Cost (%CPU)| Time     |
-----------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT             |                      |  8517K|  3167M|   579M (50)|999:59:59 |
|   1 |  UPDATE                      | DESTINATION_TAB      |       |       |            |          |
|   2 |   PX COORDINATOR             |                      |       |       |            |          |
|   3 |    PX SEND QC (RANDOM)       | :TQ10000             |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   4 |     PX BLOCK ITERATOR        |                      |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   5 |      TABLE ACCESS FULL       | DESTINATION_TAB      |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   6 |   TABLE ACCESS BY INDEX ROWID| SOURCE_TAB           |     1 |    56 |     1   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN         | CITY_PK              |     1 |       |     1   (0)| 00:00:01 |
-----------------------------------------------------------------------------------------------------

Could anyone explain to me how this can be? The plan looks very bad! Thank you very much.

Upvotes: 3

Views: 4292

Answers (4)

tbone

Reputation: 15473

If this is a data warehouse, I wouldn't do updates, especially not every row in a large table. I'd probably create a materialized view combining the pieces from various base tables, and do a full refresh when needed (non-atomic: truncate + insert append).
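Roughly, that could look like the sketch below. The materialized view name and the exact query are only an illustration of the pattern (combine the pieces from the base tables once, then do a complete, non-atomic refresh each month); adapt it to your real tables and columns.

CREATE MATERIALIZED VIEW dest_mv
  BUILD IMMEDIATE
  REFRESH COMPLETE ON DEMAND
AS
SELECT dest.city,
       dest.name AS full_name,
       src.state
  FROM destination_tab dest
  JOIN source_tab src
    ON src.city = dest.city;

-- monthly refresh: non-atomic complete refresh = truncate + insert append under the hood
BEGIN
  DBMS_MVIEW.REFRESH(list => 'DEST_MV', method => 'C', atomic_refresh => FALSE);
END;
/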

Edit: As for WHY the current update approach is taking much longer than usual, my guess is that in previous runs Oracle found a good number of the blocks needed for the update in the buffer cache, and lately Oracle has had to pull a lot from disk into the buffer first. You can look at consistent gets and db block gets (logical I/O) versus physical reads (disk I/O).
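To check that, a query like the one below against v$sqlstats shows logical vs. physical I/O for the statement; the SQL_ID is a placeholder you would first look up for your UPDATE.

SELECT sql_id,
       buffer_gets,                 -- logical I/O (consistent gets + db block gets)
       disk_reads,                  -- physical I/O from disk
       executions,
       elapsed_time / 1e6 AS elapsed_seconds
  FROM v$sqlstats
 WHERE sql_id = '&your_sql_id';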

Upvotes: 1

Ronnis

Reputation: 12833

Try the following.

merge
 into destination_tab d
using source_tab      s
   -- join source and destination on the city key
   on (d.city = s.city)
when matched then
   update
      set d.state = s.state
    -- skip rows whose state is already correct (DECODE treats NULL = NULL as a match)
    where decode(d.state, s.state, 1, 0) = 0;

Upvotes: 1

Twixt3245

Reputation: 31

I understand the comments about whether a data warehouse should work this way, and so on. However, I have to do the update in this form. The update is part of an ETL workflow: every month I have to copy the complete 8 million records of the table "destination", and after that step I have to run the UPDATE that is causing the problems.

What I do not understand is why the performance is so much worse from one day to the next. Usually the update runs in 45 minutes; now it runs for about 4 hours. But why? No sorting is necessary, so the usual explanation of "sorting on disk instead of in main memory" does not apply. What is the problem in my case?

Could there be a performance difference between a normal UPDATE (the way I do it) and a MERGE-based update?

Upvotes: 0

Brandon

Reputation: 10038

You didn't say how long is too long. You are joining an 8 million row table. Not sure how many rows are in source_tab.

I noticed the execution plan indicates a full table scan of destination_tab. Is the city column on the destination_tab table indexed? If not, try adding an index. If it is, Oracle may be ignoring it because it knows it needs to return every value anyway and destination_tab is the driving table.
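If it turns out destination_tab.city is not indexed, the usual first step is a plain B-tree index plus fresh statistics, as sketched below (the index name is just an example); whether the optimizer will actually use it here is a separate question, for the reason above.

CREATE INDEX destination_tab_city_ix ON destination_tab (city);

-- gather statistics so the optimizer knows about the new index
BEGIN
  DBMS_STATS.GATHER_TABLE_STATS(ownname => USER, tabname => 'DESTINATION_TAB', cascade => TRUE);
END;
/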

No matter how you optimize it, this will always degrade in performance as the tables grow, because you are updating every row and fetching each new value through a join to another table. That is, you are always doing N operations, where N is the number of rows in destination_tab.

High-level questions/suggestions:

  1. Do you need to update every row every time? Are only certain rows likely to have changed values? If so, can you somehow predict which rows need updating and limit the update to those rows (see the sketch at the end of this answer)?
  2. Why are the hints there? If performance changes, I would experiment with dropping hints. It's the optimizer's job to find the best plan for you. By using hints, you are telling the optimizer how to do its job. You'd better be right.
  3. You are updating the full_name column on destination_tab to the name column of the same row. But you are obtaining the name column through a join to the table. It may be quicker to take that out of your select and use something like below. This is a guess. It may not matter.

    update destination_tab dest    
     set full_name = name,
       state = 
       (select src.state   
        from source_tab src   
        where src.city = dest.city);
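
For point 1, a sketch of restricting the update to rows that actually changed could look like the statement below. It reuses the DECODE comparison from the MERGE answer above; with your real 20 columns you would extend that test to each of them, and whether it pays off depends on how many rows really change each month.

-- only rows whose state differs get touched; DECODE treats NULL = NULL as equal
UPDATE destination_tab dest
   SET (full_name, state) =
       (SELECT dest.name, src.state
          FROM source_tab src
         WHERE src.city = dest.city)
 WHERE EXISTS (SELECT 1
                 FROM source_tab src
                WHERE src.city = dest.city
                  AND DECODE(dest.state, src.state, 1, 0) = 0);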
    

Upvotes: 1
