Reputation: 31
Every month I run a simple update statement on my Oracle database. But since Monday it takes very long. The table grows by 5 percent every month; right now there are 8 million records stored.
The Statement:
update /*+ parallel(destination_tab, 4) */ destination_tab dest
set (full_name, state) =
(select /*+ parallel(source_tab, 4) */ dest.name, src.state
from source_tab src
where src.city = dest.city);
In reality there are 20 fields to update, not only two... but this way it is easier to describe the problem.
explain plan:
-------------------------------------------------------------------------------------------------
| Id  | Operation                     | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
-------------------------------------------------------------------------------------------------
|   0 | UPDATE STATEMENT              |                 |  8517K|  3167M|  579M  (50)|999:59:59 |
|   1 |  UPDATE                       | DESTINATION_TAB |       |       |            |          |
|   2 |   PX COORDINATOR              |                 |       |       |            |          |
|   3 |    PX SEND QC (RANDOM)        | :TQ10000        |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   4 |     PX BLOCK ITERATOR         |                 |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   5 |      TABLE ACCESS FULL        | DESTINATION_TAB |  8517K|  3167M|  6198   (1)| 00:01:27 |
|   6 |   TABLE ACCESS BY INDEX ROWID | SOURCE_TAB      |     1 |    56 |     1   (0)| 00:00:01 |
|*  7 |    INDEX UNIQUE SCAN          | CITY_PK         |     1 |       |     1   (0)| 00:00:01 |
-------------------------------------------------------------------------------------------------
Could anyone explain to me how this can be? The plan looks very bad! Thank you very much.
Upvotes: 3
Views: 4292
Reputation: 15473
If this is a data warehouse, I wouldn't do updates, especially not every row in a large table. I'd probably create a materialized view combining the pieces from various base tables, and do a full refresh when needed (non-atomic: truncate + insert append).
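A rough sketch of that approach, assuming a single source join as in the question (the materialized view name and the exact options are only illustrative, not from the question):

-- Sketch only: build the combined result once as a materialized view.
create materialized view destination_mv
  build immediate
  refresh complete on demand
as
select d.city, d.name as full_name, s.state
  from destination_tab d
  join source_tab s on s.city = d.city;

-- Monthly load: non-atomic complete refresh = truncate + insert append
begin
  dbms_mview.refresh(list => 'DESTINATION_MV', method => 'C', atomic_refresh => false);
end;
/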
Edit: As for WHY the current update approach is taking much longer than usual, my guess is that in previous runs Oracle found a good number of the blocks needed for the update in the buffer cache, and lately it has had to pull a lot of them from disk into the buffer cache first. You can look at consistent gets and db block gets (logical I/O) vs. physical reads (disk I/O).
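One way to check (just a sketch; it assumes you can query the V$ views and that the statement is still in the shared pool):

-- Compare logical I/O (buffer_gets) with physical I/O (disk_reads)
-- for the update statement across runs.
select sql_id,
       executions,
       buffer_gets,
       disk_reads,
       round(100 * disk_reads / nullif(buffer_gets, 0), 1) as pct_physical
  from v$sql
 where lower(sql_text) like 'update%destination_tab%';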
Upvotes: 1
Reputation: 12833
Try the following.
merge into destination_tab d
using source_tab s
   on (d.city = s.city)
when matched then
  update
     set d.state = s.state
   -- decode treats NULLs as equal, so only rows whose state actually changes are updated
   where decode(d.state, s.state, 1, 0) = 0;
Upvotes: 1
Reputation: 31
I understand the comments about whether a data warehouse should be updated this way. However, I have to do this update in this form. The update is part of an ETL workflow: every month I have to copy the complete 8 million records of the table "destination", and after this step I have to do the UPDATE that is causing the problems.
I do not understand why the performance is so different from one day to the next. Usually the update runs in 45 minutes; now it runs for about 4 hours. But why? No sorting is necessary, so the famous reason "sorting on disk instead of in main memory" cannot apply. What is the problem in my case?
Could there be a difference in performance between a normal update (the way I do it) and the merge update?
Upvotes: 0
Reputation: 10038
You didn't say how long is too long. You are joining an 8 million row table. Not sure how many rows are in source_tab.
I noticed the execution plan indicates a full table scan of destination_tab. Is the city column on the destination_tab table indexed? If not, try adding an index. If it is, Oracle may be ignoring it because it knows it needs to return every value anyway and destination_tab is the driving table.
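If the column is not indexed, a minimal sketch would be something like this (the index name is only an example):

-- Illustrative only: index the join column on destination_tab
create index destination_tab_city_ix on destination_tab (city);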
No matter how you optimize it, this will always degrade in performance as the tables grow because you are updating every row by fetching a value from the same table joined to another. That is, you are always doing N operations where N is the number of rows in destination_tab.
High-level questions/suggestions:
You are updating the full_name column on destination_tab to the name column of the same row, but you are obtaining that name column through the correlated subquery. It may be quicker to take it out of the select and use something like the statement below. This is a guess; it may not matter.
update destination_tab dest
   set full_name = name,
       state = (select src.state
                  from source_tab src
                 where src.city = dest.city);
Upvotes: 1