Cyntech

Reputation: 5572

Selecting a subset of records from a very large record set in Oracle runs out of memory

I have a process that is converting dates from GMT to Australian Eastern Standard Time. To do this, I need to select the records from the database, process them and then save them back.

To select the records, I have the following query:

SELECT id,
  user_id,
  event_date,
  event,
  resource_id,
  resource_name
FROM
  (SELECT rowid id,
    rownum r,
    user_id,
    event_date,
    event,
    resource_id,
    resource_name
  FROM user_activity
  ORDER BY rowid)
WHERE r BETWEEN 0 AND 50000

to select a block of 50,000 rows from a total of approx. 60 million rows. I am splitting them up because a) Java (which the update process is written in) runs out of memory with too many rows (I have a bean object for each row), and b) I only have 4 GB of Oracle temp space to play with.

In the process, I use the rowid to update the record (so I have a unique value) and the rownum to select the blocks. I then call this query in iterations, selecting the next 50000 records until none remain (the java program controls this).

The problem I'm getting is that I'm still running out of Oracle temp space with this query. My DBA has told me that more temp space cannot be granted, so another method must be found.

I've tried substituting the subquery (which I presume is consuming all the temp space with its sort) with a view, but the explain plan for the view is identical to that of the original query.

Is there a different/better way to achieve this without running into the memory/temp space problems? I'm assuming a plain update query on the dates (as opposed to a Java program) would suffer from the same problem with the temp space available?

Your assistance on this is greatly appreciated.

Update

I went down the path of the PL/SQL block as suggested below:

declare
  cursor c is select event_date from user_activity for update;
begin
  for t_row in c loop
    update user_activity
      set event_date = t_row.event_date + 10/24 where current of c;
    commit;
  end loop;
end;

However, I'm running out of undo space. I was under the impression that if a commit was made after each update, the need for undo space would be minimal. Am I incorrect in this assumption?
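(It may be relevant that select ... for update locks the entire result set when the cursor is opened, so the row locks for all 60 million rows generate undo within a single transaction regardless of the per-row commits. One hedged alternative that bounds each transaction is sketched below; it assumes a hypothetical converted flag column is added first, which is needed because the + 10/24 shift is not idempotent and a restarted run would otherwise double-shift rows.)

-- hypothetical one-time setup:
-- alter table user_activity add (converted char(1) default 'N');
begin
  loop
    update user_activity
       set event_date = event_date + 10/24,
           converted  = 'Y'
     where converted = 'N'
       and rownum <= 50000;
    exit when sql%rowcount = 0;
    commit;  -- at most 50,000 rows of undo per transaction
  end loop;
  commit;
end;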

Upvotes: 3

Views: 8880

Answers (3)

Tony Andrews

Reputation: 132750

How about not updating it at all?

rename user_activity to user_activity_gmt

create view user_activity as
select id,
  user_id,
  event_date+10/24 as event_date,
  event,
  resource_id,
  resource_name
from user_activity_gmt;

Upvotes: 0

Jon Heller

Reputation: 36922

A single update probably would not suffer from the same issue, and would probably be orders of magnitude faster. The large amount of temp tablespace is only needed because of the sorting. Although, if your DBA is so stingy with the temp tablespace, you may end up running out of UNDO space or something else. (Take a look at ALL_SEGMENTS; how large is your table?)

But if you really must use this method, maybe you can use a filter instead of an ORDER BY. ORA_HASH(rowid, 1200) returns values from 0 to 1200, giving you 1,201 buckets to process one at a time:

where ora_hash(rowid, 1200) = 1
where ora_hash(rowid, 1200) = 2
...

But this will be horribly, horribly slow. And what happens if a value changes halfway through the process? A single SQL statement is almost certainly the best way to do this.
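A sketch of that single statement, assuming event_date is a DATE column and the GMT-to-AEST shift is a fixed 10 hours as stated in the question (no daylight-saving handling):

update user_activity
   set event_date = event_date + 10/24;  -- no ORDER BY, so no temp-space sort
commit;

This makes one pass over the table; undo and redo are generated for the whole statement, so undo sizing is worth checking with the DBA before running it over 60 million rows.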

Upvotes: 6

Sayan Malakshinov

Reputation: 8665

Why not just one UPDATE or MERGE? Alternatively, you can write an anonymous PL/SQL block that processes the data with a cursor. For example:

declare
  cursor c is select * from aa for update;
begin
  for t_row in c loop
    update aa
       set val = t_row.val || ' new value'
     where current of c;  -- restrict the update to the current row
  end loop;
  commit;
end;
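If row-by-row logic is truly unavoidable, a batched variant is a common middle ground. The following is only a sketch, assuming the questioner's user_activity table and fixed 10-hour offset; it drives off rowids and commits once per 50,000-row batch, so no single transaction needs much undo:

declare
  cursor c is select rowid rid from user_activity;
  type rid_tab is table of rowid index by pls_integer;
  l_rids rid_tab;
begin
  open c;
  loop
    fetch c bulk collect into l_rids limit 50000;
    exit when l_rids.count = 0;
    forall i in 1 .. l_rids.count
      update user_activity
         set event_date = event_date + 10/24
       where rowid = l_rids(i);
    commit;  -- one transaction per 50,000-row batch
  end loop;
  close c;
end;

Note the caveat: committing while the driving cursor is still open risks ORA-01555 (snapshot too old) on a very long run, which is another reason the single-statement UPDATE is preferable when undo allows it.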

Upvotes: 0
