user3124206

Reputation: 475

Speeding up aggregations for a large table in Oracle

I am trying to improve the performance of aggregation queries against an Oracle database. The system is used to run financial series simulations.

Here is the simplified set-up:

  1. The first table table1 has the following columns:

    date | id | value

    It is read-only, has about 100 million rows, and has a composite index on (id, date).

  2. The second table table2 is generated by the application according to user input, is relatively small (300K rows) and has this layout:

    id | start_date | end_date | factor
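
For reference, here is roughly the DDL I am assuming (a sketch only: the types are guesses, and the date column is written trade_date here because date itself is a reserved word in Oracle):

    -- Sketch of the assumed schema; names and types are illustrative
    create table table1 (
        trade_date  date   not null,
        id          number not null,
        value       number not null
    );

    create index table1_ix on table1 (id, trade_date);

    create table table2 (
        id          number not null,
        start_date  date   not null,
        end_date    date   not null,
        factor      number          -- may be null, hence nvl(factor,1)
    );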

After the second table is generated, I need to compute totals as follows:

    select date, sum(value * nvl(factor,1)) as total
    from table1
    left join table2
      on table1.id = table2.id
     and table1.date between table2.start_date and table2.end_date
    group by date

My issue is that this query is slow, taking 20-30 minutes if the second table is particularly large. Is there a generic way to speed it up, perhaps by trading storage space for execution time, ideally getting it to run in under a minute?

I am not a database expert and have been reading the Oracle performance tuning docs, but I have not been able to find anything appropriate for this. The most promising idea I found was OLAP cubes, but I understand these would help only if my second table were fixed and I simply needed to apply different filters to the data.

Upvotes: 3

Views: 5622

Answers (2)

Dave Costa

Reputation: 48121

First, to provide any real insight, you'd need to determine the execution plan that Oracle is producing for the slow query.
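A straightforward way to capture it (this sketch reuses the query from the question, with the date column written trade_date since date is a reserved word in Oracle):

    explain plan for
    select t1.trade_date, sum(t1.value * nvl(t2.factor,1)) as total
    from table1 t1
    left join table2 t2
      on t1.id = t2.id
     and t1.trade_date between t2.start_date and t2.end_date
    group by t1.trade_date;

    select * from table(dbms_xplan.display);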

You say the second table is ~300K rows. Yes, that's small compared to 100M, but since the join between the two tables includes a range condition, it's hard to say how many rows from table1 are likely to be accessed in any given execution of the query. If a large proportion of the table is accessed, but the query optimizer doesn't recognize that, the index may actually be hurting instead of helping.
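If the plan turns out to be nested loops driven by that index over a large fraction of table1, one quick experiment (again just a sketch, with the same renamed date column) is to force a full scan and compare timings:

    select /*+ FULL(t1) */
           t1.trade_date, sum(t1.value * nvl(t2.factor,1)) as total
    from table1 t1
    left join table2 t2
      on t1.id = t2.id
     and t1.trade_date between t2.start_date and t2.end_date
    group by t1.trade_date;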

You might benefit from re-organizing table1 as an index-organized table, since you already have an index that covers most of the columns. But all I can say from the information so far is that it might help, but it might not.
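As a sketch of what that might look like (an index-organized table requires a primary key, so this assumes (id, trade_date) uniquely identifies a row in table1; all names are illustrative):

    -- Sketch only: assumes (id, trade_date) is unique in table1
    create table table1_iot (
        id          number not null,
        trade_date  date   not null,
        value       number,
        constraint table1_iot_pk primary key (id, trade_date)
    )
    organization index;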

Upvotes: 2

Maheswaran Ravisankar

Reputation: 17920

Apart from indexes, also try the suggestions below. My two cents!

  1. Try running the query with the PARALLEL hint so it uses multiple processes: /*+ PARALLEL(table1,4) */ (see the sketch after this list).

  2. NVL is evaluated for millions of rows, which has some impact. Is there any way the data can be organised to avoid the nulls?

  3. When you know the dates in advance, you could split the query into two steps: first fetch the relevant rows from TABLE2 using the start and end dates, then join the result to TABLE1 through a view or a temporary table (see the sketch below). This way the index (with id as the leading column) is used optimally.
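
A rough sketch of points 1 and 3 together (the temporary table name is made up, and the date column is written trade_date because date is reserved in Oracle):

    -- Step 1: materialise the relevant slice of TABLE2
    -- (filter by the known dates here if you have them).
    create global temporary table table2_tmp
    on commit preserve rows
    as
    select id, start_date, end_date, factor
    from table2;

    -- Step 2: join it to TABLE1, letting Oracle scan in parallel
    select /*+ PARALLEL(t1,4) */
           t1.trade_date,
           sum(t1.value * nvl(t2.factor,1)) as total
    from table1 t1
    left join table2_tmp t2
      on t1.id = t2.id
     and t1.trade_date between t2.start_date and t2.end_date
    group by t1.trade_date;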

Thanks!

Upvotes: 1
