Lonzak
Lonzak

Reputation: 9816

sql query optimization (oracle)

It has been a while since my sql days thus I am wondering whether it is possible to further optimize the following query? The goal is to collect all accounts for each accountant including all the bookings and movements associated with it. Performance/query-time is very important since there are '3 digit'-million datasets...

select Accountant.person_id, 
    Account.account_id, 
    Account.number, 
    Account.balance,
    Account_Type.type_number, 
    Booking.booking_id, 
    Booking.amount, 
    Movement.movement_date, 
    Movement.movement_desc
from Accountant
join Account on Accountant.person_id = Account.person_id
join Account_Type on Account.account_type_id = Account_Type.account_type_id
left outer join Booking on Account.account_id = Booking.account_id
left outer join Movement on Booking.movement_id = Movement.movement_id;

The entity model looks something like that: enter image description here

UPDATE: Since some of you are wondering: Yes I am simply selecting hundreds of million of rows since the query is used to migrate data. The data queried is used to construct a new data structure which is put in another database...

  1. To allow returning accounts with no Bookings I added the outer joins. Is that the right syntax?
  2. Here is the explain plan - seeing any optimization possibilities? enter image description here
  3. After adding some missing indices the query takes (in an external tool) about 1/2 hour. In java a memory error is thrown at some point. Any hints (except increasing memory) how to optimize that?

Upvotes: 0

Views: 820

Answers (2)

Thorsten Kettner
Thorsten Kettner

Reputation: 94914

As others have already mentioned: It is strange, you want to select hundreds of millions of records in one go.

Aside from that:

  1. The left outer join on table Booking will only work, if you also outer join table Movement.
  2. As you want all records, only full table scans (or fast full index scans for that matter) make sense. Check the explain plan if this is the case. (It should.) Otherwise use /*+full(tablename)*/.
  3. As you will use full table scans you may want to have them run in parallel. Check the explain plan if this is already the case. Otherwise use /*+parallel(tablename,factor)*/.
  4. In case the tables have many columns, it might be good to have indexes containing the desired columns, so fast full index scans instead of full table scans can be used and less disc blocks need to be read thus.
  5. You can reduce disc reads by compressing tables (Oracle 11g and up).

Upvotes: 3

Ryan
Ryan

Reputation: 3982

The join syntax is clearer and recommended. I don't think you can optimise the query itself any further, but you could check the DB schema, or investigate alternatives.

DB Schema:

  • are all IDs marked primary (i.e.indexed and clustered)?
  • can you check the query plans to ensure the query is being executed optimally?

You could also consider a reporting table that is built at the end of every day (or more frequently if required) - and query from that. Oracle supports materialised views that might help.

Finally, a query that returns n,000,000 records really isn't much use to anyone, so you might reconsider your logic. If you're presenting these records to a user, you need to page them (so limit the query to a page size) - or if you're presenting summary info, rewrite the query to use aggregate functions (sum/max/avg etc.) Don't use code to do what a DB does best.

Upvotes: 2

Related Questions