Vlada
Vlada

Reputation: 667

How to optimize this query?

I have tables: A (ID_A, VALID_FROM, DATA ...) B (ID_B, ID, T1, T2, T3, DATE)

Table A can contain historical data (eg. data valid for given period) I need to select records from table B joined with appropritate records from table A (from table A I need row where b.id = a.id_a and record was valid at b.date)

select * 
from B, (select * from (select * from A where a.id_a = b.id and a.valid_from <= b.date order by valid_from desc) where rownum = 1)
where b.id = a.id_a

Upvotes: 1

Views: 158

Answers (3)

Mike Meyers
Mike Meyers

Reputation: 2895

I've taken StevieG's answer and expanded on it. Without a valid_to column there are tricky subqueries to write. I would propose using the LEAD analytic function to find the end of the current validity period and work with that. This is an alternative to the subqueries and the valid_to column.

The LEAD analytic function looks over the rows in the current data set and finds the next valid_from date and uses that as the end of the current period.

My query is shown below. It incorporates the sample data you provided, in a with clause.

with table_a as (
  select 1 as id, 'XXX1' as data, date '2009-01-01' as valid_from from dual union all
  select 1 as id, 'XXX2' as data, date '2009-05-30' as valid_from from dual union all
  select 1 as id, 'XXX3' as data, date '2010-01-11' as valid_from from dual union all
  select 2 as id, 'YYY' as data, date '1999-01-01' as valid_from from dual
),
table_b as (
  select 1 as id, 1 as id_a, date '2009-02-01' as date_col from dual union all
  select 2 as id, 2 as id_a, date '2009-09-12' as date_col from dual union all
  select 3 as id, 1 as id_a, date '2009-06-30' as date_col from dual
)
select *
from table_b b
  join (
    select 
      id, 
      valid_from, 
      lead(valid_from, 1, date '9999-12-31') over (partition by a.id order by a.valid_from) as valid_to
    from table_a a
) a on (a.id = b.id_a)
where 
  a.valid_from <= b.date_col and 
  b.date_col < a.valid_to

Upvotes: 0

StevieG
StevieG

Reputation: 8709

This isn't much more optimal, but is probably more readable:

select *  
from A a, B b
Where 
a.id_a = b.id 
and a.valid_from = (select max(valid_from) 
                    from A 
                    where id_a = b.id 
                    and valid_from <= b.date)
order by valid_from desc

I've seen this problem before, and the best way I know of to optimise it is to put a valid_to column onto table A.

For the latest record, this should contain the biggest date Oracle can handle.
Whenever you create a newer version of the record, update it with the time the new record is created (minus a millisecond to avoid overlaps) so you have something like this:

ID Valid_from                Valid_to
1  01/01/2011 12.34.56.0000  02/01/2011 12.34.56.0000
1  02/01/2011 12.34.56.0001  03/01/2011 12.34.56.0000
1  03/01/2011 12.34.56.0001  31/12/9999 23.59.59.9999

Then you can query it like this:

select *  
from A a, B b
Where 
a.id_a = b.id 
and b.date between a.valid_from and a.valid_to
order by valid_from desc

With an index on the date columns, the performance should be ok..

Upvotes: 1

Andy Hunt
Andy Hunt

Reputation: 1073

Sounds like you're looking for a JOIN: http://download.oracle.com/docs/cd/B19306_01/server.102/b14200/queries006.htm

Upvotes: 2

Related Questions