user3014914
user3014914

Reputation: 157

SQL how to join only first matching row

I don't understand how to join only the first matching row. If I join by a combination of person_id, and dates, the query returns multiple values. In this case, if join returns multiple values, I want only one row from this join to be displayed. Expected result - combination of assignment_id, effective_start_date, effective_end_date is unique.

The thing that bothers me the most is that I don't have a unique identifier, just dates. If two date ranges from expatriates table fit in assignments effective_start_date, effective_end_date range I only need to display one row.

Please provide me with Oracle syntax (like my code sample).

with assignments (assignment_id, person_id, effective_start_date, effective_end_date) as (
   select 456, 123, date '2015-01-01', date '2015-03-15' from dual union all  
   select 456, 123, date '2015-03-16', date '4712-12-31' from dual union all   
   select  975, 123, date '2015-03-16', date '4712-12-31' from dual            
 ),
expatriates (person_id, home_country, host_country, date_from, date_to, valid_from, valid_to)
as
(
   select 123, 'TEST2', 'TEST2',    date '2015-01-01', date '2015-03-15', date '2015-01-01', date   '2015-03-15' from dual union all
   select 123, 'TEST1', 'TEST1',    date '2015-04-16', date '2016-06-15', date '2015-04-16', date   '2016-06-15' from dual union all
   select 123, 'TEST',  'TEST',   date '2015-03-16', date '2016-04-15', date '2015-03-16', date '2015-04-15' from dual
)

select 
    a.assignment_id,
    a.person_id, 
    a.effective_start_date, 
    a.effective_end_date,
    subq.home_country, 
    subq.host_country, 
    subq.date_from, 
    subq.date_to
from assignments a
     , expatriates subq 
where 
    a.person_id=subq.person_id
    and subq.valid_from <= a.effective_end_date
   and subq.valid_to >= a.effective_start_date

Upvotes: 2

Views: 3078

Answers (2)

Stepan Novikov
Stepan Novikov

Reputation: 1396

I'm not sure about the criteria of filtering records from expatriates table you need, but as an example, we can filter by date_from field:

with assignments (assignment_id, person_id, effective_start_date, effective_end_date) as (
   select 456, 123, date '2015-01-01', date '2015-03-15' from dual union all  
   select 456, 123, date '2015-03-16', date '4712-12-31' from dual union all   
   select  975, 123, date '2015-03-16', date '4712-12-31' from dual            
 ),


expatriates (person_id, home_country, host_country, date_from, date_to, valid_from, valid_to)
as
(
   select 123, 'TEST2', 'TEST2',    date '2015-01-01', date '2015-03-15', date '2015-01-01', date   '2015-03-15' from dual union all
   select 123, 'TEST1', 'TEST1',    date '2015-04-16', date '2016-06-15', date '2015-04-16', date   '2016-06-15' from dual union all
   select 123, 'TEST',  'TEST',   date '2015-03-16', date '2016-04-15', date '2015-03-16', date '2015-04-15' from dual

)

select 
a.assignment_id,
a.person_id, 
a.effective_start_date, 
a.effective_end_date,
subq.home_country, 
subq.host_country, 
subq.date_from, 
subq.date_to
from assignments a, expatriates subq 
where 
a.person_id=subq.person_id
and subq.valid_from <= a.effective_end_date
and subq.valid_to >= a.effective_start_date
and subq.date_from = 
(
  select 
    max(date_from) 
  from expatriates sq2 
    where 
      sq2.person_id = a.person_id and 
      sq2.valid_from <= a.effective_end_date and 
      sq2.valid_to >= a.effective_start_date 
)

SqlFiddle

Upvotes: 0

APC
APC

Reputation: 146239

This is a classic use case for an Analytic ranking function in a sub-query which can be filtered to return the desired sub-set. In this case I have used ROW_NUMBER(), because you haven't provided any criteria for discarding rows, so presumably it does matter which actual row is the "first" row.

select assignment_id,
        person_id, 
        effective_start_date, 
        effective_end_date,
        date_from, 
        date_to,
        home_country, 
        host_country
from (
    select 
        a.assignment_id,
        a.person_id, 
        a.effective_start_date, 
        a.effective_end_date,
        e.date_from, 
        e.date_to,
        e.home_country, 
        e.host_country,
        row_number() over (partition by a.assignment_id, a.effective_start_date 
                            order by e.date_from) rn
    from assignments a
         join expatriates e
    on  (a.person_id=e.person_id )
    where e.valid_from <= a.effective_end_date
        and e.valid_to >= a.effective_start_date
)
where rn = 1
order by 1, 3, 2
/

This returns the following rows from your sample data:

 28* order by 1, 3, 2

ASSIGNMENT_ID  PERSON_ID EFFECTIVE EFFECTIVE DATE_FROM DATE_TO   HOME_ HOST_
------------- ---------- --------- --------- --------- --------- ----- -----
          456        123 01-JAN-15 15-MAR-15 01-JAN-15 15-MAR-15 TEST2 TEST2
          456        123 16-MAR-15 31-DEC-12 16-MAR-15 15-APR-16 TEST  TEST
          975        123 16-MAR-15 31-DEC-12 16-MAR-15 15-APR-16 TEST  TEST

3 rows selected.

SQL> 

Oracle has loads of neat analytic functions. Find out more.

Upvotes: 2

Related Questions