user2820853
user2820853

Reputation: 17

Conditional SQL joins

I have what I imagine is a somewhat basic SQL question and hoping for some help. I'm using MSSQL.

I have two tables :

Table A                          Table B
----------------------------------------------
A.ID |A.Date|                 B.ID|B.Date|B.Number

A.ID is unique; no duplicate rows. B.ID is not unique so joining the tables on that column could result in multiple rows returned. To grab the proper record, its necessary to join on both columns but this also has issues of undesired results. For example:

Table A                |         Table B
----------------------------------------------
1     |01-01-2014      |     1    |01-01-2014|25
2     |01-30-2014      |     1    |04-30-2013|40
3     |03-24-2014      |     1    |10-12-2018|12
4     |05-28-2014      |     2    |11-17-2013|55

The desired outcome is to join on both columns and return a single row. The problem I'm running into is if I join on both columns being equal using the sample data, only the first row will return. If i set the ID columns to equal and use a <=, multiple rows will return for ID 1 which is undesired. I can't use a max date because some fields contain future dates which shouldn't return records until the two date fields are equal. In the sample data, desired results would return number 25 and 55. So essentially I need to join on the ID columns being equal and the date field being equal but if there is no equal date in table B, return the number of the latest date, assuming it is not a future date.

Hopefully that makes sense. Thanks in advance.

Upvotes: 0

Views: 108

Answers (4)

ngreen
ngreen

Reputation: 1759

I don't have a SQL Server instance available to me at the moment, so I'm going to post syntax that worked for PostgreSQL; it should translate fairly well. My goal was to derive the primary key for Table B and then select the row using the primary key. It works with the sample data you provided. I don't know what rule you have to determine whether a date is too far into the future, so I put in a hard-coded date which can be swapped for a bind variable easily enough.

with KB as (
  select id as id, max(dt) as dt
  from (
    select B.id, B.dt, B.num
    from B
      inner join A on A.id = B.id and A.dt = B.dt
    union
    select B.id, max(B.dt), B.num from B
      inner join A on A.id = B.id and A.dt != B.dt
    where B.dt < convert(datetime, '2014.10.313', 102)
    group by B.id, B.num
  ) subquery
  group by id
)
select B.id, B.dt, B.num
from B
  inner join KB on KB.id = B.id and X.dt = B.dt;

How it works:

The first inner query just blindly grabs exact matches. They're needed, so why not? The second inner query grabs all the inexact matches (ID is equal but date is not). Once the set of equal and the set of not equal matches are combined, the outer query picks the newest date (max) for each ID. At that point, every row necessarily has a single date for each ID, so the only thing left to do is get the rows that match exactly.

I have no idea what the explain plan looks like for this. It shouldn't be bad, but I could be completely wrong.

Upvotes: 0

Brandon Spilove
Brandon Spilove

Reputation: 1569

Yeah this is a little tricky but this should do it:

with equalvalues as (
select b.* from a
inner join b on a.ID=b.ID and a.adate=b.bdate
),
latestdates as (
select b.id, MAX(b.bdate) as maxbdate
from b
where b.bdate<=GETDATE()
group by b.ID
)
select number from equalvalues
union all
select b.number from b
inner join latestdates d on d.ID=b.ID and d.maxbdate=b.bdate
where b.ID not in (select ID from equalvalues)

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269443

I would recommend using a correlated subquery or the apply operator. Here is the method:

select a.*, b.*
from tablea a outer apply
     (select top 1 b.*
      from tableb b
      where b.id = a.id and
            b.date <= a.date
      order by b.date desc
     ) b;

Upvotes: 0

Felix Pamittan
Felix Pamittan

Reputation: 31879

Try this:

-- build sample data
create table #TableA(
    ID int,
    [Date] smalldatetime
)
create table #TableB(
    ID int,
    [Date] smalldatetime,
    number int
)

insert into #TableA
select 1, '1/1/2014' union all
select 2, '1/30/2014' union all
select 3, '3/24/2014' union all
select 4, '5/28/2014'

insert into #TableB
select 1, '1/1/2014', 25 union all
select 1, '4/30/2013', 40 union all
select 1, '10/12/2018', 12 union all
select 2, '11/17/2013', 55

-- start
;with cte(ID, [Date], Number, rn) as(
    select
        a.id,
        b.date,
        number,
        row_number() over(
            partition by a.id 
            order by
                case
                    when a.date = b.date then dateadd(d, 1, getdate())
                    else b.date
                end
            desc                
        )
    from #TableA a
    inner join #TableB b
        on b.id = a.id
    where
        a.date <= getdate()
        and b.date <= getdate()
)
select 
    id,
    date,
    number
from cte where rn = 1
-- end

-- drop temp tables
drop table #TableA
drop table #TableB

Upvotes: 0

Related Questions