sasecse
sasecse

Reputation: 69

Join two tables without unique records and compare two columns

I tried to Join two tables without unique fields and compare it based on the date.

Consider the following

Table A

System ID   Start   Process
132 1/17/2010   10/17/2010
132 2/21/2011   2/23/2011
132 1/16/2010   1/30/2010
132 1/16/2006   1/16/2006
132 1/16/2011   1/24/2010
132 1/30/2011   1/31/2011
132 1/30/2008   2/6/2008
132 2/5/2007    2/8/2007
132 2/5/2009    2/6/2009

Table B

System ID   T start T end
132 05/01/10    4/30/2011
132 05/01/11    4/30/2013

Output Table

System ID   Start       Process     Check
132 1/17/2010       10/17/2010      Y
132 2/21/2011       2/23/2011       Y
132 1/16/2010       1/30/2010       Y
132 1/16/2006       1/16/2006       N
132 1/16/2011       1/24/2010       Y
132 1/30/2011       1/31/2011       Y
132 1/30/2008       2/6/2008        N
132 2/5/2007        2/8/2007        N
132 2/5/2009        2/6/2009        N

I am trying to consider all the records in table A and compare the Table As column “Start” with the Tables Bs “T start” and “T End”, if the Table A’s “Start” appears between Table Bs “T Start” and “ T End” then the Column “Check” in the output table will be updated as “Y”.

I understood that the Tables can be joined only based on the unique fields ,so I tried to add an auto increment field in one of the table to avoid the duplication but the problem is both the tables consist of non-unique records and each record should be considered to identify the date. The total number of records in each table is around 60K. I am bit struck. I appreciate your kind help.

Upvotes: 0

Views: 2615

Answers (2)

Gordon Linoff
Gordon Linoff

Reputation: 1269503

It looks like you are trying to determine where dates in the two tables overlap. The following should do this:

select a.*, (case when b.start is not null then 'Y' else 'N' end) as check
from TableA a left outer join
     TableB b
     on a.start <= b.Tend and
        a.end >= b.Tstart and
        (a.start <> b.Tstart and a.end <> b.Tend) and
        a.SystemId = b.SystemId;

Tables can be joined using any type of relation. When you say on "unique fields" you might be referring to an "equijoin". This is when the join conditions consist of "equals" between various fields. The advantage to equijoins is that they are more efficient to process (under most circumstances).

Upvotes: 1

roman
roman

Reputation: 117345

When you checking for overlapping, you must be careful when joining tables, if, for example, you have 2 range of dates in table B which are overlapping with given range in table A, row from table A will appear in results twice. Hence, I suggest to use exists for checking overlapping:

select
    A.*,
    case
        when exists
        (
             select *
             from TableB as B
             where B.[System ID] = A.[System ID] and B.[T end] >= A.[Start] and B.[T start] <= A.[Process]
        )
        then 'Y'
        else 'N'
    end as [Checked]
from TableA as A

or use apply

select
    A.*, case when B.[System ID] is not null then 'Y' else 'N' end as [Checked]
from TableA as A
    outer apply (
        select top 1 *
        from TableB as B
        where B.[System ID] = A.[System ID] and B.[T end] >= A.[Start] and B.[T start] <= A.[Process]
    ) as B

see sqlfiddle example with your data.

Upvotes: 0

Related Questions