Vereonix
Vereonix

Reputation: 1424

JOIN on row by which row in the join query has the MAX/MIN value of all rows

If the query I want to join on returns:

IDApplication  ContactDate  CInfo
1              01/06/2016   pie
1              10/01/2016   cake
1              03/02/2015   banana
2              03/06/2016   cake
2              23/12/2015   apple



IDApplication  ReplyDate    RInfo
1              30/05/2016   circle
1              03/05/2016   square
1              04/02/2015   triangle
1              14/01/2016   pentagon
2              04/06/2016   square
2              01/02/2016   pentagon
2              10/06/2016   circle

I need this to be returned:

IDApplication  ContactDate  CInfo       ReplyDate    RInfo    
1              01/06/2016   pie         NULL         NULL
1              10/01/2016   cake        30/05/2016   circle
1              03/02/2015   banana      04/02/2015   triangle
2              03/06/2016   cake        10/06/2016   square
2              23/12/2015   apple       01/02/2016   pentagon

I need it to return the second table/query info where its date is larger than any corresponding application date in the first table, but which isn't larger than any following date in the first table.

So for the first record above, it is NULL because there is no reply info with a date after 01/06/2016 in the reply table (so no reply has been made), but the second record has the reply date of 30/05/2016 as that is the largest reply date for that application. More importantly for the 5th record the reply date is 01/02/2016, this date is larger than the contact date but not THE largest reply date for application 2 which is 10/06/2016, but as another contact date for application 2 is in between these 2 it needs to show the largest date before the following contact date.

This is logic which is making my brain hurt.

I've gotten as far as joining the second query on where reply date is larger than the contact date, but that causes it to make rows appear for all dates larger.

I need to join on the record with a date larger than the base tables date but of these the largest where it isn't larger than the next largest base table date.

Upvotes: 3

Views: 117

Answers (4)

Joe Farrell
Joe Farrell

Reputation: 3542

My approach to problems like this is typically to break them into small steps, each of which I can implement as a CTE, so I get something that's pretty easy to read through and understand. I can always try to collapse it into fewer steps later if need be. Here's a possible solution. See the comments for an explanation of how it works.

--------------------------------------------------------------------------------
-- Set up the sample data from the question.
--------------------------------------------------------------------------------
declare @Contact table (IDApplication int, ContactDate date, CInfo varchar(32));
declare @Reply table (IDApplication int, ReplyDate date, RInfo varchar(32));

insert @Contact values
    (1, '2016-06-01',' pie'),
    (1, '2016-01-10', 'cake'),
    (1, '2015-02-03', 'banana'),
    (2, '2016-06-03', 'cake'),
    (2, '2015-12-23', 'apple');
insert @Reply values
    (1, '2016-05-30', 'circle'),
    (1, '2016-05-03', 'square'),
    (1, '2015-02-04', 'triangle'),
    (1, '2016-01-14', 'pentagon'),
    (2, '2016-06-04', 'square'),
    (2, '2016-02-01', 'pentagon'),
    (2, '2016-06-10', 'circle');

--------------------------------------------------------------------------------
-- Step 1: Sequence each group of contacts by contact date.
--------------------------------------------------------------------------------
with OrderedContactCTE as
(
    select
        *,
        [Sequence] = row_number() over (partition by IDApplication order by ContactDate)
    from
        @Contact
),

--------------------------------------------------------------------------------
-- Step 2: Match each contact with the subsequent contact (where one exists)
--         having the same IDApplication value. The date of the subsequent
--         contact will act as the upper bound on reply dates that are valid for
--         the original contact. Assign each contact a unique identifier that
--         we'll use in the following step.
--------------------------------------------------------------------------------
PairedContactCTE as
(
    select
        UniqueID = row_number() over (order by Contact.IDApplication, Contact.[Sequence]),
        Contact.IDApplication,
        Contact.ContactDate,
        Contact.CInfo,
        NextContactDate = NextContact.ContactDate
    from
        OrderedContactCTE Contact
        left join OrderedContactCTE NextContact on
            Contact.IDApplication = NextContact.IDApplication and
            Contact.[Sequence] = NextContact.[Sequence] - 1
),

--------------------------------------------------------------------------------
-- Step 3: Match every contact with all replies that are strictly after the 
--         original contact date and, where applicable, strictly before the 
--         subsequent contact date. For each unique contact, sequence the 
--         replies in reverse order by reply date.
--------------------------------------------------------------------------------
OrderedResponseCTE as
(
    select
        Contact.*,
        Reply.ReplyDate,
        Reply.RInfo,
        [Sequence] = row_number() over (partition by Contact.UniqueID order by Reply.ReplyDate desc)
    from
        PairedContactCTE Contact
        left join @Reply Reply on
            Contact.IDApplication = Reply.IDApplication and
            Contact.ContactDate < Reply.ReplyDate and
            (
                Contact.NextContactDate is null or
                Contact.NextContactDate > Reply.ReplyDate
            )
)

--------------------------------------------------------------------------------
-- Step 4: Finally, select each contact and the date/info of the latest reply
--         which is an eligible match for that contact.
--------------------------------------------------------------------------------
select
    IDApplication,
    ContactDate,
    CInfo,
    ReplyDate,
    RInfo
from 
    OrderedResponseCTE 
where 
    [Sequence] = 1;

Upvotes: 2

Peter M.
Peter M.

Reputation: 713

I don't have a SQL Server instance to test. Let me know how close this gets (extension of Tim's solution)

SELECT c1.IDApplication, c1.ContactDate, c1.CInfo, r1.ReplyDate, r1.RInfo
FROM contact_table c1
LEFT JOIN
reply_table r1
    ON c1.IDApplication = r1.IDApplication AND
       r1.ReplyDate > c1.ContactDate AND
       r1.ReplyDate < ( SELECT isnull(MIN(c2.ContactDate),'31-DEC-9999')
                       FROM contact_table c2
                       WHERE c2.ContactDate > c1.ContactDate AND
                             c2.IDApplication = c1.IDApplication ) AND
       NOT EXISTS ( SELECT null
                    FROM reply_table r2
                    WHERE r2.IDApplication = r1.IDApplication AND
                          r2.ReplyDate > r1.ReplyDate AND
                          r2.ReplyDate < ( SELECT isnull(MIN(c2.ContactDate),'31-DEC-9999')
                                           FROM contact_table c2
                                           WHERE c2.ContactDate > c1.ContactDate AND
                                                 c2.IDApplication = c1.IDApplication ) )

Upvotes: 1

Andrey Nadezhdin
Andrey Nadezhdin

Reputation: 126

If you record rely on some data from record above its better to implement some stored procedure logic. Something like this:

/*
CREATE TABLE App (Id INT, ContractDate DATETIME, CInfo VARCHAR(100))

CREATE TABLE Reply (Id INT, ReplyDate DATETIME, RInfo VARCHAR(100))

INSERT App SELECT 1, '06/01/2016',' pie'
INSERT App SELECT 1, '01/10/2016', 'cake'
INSERT App SELECT 1, '02/03/2015', 'banana'
INSERT App SELECT 2, '06/03/2016', 'cake'
INSERT App SELECT 2, '12/23/2015', 'apple'


INSERT Reply SELECT 1, '05/30/2016', 'circle'
INSERT Reply SELECT 1, '05/03/2016', 'square'
INSERT Reply SELECT 1, '02/04/2015', 'triangle'
INSERT Reply SELECT 1, '01/14/2016', 'pentagon'
INSERT Reply SELECT 2, '06/04/2016', 'square'
INSERT Reply SELECT 2, '02/01/2016', 'pentagon'
INSERT Reply SELECT 2, '06/10/2016', 'circle'
*/
--SELECT * FROM App
DECLARE @AppReply TABLE (Id INT, ContractDate DATETIME, CInfo VARCHAR(100), ReplyDate DATETIME, RInfo VARCHAR(100))

DECLARE
    @Id INT,
    @PrevId INT,
    @ContractDate DATETIME,
    @PrevContractDate DATETIME,
    @CInfo VARCHAR(100)

DECLARE appcursor CURSOR FAST_FORWARD FOR 
SELECT Id, ContractDate, CInfo FROM App  

OPEN appcursor  

FETCH NEXT FROM appcursor   
INTO @Id, @ContractDate, @CInfo

WHILE @@FETCH_STATUS = 0  
BEGIN  

    IF(@Id != @PrevId)
        SET @PrevContractDate = NULL

    INSERT @AppReply (Id, ContractDate, CInfo)
    SELECT TOP 1 @Id, @ContractDate, @CInfo

    UPDATE @AppReply SET ReplyDate = R.ReplyDate, RInfo = R.RInfo
    FROM @AppReply AR
    LEFT JOIN Reply R ON R.Id = AR.Id
        AND R.ReplyDate > AR.ContractDate AND R.ReplyDate < ISNULL(@PrevContractDate, DATEADD(DD, 1, R.ReplyDate))
    WHERE AR.Id = @Id AND AR.CInfo = @CInfo

    SET @PrevContractDate = @ContractDate
    SET @PrevId = @Id

    FETCH NEXT FROM appcursor   
    INTO @Id, @ContractDate, @CInfo 
END   
CLOSE appcursor;  
DEALLOCATE appcursor;  

SELECT * FROM @AppReply

Hope this helps. P.S.: this query was written fast. Sorry for poor logic in it.

Upvotes: 0

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522762

After about 15 minutes of psychological torment I was able to churn out this query. The only caviat is that I am not sure that the join condition will only match a single record in each case. I suspect there is another join condition which you did not explicitly mention.

SELECT t1.IDApplication, t1.ContactDate, t1.CInfo,
    t2.ReplyDate, t2.RInfo
FROM table1 t1
LEFT JOIN
table2 t2
    ON t1.IDApplication = t2.IDApplication AND
       t2.ReplyDate > t1.ContactDate AND
       t2.ReplyDate < (SELECT MIN(t.ContactDate)
                       FROM table1 t
                       WHERE t.ContactDate > t1.ContactDate AND
                             t.IDApplication = t1.IDApplication)

Upvotes: 1

Related Questions