Petter Brodin
Petter Brodin

Reputation: 2189

How to get first and last related event without having to do extra subqueries?

I have two tables: Customers and CustomerEvents. Each customer can have 0 or more events.

I'm working on a report that's supposed to look something like this:

CustomerName     | FirstEventDate | FirstEventMessage   | LastEventDate | LastEventText
---------------------------------------------------------------------------------------
Customers 'R' US | 2018-01-01     | Customer registered | 2018-04-06    | Customer Quit

The actual query is going to be a lot larger than that, but this is the general gist of it.

If I only needed the dates, I could easily do it using aggregates:

SELECT 
    c.Name AS ContactName
    , MAX(e.DateTime) AS FirstEventDate
    , '???' AS FirstEventMessage
    , MIN(e.DateTime) AS FirstEventDate
    , '???' AS FirstEventMessage
FROM Contacts c
    LEFT JOIN CustomerEvents e ON e.ContactId = c.Id
GROUP BY c.Name

Unfortunately, that leaves out the corresponding messages.

I've managed to solve it using a subquery for each message, but that gets super expensive, so I was wondering if there's another approach you'd suggest. The full query is more complex than the example will have joins on different kinds of events, each displaying the date and message of the first and last event.

Upvotes: 1

Views: 182

Answers (3)

Gottfried Lesigang
Gottfried Lesigang

Reputation: 67281

As you asked me, to create an answer out of my comment, here you are:

With version 2012 Microsoft introduced some new windowing functions. Usefull for you are FIRST_VALUE() and LAST_VALUE(). Both need an OVER() clause to specify the sort order and - if needed - a partitioning rule.

This should work (but I do not know your tables and data):

SELECT 
    c.Name AS ContactName
    , MIN(e.DateTime) AS FirstEventDate
    , FIRST_VALUE(e.EventMessage) OVER(ORDER BY e.DateTime) AS FirstEventMessage
    , MAX(e.DateTime) AS LastEventDate
    , LAST_VALUE(e.EventMessage) OVER(ORDER BY e.DateTime) AS LastventMessage
FROM Contacts c
    LEFT JOIN CustomerEvents e ON e.ContactId = c.Id
GROUP BY c.Id,c.Name;

But be warned: If your e.DateTime is not unique (per Contact), you will get a random "first" value...

Hints

  • Make sure to have indexes on DateTime and ContactId and
  • add the c.Id to your GROUP BY

An alternative was to replace the LEFT JOIN CustomerEvents with a row-wise executed correlated sub-query. This has the advantage, that you can be sure, that both (and more) values are taken from the same row.

OUTER APPLY (SELECT TOP 1 e.[DateTime],e.[EventMessage]  
             FROM CustomerEvents AS e
             WHERE e.ContactId=c.Id
             ORDER BY e.EventMessage ASC) AS FirstEvent
--same with DESC
OUTER APPLY (SELECT TOP 1 e.[DateTime],e.[EventMessage]  
             FROM CustomerEvents AS e
             WHERE e.ContactId=c.Id
             ORDER BY e.EventMessage DESC) AS LastEvent

Then use the columns in your query like

 FirstEvent.DateTime AS FirstDateTime
,FirstEvent.EventMessage AS FirstMessage
,LastEvent.DateTime AS LastDateTime
,LastEvent.EventMessage AS LastMessage

Upvotes: 2

Ayyappa.k
Ayyappa.k

Reputation: 1

Try Below Approach

SELECT 
    c.Name AS CustomerName
    , MAX(e.DateTime) AS LastEventDate
    , B.Message AS FirstEventMessage
    , MIN(e.DateTime) AS FirstEventDate
    , D.Message AS FirstEventMessage
FROM #Customers c
    LEFT JOIN #CustomerEvents e ON e.CustomerId = c.Id
    LEFT JOIN 
    (
    SELECT A.CustomerId, A.Message
    FROM
        (
        SELECT 
            CustomerId, Message, Row_Number()over(Partition By CustomerId order by DATETIME DESC) as No
        FROM
            #CustomerEvents 
        ) A
    WHERE
        A.No = 1
    ) B ON B.CustomerId = C.Id
    LEFT JOIN 
    (
    SELECT A.CustomerId, A.Message
    FROM
        (
        SELECT 
            CustomerId, Message, Row_Number()over(Partition By CustomerId order by DATETIME) as No
        FROM
            #CustomerEvents 
        ) A
    WHERE
        A.No = 1
    ) d ON d.CustomerId = C.Id
GROUP BY 
    c.Name, B.Message, D.Message

Upvotes: 0

Jatin Patel
Jatin Patel

Reputation: 2104

Try this CTE approach, adjust the columns to SELECT clause according to you need.

;WITH CTE
AS(
    SELECT 
        *, 
        ROW_NUMBER() OVER(PARTITION BY c.Name ORDER BY e.DateTime) AS RowAsc,
        ROW_NUMBER() OVER(PARTITION BY c.Name ORDER BY e.DateTime DESC) AS RowDesc
    FROM Contacts c
        LEFT JOIN CustomerEvents e ON e.ContactId = c.Id    
)
SELECT 
    c1.*
    , c2.*
FROM CTE c1
INNER JOIN CTE c2 ON c1.Name = c2.Name
    AND c1.RowAsc = 1 
    AND c2.RowDesc = 1

Upvotes: 0

Related Questions