fosbie
fosbie

Reputation: 1014

Select only the rows where a value changes

I'm using SQL Server 2008 R2 and I'm struggling to write a query that returns all the rows where a column changes.

In the table below I want to go through all the records in date order and only select the rows where the amount is different from the previous one for that customer.

CustomerId InvoiceId DateInvoice Amount
209 9725772 2020-12-10 9.50
209 9725773 2021-01-15 1.50
209 9725774 2021-01-17 2.50
209 9725775 2021-01-19 3.50
209 9725776 2021-01-21 3.50 *
209 9725777 2021-01-23 9.50
209 9725778 2021-01-25 9.50 *
209 9725779 2021-01-25 3.50
210 9726132 2021-02-02 3.50
210 9726133 2021-03-02 9.50
210 9726134 2021-04-02 9.50 *

I've added an asterisk to the Amount column for the rows that I don't want to return.

Any suggestions would be gratefully received.

Upvotes: 0

Views: 3352

Answers (3)

Thom A
Thom A

Reputation: 95554

You could use a CTE with ROW_NUMBER and LEFT JOIN to the prior row:

WITH CTE AS(
    SELECT CustomerId,
           InvoiceId,
           DateInvoice,
           Amount,
           ROW_NUMBER () OVER (PARTITION BY CustomerID ORDER BY DateInvoice ASC) AS RN
    FROM dbo.YourTable)
SELECT C1.CustomerId,
       C1.InvoiceId,
       C1.DateInvoice,
       C1.Amount
FROM CTE C1
     LEFT JOIN CTE C2 ON C1.CustomerId = C2.CustomerId
                     AND C1.Amount = C2.Amount
                     AND C1.RN = C2.RN + 1
WHERE C2.CustomerId IS NULL;

Using forpas' sample data: db<>fiddle

But LAG/LEAD would be way easier.

Upvotes: 2

fosbie
fosbie

Reputation: 1014

I've just found a way of doing it, but this looks horrible to me, there must be a more readable way of doing this.

SELECT t.CustomerId,
t.InvoiceId,
t.DateInvoice,
t.Amount,
(SELECT TOP 1 Amount 
 FROM #test t1 
 WHERE t1.CustomerId=t.CustomerId AND t1.DateInvoice<t.DateInvoice 
 ORDER BY DateInvoice DESC) AS PrevAmount
 FROM #test AS t
 WHERE ((SELECT TOP 1 Amount 
    FROM #test t1
    WHERE t1.CustomerId=t.CustomerId AND t1.DateInvoice<t.DateInvoice 
    ORDER BY DateInvoice DESC)) <> Amount 
 Or ((SELECT TOP 1 Amount 
 FROM #test t1 
 WHERE t1.CustomerId=t.CustomerId AND t1.DateInvoice<t.DateInvoice 
 ORDER BY DateInvoice DESC)) Is Null

Upvotes: 0

forpas
forpas

Reputation: 164069

You could use LAG() window function in a later than your version of SQL Server, but without it you can use a correlated subquery in the WHERE clause:

SELECT t1.*
FROM tablename t1
WHERE t1.Amount <> COALESCE(
  (
    SELECT TOP 1 t2.Amount
    FROM tablename t2
    WHERE t2.CustomerId = t1.CustomerId AND t2.DateInvoice < t1.DateInvoice
    ORDER BY t2.DateInvoice DESC
  ), -1)

See the demo.
Results:

CustomerId InvoiceId DateInvoice Amount
209 9725772 2020-12-10 9.50
209 9725773 2021-01-15 1.50
209 9725774 2021-01-17 2.50
209 9725775 2021-01-19 3.50
209 9725777 2021-01-23 9.50
209 9725779 2021-01-25 3.50
210 9726132 2021-02-02 3.50
210 9726133 2021-03-02 9.50

Upvotes: 2

Related Questions