Phteven
Phteven

Reputation: 139

Filter based on results where ROW_NUMBER goes higher than 1

I've managed to get some ordered results using ROW_NUMBER() and partitioned by the Country code and client ref, ordered by the date of the quote. The ROW_NUMBER() column is called PRank.

CountryCode  ¦   ClientRef   ¦   Quote Ref   ¦   Name     ¦    Address     ¦   PostCode   ¦   PreferencesCode   ¦   QuoteDate         ¦   PolicyType   ¦   PRank   ¦
     0           AABB001         AABB001MB1    Mr A Smith   1 Country Lane     WA21 2PU       5934611             2017-03-18 00:00:00     MB               1
     0           AABB001         AABBMM1MB2    Mr A Smith   1 Country Lane     WA21 2PU       5934611             2017-03-18 00:00:00     MB               2
     0           AABB001         AABBMM1MB3    Mr A Smith   1 Country Lane     WA21 2PU       5934611             2017-03-18 00:00:00     MB               3
     0           BBGG003         BBGG003MB1    Mrs B Jones  2 City Road        M1 3XY         312191              2017-09-02 00:00:00     MB               1

This is the code for my ROW_NUMBER() column:

ROW_NUMBER() OVER(PARTITION BY CountryCode, ClientRef, QuoteDate 
                  ORDER BY QuoteDate DESC, PolicyRef asc)

There are a lot of results like the last row; they only count to 1 because the customer just had one quote on the date in question. I want to eliminate those rows without removing any rows from the customer results where multiple quotes occured on the same date (i.e. I still want to see all three rows of the first customer in the results above) so I figured by using a CTE and filtering on the ROW_NUMBER would cause me to lose data from the results that I still need to see.

I was trying to do something with HAVING and summing the PRank column but I couldn't figure out the grouping as I don't want to actually aggregate the data, just get rid of any rows where a ClientRef only has one QuoteRef per day.

Upvotes: 0

Views: 1324

Answers (2)

EzLo
EzLo

Reputation: 14189

I think you want something like the following.

;WITH Rankings AS
(
    -- Your query here
    SELECT
        CountryCode = 1,
        ClientRef = 1,
        QuoteDate = 1,
        PRank = 1
),
Only1Record AS
(
    SELECT
        R.CountryCode, 
        R.ClientRef, 
        R.QuoteDate
    FROM
        Rankings AS R
    GROUP BY
        R.CountryCode, 
        R.ClientRef, 
        R.QuoteDate
    HAVING
        MAX(R.PRank) = 1
)
DELETE D FROM
    Only1Record AS R
    INNER JOIN Rankings AS D ON
        R.CountryCode = D.CountryCode AND
        R.ClientRef = D.ClientRef AND
        R.QuoteDate = D.QuoteDate

Another more simple variation is using a windowed COUNT().

;WITH Rankings AS
(
    -- Your query here
    SELECT
        CountryCode = 1,
        ClientRef = 1,
        QuoteDate = 1,
        PRank = 1,
        Total = COUNT(1) OVER (PARTITION BY CountryCode, ClientRef, QuoteDate)
)
DELETE D FROM
    Rankings AS D
WHERE
    D.Total = 1

Upvotes: 1

Thom A
Thom A

Reputation: 95588

If you're looking for customers with more than 1 quote in a day the a CTE would still be the best way to go.

Assuming this is against the ic_dapolicy view in InfocentrePlus (this is obviously OpenGI data, and the icp tables aren't as easy to use) then you're query would be something like:

WITH Quotes AS(
    SELECT *,
           COUNT(*) OVER (PARTITION BY PPY.B@, PPY.Ref@) AS ClientQuotes 
    FROM ic_Dapolicy PPY
    WHERE PPY.Quote_date = @Quote_date) --If you aren't filtering on Quote_date in your where, add it to the PARTITION BY clause
SELECT [YourColumns]
FROM Quotes
WHERE ClientQuotes > 1;

Upvotes: 2

Related Questions