Reputation: 23

SQL Server Select query to exclude similar rows based on the length of one of the columns

I am working on a table that has data similar to:

key1, key2, col1, col2, col3
1,    2,    1,    10,   'bla'
2,    2,    1,    10,   'bla2'
2,    1,    2,    10,   'bla'

keys 1 and 2 are a composite key.

I would like a query that returns:

key1, key2, col1, col2, col3
2,    2,    1,    10,   'bla2'
2,    1,    2,    10,   'bla'

So no two rows have the same value for both col1 and col2. And the col3 value returned is the one with the longest length.

I think this must be simple - but I cant workout the solution.

Many thanks, Paul

Upvotes: 0

Answers (3)

M.Ali

Reputation: 69554

Test Data

DECLARE  @TABLE_NAME TABLE(key1 INT, key2 INT, col1 INT, col2 INT, col3 VARCHAR(10))

INSERT INTO @TABLE_NAME 
SELECT 1,    2,    1,    10,   'bla'
UNION ALL 
SELECT 2,    2,    1,    10,   'bla2'
UNION ALL
SELECT 2,    1,    2,    10,   'bla'

Query

SELECT * FROM
(
SELECT *, ROW_NUMBER() OVER (PARTITION BY col1,col2 ORDER BY LEN(col3) DESC) AS rn
FROM @TABLE_NAME 
)Q
WHERE rn = 1

Result Set

╔══════╦══════╦══════╦══════╦══════╦════╗
║ key1 ║ key2 ║ col1 ║ col2 ║ col3 ║ rn ║
╠══════╬══════╬══════╬══════╬══════╬════╣
║    2 ║    2 ║    1 ║   10 ║ bla2 ║  1 ║
║    2 ║    1 ║    2 ║   10 ║ bla  ║  1 ║
╚══════╩══════╩══════╩══════╩══════╩════╝

`Working SQL FIDDLE`

Upvotes: 1

Dave Mason

Reputation: 4936

This works based on the small set of sample data. However, the result set will have some duplicates if there are multiple rows with the same Co1/Col2 values and Col3 values that are all the same length. For instance:

2, 2, 1, 10, 'bla2'

2, 2, 1, 10, 'bla4'

2, 2, 1, 10, 'bla6'

--A common table expression that represents the sample data from the OP.
WITH cteSampleData AS
(
    SELECT 1 Key1,    2 Key2,    1 Col1,    10 Col2,   'bla' Col3 UNION 
    SELECT 2,    2,    1,    10,   'bla2' UNION 
    SELECT 2,    1,    2,    10,   'bla'
),
--Another common table expression.  It returns unique combinations of col1 and col2 with the col3 value with the longest length.
cte2 AS
(
    SELECT Col1, Col2, MAX(LEN(Col3)) LongestCol3Val
    FROM cteSampleData
    GROUP BY Col1, Col2
)
SELECT c.*
FROM cteSampleData c
JOIN cte2
    ON cte2.Col1 = c.Col1
    AND cte2.Col2 = c.Col2
    AND cte2.LongestCol3Val = LEN(c.Col3)

Upvotes: 0

Mureinik

Reputation: 311853

The analytic row_number() function fits the bill perfectly:

SELECT key1, key2, col1, col2, col3
FROM   (SELECT key1, key2, col1, col2, col3,
               ROW_NUMBER() OVER 
               (PARTITION BY key1, key2 ORDER BY LEN(col3) DESC) AS rn
        FROM   my_table) t
WHERE  rn = 1

Upvotes: 0

SQL Server Select query to exclude similar rows based on the length of one of the columns

Answers (3)

Working SQL FIDDLE

Related Questions

`Working SQL FIDDLE`