Reputation: 20169
I have data in the following format:
DATE DATA1 DATA2
-------------------------------------------------
20121010 ABC DEF
20121010 DEF ABC
20121010 HIJ KLM
20121010 KLM HIJ
20121212 ABC DEF
20121212 DEF ABC
20121212 HIJ KLM
20121212 KLM HIJ
What I want to do is select rows 1 and 3. I don't care about rows 2 and 4 because they are essentially "duplicates" in my eyes.
Seems simple but I'm just trying to put the query together to accomplish this.
Upvotes: 2
Views: 158
Reputation: 115530
SELECT Date, Data1, Data2
FROM tableX
WHERE Data1 <= Data2
UNION
SELECT Date, Data2, Data1
FROM tableX
WHERE Data2 < Data1 ;
Upvotes: 1
Reputation: 16904
;WITH cte AS
(
SELECT DATE, DATA1, DATA2, ROW_NUMBER() OVER(ORDER BY DATE, DATA1) AS Id
FROM dbo.test70 t
)
SELECT c.DATE, c.DATA1, c.DATA2
FROM cte c LEFT JOIN cte ct ON c.Id = ct.Id + 1 AND c.DATA1 = ct.DATA2
WHERE ct.DATE IS NULL
Demo on SQLFiddle
Upvotes: 0
Reputation: 27427
You can try this:
;WITH CTE AS
(
SELECT date, data1, data2,
CASE WHEN DATA2 > DATA1 THEN data1 + data2
ELSE data2 + data1 END d
FROM T
),CTE2 AS
(
SELECT *,
ROW_NUMBER() OVER (PARTITION BY DATE, d
ORDER BY data1) rn
FROM CTE
)
SELECT date, data1, data2 FROM CTE2 WHERE rn = 1
Upvotes: 0
Reputation: 35557
I like it like this: but I believe it won't work if you're using pre SQL-Server 2008R2
;WITH myCTE
AS
(
SELECT
[Rw] = ROW_NUMBER() OVER (ORDER BY [Date]),
[Date],
[DATA1],
[DATA2]
FROM #data
)
SELECT *
FROM myCTE x
WHERE NOT EXISTS
(
SELECT 1
FROM myCTE y
WHERE
x.DATA1 = y.DATA2 AND
x.Rw > y.Rw AND
x.Date = y.Date
)
Here's a live illustration on SQL Fiddle
Upvotes: 0
Reputation: 1269773
You can use the row_number()
function for this, assuming you are using version 2005 or higher:
select date, data1, data2
from (select t.*,
row_number() over (partition by date order by date) as seqnum
from t
) t
where seqnum = 1
The expression order by date
should produce an arbitrary ordering in any database that supports row_number
. In SQL Server, you can also use order by (select NULL)
.
Or, I realize that your question may be about eliminate duplicates, regardless of order. For that, you can do:
select distinct date, minData, maxData
from (select t.date,
(case when data1 > data2 then data1 else data2 end) as minData,
(case when data1 > data2 then data2 else data1 end) as maxData
from t
) t
This might, however, rearrange the two values, when only one row appears.
The more complicated solution to maintain the original ordering of the columns and eliminate the additional rows combines the two approaches:
select date, data1, data2
from (select t.*,
row_number() over (partition by date order by minData, maxData) as seqnum
from (select t.*
(case when data1 > data2 then data1 else data2 end) as minData,
(case when data1 > data2 then data2 else data1 end) as maxData
from t
) t
) t
where seqnum = 1
Upvotes: 1