Reputation: 165
I want to filter for all People who have the same AttributValue for certain Attributs as another Person I have the following Query:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
WHERE p1.keyValue != p2.keyValue
AND p1.ImportantAttrValue = p2.ImportantAttrValue
with this query I will get all entries twice, because every Person will be in p1 and p2. So the result will look like this:
I123 Freddy Krüger A123 The Horsemen Moviecategorie Horror
A123 The Horsemen I123 Freddy Krüger Moviecategorie Horror
But for analysis purposes it would be be nice if I could get a combination of p1.keyvalue and p2.keyvalue only once, without respect to in which of both colums the values are.
So far I did this by exporting to excel and do the cleanup there, but is there a way to fix the query to not get this "duplicates"?
Upvotes: 0
Views: 104
Reputation: 294
this may be different way of approach but can be get the expected.
Using Partition Count(*) :
select count(*) over(partition by Attr) as RepeatCount, * from (
select keyValue,DisplayName,ImportantAttr + ' ' +ImportantAttrValue as Attr
from tblTest) tblTemp
as per the above Query you will get the result like below
> RepeatCount keyValue DisplayName Attr
>
> 1 P321 The Ironman Generalcategorie Test
> 2 I123 Freddy Krüger Moviecategorie Horror
> 2 A123 The Horsemen Moviecategorie Horror
from this result you can filter records by Repeatcount > 1
Upvotes: 0
Reputation: 72225
You can turn:
on p1.ImportantAttr = p2.ImportantAttr
to:
on p1.ImportantAttr = p2.ImportantAttr and p1.keyValue < p2.keyValue
The whole query could look like this:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
AND p1.keyValue < p2.keyValue
WHERE p1.ImportantAttrValue = p2.ImportantAttrValue
Upvotes: 2
Reputation: 522762
Use where p1.keyValue < p2.keyValue
:
SELECT
p1.keyValue,
p1.Displayname,
p2.keyValue,
p2.Displayname,
p1.ImportantAttrName,
p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
INNER JOIN Person p2 WITH (NOLOCK)
ON p1.ImportantAttr = p2.ImportantAttr
WHERE
p1.keyValue < p2.keyValue AND -- change is here
p1.ImportantAttrValue = p2.ImportantAttrValue;
This will ensure that you do not see duplicate pairs. To understand numerically why this works, consider two key values, 1
and 2
. Using the condition !=
, both 1-2
and 2-1
meet that criteria. But using <
results in only 1-2
.
Upvotes: 3