Aeglasin
Aeglasin

Reputation: 165

Filter for combination of column values in SQL

I want to filter for all People who have the same AttributValue for certain Attributs as another Person I have the following Query:

SELECT
  p1.keyValue,
  p1.Displayname,
  p2.keyValue,
  p2.Displayname,
  p1.ImportantAttrName,
  p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
  ON p1.ImportantAttr = p2.ImportantAttr
WHERE p1.keyValue != p2.keyValue
AND p1.ImportantAttrValue = p2.ImportantAttrValue

with this query I will get all entries twice, because every Person will be in p1 and p2. So the result will look like this:

I123    Freddy Krüger   A123    The Horsemen   Moviecategorie    Horror
A123    The Horsemen    I123    Freddy Krüger   Moviecategorie    Horror

But for analysis purposes it would be be nice if I could get a combination of p1.keyvalue and p2.keyvalue only once, without respect to in which of both colums the values are.

So far I did this by exporting to excel and do the cleanup there, but is there a way to fix the query to not get this "duplicates"?

Upvotes: 0

Views: 104

Answers (3)

Siva Rm K
Siva Rm K

Reputation: 294

this may be different way of approach but can be get the expected.

Using Partition Count(*) :

select count(*) over(partition by Attr) as RepeatCount, * from (
select keyValue,DisplayName,ImportantAttr + ' ' +ImportantAttrValue as Attr
  from tblTest) tblTemp

as per the above Query you will get the result like below

> RepeatCount    keyValue     DisplayName          Attr
> 
> 1       P321        The Ironman          Generalcategorie Test 
> 2       I123        Freddy Krüger        Moviecategorie Horror 
> 2       A123        The Horsemen         Moviecategorie Horror

from this result you can filter records by Repeatcount > 1

Upvotes: 0

Giorgos Betsos
Giorgos Betsos

Reputation: 72225

You can turn:

on p1.ImportantAttr = p2.ImportantAttr

to:

on p1.ImportantAttr = p2.ImportantAttr and p1.keyValue < p2.keyValue

The whole query could look like this:

SELECT
  p1.keyValue,
  p1.Displayname,
  p2.keyValue,
  p2.Displayname,
  p1.ImportantAttrName,
  p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
JOIN Person p2 WITH (NOLOCK)
  ON p1.ImportantAttr = p2.ImportantAttr
  AND p1.keyValue < p2.keyValue
WHERE p1.ImportantAttrValue = p2.ImportantAttrValue

Upvotes: 2

Tim Biegeleisen
Tim Biegeleisen

Reputation: 522762

Use where p1.keyValue < p2.keyValue:

SELECT
    p1.keyValue,
    p1.Displayname,
    p2.keyValue,
    p2.Displayname, 
    p1.ImportantAttrName,
    p1.ImportantAttrValue
FROM Person p1 WITH (NOLOCK)
INNER JOIN Person p2 WITH (NOLOCK)
    ON p1.ImportantAttr = p2.ImportantAttr
WHERE
    p1.keyValue < p2.keyValue AND       -- change is here
    p1.ImportantAttrValue = p2.ImportantAttrValue;

This will ensure that you do not see duplicate pairs. To understand numerically why this works, consider two key values, 1 and 2. Using the condition !=, both 1-2 and 2-1 meet that criteria. But using < results in only 1-2.

Upvotes: 3

Related Questions