Pure.Krome
Pure.Krome

Reputation: 86987

How to clean this Sql data up?

this is a follow on question to a previously asked question.

I have the following data in a single db table.

Name                LeftId    RightId
------------------------------------------
Cat                     1  
Cat                     1
Dog                     2
Dog                     2
Dog                               3
Dog                               3
Gerbil                  4         5 
Cat                
Bird
Cow                     6
Cow
Cow                               7
Dog                     8         9

Note that some rows have NO data for for LeftId and RightId.

Now, what i wish to do is find two different queries

  1. All rows which have at least 1 Id in one of the two columns AND row with NO data in both of those two Id columns.

eg.

Cat     1
Cow     6 (or 7 .. i'm not worried)
  1. All the rows where LeftId and RightId are NULL grouped by the same name. If another row (with the same name) has a value in the LeftId or RightId, this this name will not be returned.

eg.

Bird

hmm..

EDIT: Reworded the first question properly.

Upvotes: 1

Views: 273

Answers (5)

Shalom Craimer
Shalom Craimer

Reputation: 21459

For the first query, you want rows that answer to both of the following criteria:

  1. The Name in the row appears in the table in the same row in which LeftId and RightId are both NULL.
  2. The Name in the row appears in the table in same row where at at least one of LeftId and RightId is not NULL.

Well, #1 is done by:

SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)

And #2 is done by:

SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)

You could intersect them to see which Names appear in both lists:

SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)
INTERSECT
SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)

Which returns:

Name
----
Cat
Cow

But you want the LeftId and RightId, and you don't care which, so I guess we'll aggregate on the Name:

SELECT Name, MIN(LeftId) AS LeftId, MIN(RightId) AS RightId 
    FROM Tbl WHERE Tbl.Name IN (
      SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)
      INTERSECT
      SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)
    )
GROUP BY Name

Which returns

Name  LeftId  RightId
----  ------  -------
Cat   1
Cow   6       7

lc already suggested using COALESE to turn those two IDs to a single one. So how about this:

SELECT Name, COALESCE(MIN(LeftId),MIN(RightId)) AS Id 
    FROM Tbl WHERE Tbl.Name IN (
      SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)
      INTERSECT
      SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)
    )
GROUP BY Name

Which returns:

Name  Id
----  --
Cat   1
Cow   6

For the second query, you want rows that obey the following criteria:

  1. The Name appears only in rows that have no LeftId and RightId

I can't think of a way to do that sort of self-referencing query in SQL in a single set of criteria, so I'll break it down to two criteria. Both must be obeyed to be acceptable:

  1. The Name appears in rows that have no LeftId and RightId
  2. The Name does not appear in rows that have either LeftId or RightId

Doing #1 is simply:

SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)

But #2 is tricky. Of course doing the opposite of #2 ("all the Name that appear in rows that have either LeftId or RightId) is just like before:

SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)

Now comes the tricky bit - we want all the rows that obey #1 but don't obey the opposite of #2. This is where EXCEPT is useful:

SELECT Name FROM Tbl WHERE (LeftId IS NULL) AND (RightId IS NULL)
EXCEPT
SELECT Name FROM Tbl WHERE (LeftId IS NOT NULL) OR (RightId IS NOT NULL)

Which returns:

Name
----
Bird

Which is what we wanted!

Upvotes: 2

lc.
lc.

Reputation: 116528

I hope I understand you correctly here.

Query 1:

SELECT t1.Name, COALESCE(MIN(t1.LeftID), MIN(t1.RightID))
FROM Table t1
WHERE EXISTS(SELECT t2.Name
             FROM Table t2
             WHERE t2.Name = t1.Name 
             AND   t2.LeftID IS NULL AND t2.RightID IS NULL)
AND   COALESCE(MIN(t1.LeftID), MIN(t1.RightID)) IS NOT NULL
GROUP BY t1.Name

Query 2:

SELECT t1.Name
FROM Table t1
WHERE NOT EXISTS(SELECT t2.Name
                 FROM Table t2
                 WHERE t2.Name = t1.Name
                 AND   (t2.LeftID IS NOT NULL OR t2.RightID IS NOT NULL))

Upvotes: 0

BobbyShaftoe
BobbyShaftoe

Reputation: 28499

If I understand you correctly then this is fairly trivial:

1:

SELECT * 
FROM your_table 
WHERE (LeftId IS NOT NULL 
AND RightId IS NULL)
OR
(LeftId IS NULL 
AND RightId IS NOT NULL)

2:

SELECT * 
FROM your_table
WHERE 
    NOT EXISTS 
              (SELECT * FROM your_table y1
               WHERE (y1.LeftId IS NOT NULL OR y1.RightId IS NOT NULL)
               AND y1.name = your_table.name)

If this isn't right then perhaps you could clarify.

Edit: updated

Upvotes: 0

Peter McG
Peter McG

Reputation: 19045

Query 1:

SELECT [Name], [LeftID], [RightID]

FROM [TestTable]

WHERE -- "All rows which have at least 1 Id in one of the two columns"
      ([LeftID] IS NOT NULL OR [RightID] IS NOT NULL)
      OR
      -- "Rows with NO data in both of those two Id columns"
      ([LeftID] IS NULL AND [RightID] IS NULL)

Query 2:

SELECT [Name], [LeftID], [RightID]

FROM [TestTable]

WHERE -- "All the rows where LeftId and RightId are NULL
      -- grouped by the same name"
      ([LeftID] IS NULL AND [RightID] IS NULL)
      AND
      -- "If another row (with the same name) has a value
      -- in the LeftId or RightId, this name will not be returned"    
      ([Name] NOT IN (SELECT DISTINCT [Name] FROM [TestTable]
                      WHERE [LeftID] IS NOT NULL
                            OR
                            [RightID] IS NOT NULL))

GROUP BY [Name], [LeftID], [RightID]

Results:

Name                                               LeftID      RightID
-------------------------------------------------- ----------- -----------
Cat                                                1           NULL
Cat                                                1           NULL
Dog                                                2           NULL
Dog                                                2           NULL
Dog                                                NULL        3
Dog                                                NULL        3
Gerbil                                             4           5
Cat                                                NULL        NULL
Bird                                               NULL        NULL
Cow                                                6           NULL
Cow                                                NULL        NULL
Cow                                                NULL        7
Dog                                                8           9

(13 row(s) affected)

Name                                               LeftID      RightID
-------------------------------------------------- ----------- -----------
Bird                                               NULL        NULL

(1 row(s) affected)

Upvotes: 0

Mike Bennett
Mike Bennett

Reputation: 345

Query 1)

SELECT * 
FROM Table 
WHERE (LeftID IS NULL AND RightID IS NOT NULL) 
    OR (LeftID IS NOT NULL AND RightID IS NULL)

Query 2)

SELECT * 
FROM Table 
WHERE LeftID IS NULL AND RightID IS NULL

Upvotes: 0

Related Questions