Alice Wang
Alice Wang

Reputation: 29

Excluding all records of the same ID if any record with that ID meets two conditions

I have a table that looks like this:

ID    Pet
01    Dog
01    Cat
01    Parrot
01    Guinea Pig
02    Dog
02    Guinea Pig
03    Parrot

If an ID has a dog and a cat, then I want to exclude ALL records for that ID, regardless of what other animals they have. So my output table would look like:

ID    Pet
02    Dog
02    Guinea Pig
03    Parrot

This is the query that I've tried:

PROC SQL;
CREAT TABLE new_table AS
SELECT * from Pets a
WHERE NOT EXISTS (SELECT *
                  FROM Pets b
                  WHERE b.ID = a.ID
                  AND b.ID = "Dog"
                  AND b.ID = "Guinea Cat"));
RUN;

This doesn't seem to be working and isn't actually filtering anything out.

Upvotes: 0

Views: 328

Answers (2)

butitruns
butitruns

Reputation: 28

This is what you want:

SELECT
  * 
FROM
  pets 
WHERE
  id NOT IN
  (
    SELECT
      a.id 
    FROM
      pets a 
      JOIN
        pets b 
        ON a.id = b.id 
        AND a.pet = "Dog" 
        AND b.pet = "Cat"
  )

The inner SQL query is finding ids where there is both a row with pet = "Dog" and a row with pet = "Cat". The outer SQL is filtering out those ids.

Upvotes: 1

Tom
Tom

Reputation: 51566

That is easy enough in PROC SQL because SAS will automatically remerge summary statistics back onto the detail rows.

data have;
  input ID $ Pet $20. ;
cards;
01    Dog
01    Cat
01    Parrot
01    Guinea Pig
02    Dog
02    Guinea Pig
03    Parrot
;

proc sql;
create table want as 
  select * 
  from have
  group by id
  having not (max(pet='Dog') and max(pet='Cat')) 
  order by id,pet
;
quit;

Upvotes: 1

Related Questions