Reputation: 9499
In SAS, suppose I have a dataset named "person_groups". It has two variables, named "person" and "group". This dataset simply assigns each person to a group.
How can I remove from this dataset all persons who have no one else in their group? In other words, how can I remove all singleton groups?
[I'd be happy with a proc sql
solution or a data step
solution--either is fine.]
Side note: I'm new to SAS. I have been using C++ and MATLAB for many years. I feel like I can't understand how to do anything with the SAS DATA step. It seems extremely clunky, bizarre, and inelegant. Frankly, I'm growing very frustrated. Anyone out there have hope for the weary? :)
Upvotes: 2
Views: 3015
Reputation: 9618
Here is a PROC SQL
solution:
proc sql;
delete from person_groups
where group in (
select group
from person_groups
group by 1
having count(*) = 1
);
quit;
As you can see, PROC SQL
mostly follows standard ANSI syntax, so your existing SQL skills should be quite portable. And hang in there with respect to SAS Data Step programming; the language is extremely rich, full featured, and by no means "inelegant". You may have seen inelegant code, but that's more likely the fault of the programmer. Take a few minutes and read the SAS Concepts manual.
Upvotes: 2
Reputation: 28391
Here's a way that uses a data step. This method requires a sort.
data person_groups;
input person $ group $;
datalines;
John Grp1
Mary Grp3
Joe Grp2
Jane Grp3
Frank Grp1
;
Proc Sort data=person_groups;
by group;
run;
Data person_groups;
set person_groups;
by group;
if first.group and last.group then delete;
run;
Upvotes: 4