SAS DATA: How to remove observations that only occur once

Question

In SAS, suppose I have a dataset named "person_groups". It has two variables, named "person" and "group". This dataset simply assigns each person to a group.

How can I remove from this dataset all persons who have no one else in their group? In other words, how can I remove all singleton groups?

[I'd be happy with a proc sql solution or a data step solution--either is fine.]

Side note: I'm new to SAS. I have been using C++ and MATLAB for many years. I feel like I can't understand how to do anything with the SAS DATA step. It seems extremely clunky, bizarre, and inelegant. Frankly, I'm growing very frustrated. Anyone out there have hope for the weary? :)

Jay Corbett · Accepted Answer

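Here's a way that uses a data step. It requires sorting by group first, because BY-group processing (which creates the first.group and last.group flags) expects the data to be in group order. An observation that is both the first and the last in its group is the only one in that group, so it gets deleted.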

data person_groups;
 input person $ group $;
 datalines;
John Grp1
Mary Grp3
Joe Grp2
Jane Grp3
Frank Grp1
;

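/* BY-group processing needs the data sorted by group */
proc sort data=person_groups;
 by group;
run;

data person_groups;
 set person_groups;
 by group;
 /* first.group and last.group are both 1 only when the group has a single observation */
 if first.group and last.group then delete;
run;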
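
Since the question also asks about proc sql, here is a minimal sketch of that route. It is not part of the method above, just one possible alternative. It assumes the same person_groups dataset; the output table name multi_groups is made up for illustration. Because person is selected but not listed in the group by clause, SAS should remerge the group counts back onto each row (and write a note about remerging to the log), so the having clause keeps only rows whose group has more than one member. No sort is needed beforehand.

```sas
proc sql;
 /* multi_groups is a hypothetical output table name */
 create table multi_groups as
 select person, group
 from person_groups
 group by group
 /* keep only rows whose group has more than one member */
 having count(*) > 1;
quit;
```

With the sample data above, Joe is the only person in Grp2, so his row is the one this is meant to drop.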
