Michael A

Reputation: 4615

How do I drop observations based on their observation (row) number within a by group?

In Stata, I can do this:

bysort group_var: drop if _n > 6

to keep only the first six observations within each group as specified by group_var. How do I do this in SAS?

I tried:

proc sort data=indata out=sorted_data;
    by group_var;
run;

data outdata;
    set sorted_data;
    by group_var;
    if (_n_ > 6) then delete;
run;

but this deletes all but the first six observations in the entire dataset (leaving me with only six observations total).

Upvotes: 3

Views: 4295

Answers (1)

DomPazz

Reputation: 12465

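You need to count records within each by group yourself. The automatic variable _n_ counts iterations of the data step over the whole dataset and does not reset when group_var changes, which is why your version keeps only six observations in total.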

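data outdata;
   set sorted_data;
   by group_var;
   retain count;   /* keep count's value across observations */

   /* restart the counter at the first observation of each group */
   if first.group_var then
      count = 0;

   count = count + 1;
   if count > 6 then delete;   /* drop everything after the sixth row */

   drop count;   /* the counter is not needed in the output */
run;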
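
For what it's worth, the same per-group counter can be written a bit more compactly. The sketch below is only a variant of the step above, not a different method: it uses a SAS sum statement (count + 1;), which retains count automatically, and a subsetting IF in place of delete. It assumes the input dataset is called indata, as in the question, and includes the sort so it can be run on its own.

```sas
/* BY-group processing needs the data sorted by group_var */
proc sort data=indata out=sorted_data;
    by group_var;
run;

data outdata;
    set sorted_data;
    by group_var;

    /* restart the counter at the first observation of each group */
    if first.group_var then count = 0;

    /* sum statement: count is retained implicitly and incremented by 1 */
    count + 1;

    /* subsetting IF: only observations 1 through 6 of each group are output */
    if count <= 6;

    drop count;
run;
```

Groups with fewer than six observations are kept in full, which matches what the Stata command in the question does.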

Upvotes: 6
