yek
yek

Reputation: 19

SAS counting observations

I am looking at a data set that looks like

GVKEY DateYear

1 001000 1971

2 001000 1972

3 001000 1973

4 001000 1974

5 001000 1975

6 001000 1976

7 001001 1971

.

.

.

88 001010 1971

89 001010 1972

90 001010 1973

.

.

.

105 001010 1988

106 001010 1989

107 001011 1973

.

.

.

So I would like to only keep the GVKEY's that have no gaps in the DateYear from 1971 to 1989. I guess I'll have to drop duplicates but then how would I count observations for every GVKEY and drop GVKEY's that don't have 19 observations? Is this possible in SAS? Thank you.

Upvotes: 0

Views: 402

Answers (2)

Joe
Joe

Reputation: 63434

The straightforward way to do this in the data step is a double DoW loop. That cycles through the data twice; for any given ID, first it goes through all the records for that ID checking to see if it meets the criteria (in this case, for deletion, though you can do the opposite), and then cycles through a second time and applies the appropriate logic to output or delete.

data want;
  do _n_ = 1 by 1 until (last.id);
    set have;
    by id;
    if [logic] then delete_flag=1;
  end;
  do _n_ = 1 by 1 until (last.id);
    set have;
    by id;
    if delete_flag ne 1 then output;
  end;
run;

You just need to determine what your logic is for setting that delete flag. In your case, it's probably some combination of using if first.id and year ne 1971, if last.id and year ne 1989, and if dif(year) gt 1, but your question isn't sufficiently clear to make it obvious exactly what you'd do here - but hopefully you can figure that out, or update with more details.

Upvotes: 1

Richard
Richard

Reputation: 27516

In Proc SQL you can count distinct values within a group to identify the keys having 19 distinct years.

Example:

proc sql;
  create table want as
  select distinct key, year
  from have
  where year between 1971 and 1989
  group by key
  having count(distinct year) = 19
  order by key, year
  ;

Upvotes: 1

Related Questions