Jinhua Wang
Jinhua Wang

Reputation: 1759

SAS drop records in by group with only one observation

I have a dataset and I am reading it with a by group statement:

  data TEMPDATA;
    SET RAWDATA; by SYMBOL DATE;
  run;
  proc expand data=TEMPDATA out=GAPDATA to=day method=step;
  by symbol date;
  id time;
  run;

However, I realized that the proc expand procedure would return an error if there is a record in the by group that has only one observation.

For example:

| Symbol | Date     | Time | BB | BO | MIDPRICE |
|--------|----------|------|----|----|----------|
| AAPL   | 20130102 | 2    | 2  | 3  | 2.5      |

If there is only one record of AAPL, SAS will refuse to execute the command.

Therefore, I was wondering if there is a way to drop all the records, with the same symbol, that has only one record in the by group (symbol, date)?

Upvotes: 1

Views: 583

Answers (2)

Tom
Tom

Reputation: 51566

Since you are using a data step already just add logic to delete the singletons. Any record that is both the first and last in its group indicates there is only one record in that group.

data TEMPDATA;
  SET RAWDATA;
  by SYMBOL DATE;
  if first.date and last.date then delete;
run;

Upvotes: 4

cole
cole

Reputation: 1757

One nice feature in SAS PROC SQL is that you can group by and add summary measures while retaining all detail. This makes such removal easy (and can be useful in many other contexts as well). I.e.

PROC SQL;

CREATE TABLE tempdata2 AS
SELECT *
FROM tempdata
GROUP BY symbol, date
HAVING count(*) > 1
;

QUIT;

Upvotes: 1

Related Questions