Reputation: 1759
I have a dataset and I am reading it with a by group statement:
data TEMPDATA;
SET RAWDATA; by SYMBOL DATE;
run;
proc expand data=TEMPDATA out=GAPDATA to=day method=step;
by symbol date;
id time;
run;
However, I realized that the proc expand procedure would return an error if there is a record in the by group that has only one observation.
For example:
| Symbol | Date | Time | BB | BO | MIDPRICE |
|--------|----------|------|----|----|----------|
| AAPL | 20130102 | 2 | 2 | 3 | 2.5 |
If there is only one record of AAPL, SAS will refuse to execute the command.
Therefore, I was wondering if there is a way to drop all the records, with the same symbol, that has only one record in the by group (symbol, date)?
Upvotes: 1
Views: 583
Reputation: 51566
Since you are using a data step already just add logic to delete the singletons. Any record that is both the first and last in its group indicates there is only one record in that group.
data TEMPDATA;
SET RAWDATA;
by SYMBOL DATE;
if first.date and last.date then delete;
run;
Upvotes: 4
Reputation: 1757
One nice feature in SAS PROC SQL
is that you can group by and add summary measures while retaining all detail. This makes such removal easy (and can be useful in many other contexts as well). I.e.
PROC SQL;
CREATE TABLE tempdata2 AS
SELECT *
FROM tempdata
GROUP BY symbol, date
HAVING count(*) > 1
;
QUIT;
Upvotes: 1