Reputation: 313
I have a dataset called a, I want to delete the rows that column with the name "AVG_OPT_PR" = '.' And the following code works:
DATA work.a1;
SET work.a;
IF AVG_OPT_PR = '.' THEN delete;
RUN;
However, there is another column called "Period", and I just want to delete the rows that within one Location, all "AVG_OPT_PR" = '.'. How can I do that?
For example, data looks like:
Location_id | Period | AVG_OPT_PR
------------+-------------+-----------
L_1 | 2 weeks | 105
L_1 | 1 month | .
... | ... | ...
L_1 | 1 year | 103
L_2 | 2 weeks | .
L_2 | 1 month | .
... | ... | ...
L_2 | 1 year | .
Hence, rows that location id equals "L_2" are deleted, but all data for "L_1" are kept.
Thank you!
Upvotes: 0
Views: 148
Reputation: 347
If it's as simple as wanting to only delete rows with both an AVG_OPT_PR
of .
and a Location_id
of L-2
, then this should work just fine.
DATA work.a1;
SET work.a;
IF AVG_OPT_PR = '.' and Location_id = 'L_2' THEN delete;
RUN;
And some additional examples to expand on how you could handle this with the IF THEN DELETE
To delete all rows with AVG_OPT_PR
of .
and where its Location_id
does not equal L_1
(so everything but L_1
)
DATA work.a1;
SET work.a;
IF AVG_OPT_PR = '.' and Location_id ^= 'L_1' THEN delete;
RUN;
And to delete all rows with AVG_OPT_PR
of .
and where its Location_id
is one of those listed IN ('L_2','L_3','L_4',...)
DATA work.a1;
SET work.a;
IF AVG_OPT_PR = '.' and Location_id IN ('L_2','L_3','L_4') THEN delete;
RUN;
Upvotes: 0
Reputation: 1120
Select the maximum value of avg_opt_pr
for each location_id
(if the max is missing, it will come out as .), merge it back onto the first dataset, then drop all the location_ids
that have a max of missing.
proc sql;
create table a2 as select distinct
location_id, max(avg_opt_pr) as avg_opt_pr_max
from a1
group by location_id;
quit;
proc sql;
create table a3 as select
a.*, b.avg_opt_pr_max
from a1 as a
left join a2 as b
on a.location_id = b.location_id;
delete from a3 where missing(avg_opt_pr_max);
quit;
Upvotes: 1