Reputation: 981
I have a data set, df
, that looks like this:
ZIP TEENS ADULTS SENIORS TOTAL
054216 . 2000 . 2000
02216 45 105 10 160
01720 0 256 0 256
02113 . 4122 918 5040
02144 782 20 0 1002
I would like to exclude any zip codes that are all adults. In addition, I would like to only keep rows where the adult population is greater than 50%. My code, listed below, keeps rows 1 and 3 even though there are no teens or seniors in those communities. Any tips on what's wrong with it would be appreciated.
data adult_zips;
set df;
where ((adults/total) > 0.50) and
((teens is not missing) or (teens ne 0)) and
((seniors is not missing) or (seniors ne 0));
run;
Upvotes: 0
Views: 61
Reputation: 79
data adult_zips;
set df;
where adults / total > 0.50
and (teens > 0 or seniors > 0);
run;
This is the easiest solution.
Upvotes: 1
Reputation: 94914
You confused AND and OR:
data adult_zips;
set df;
where adults / total > 0.50
and
(
(teens is not missing and teens ne 0)
or
(seniors is not missing and seniors ne 0)
);
run;
Or simpler:
data adult_zips;
set df;
where adults / total > 0.50
and (teens > 0 or seniors > 0);
run;
Or even:
data adult_zips;
set df;
where adults / total > 0.50
and adults ne total;
run;
Upvotes: 1