Reputation: 43
I am looking to figure out how many customers get their product from a certain store. The problem each prod_id can have up to 12 weeks of data for each customer. I have tried a multitude of codes, some add up all of the obersvations for each customer while others like the one below remove all but the last observation.
proc sort data= have; BY Prod_ID cust; run;
Data want;
Set have;
by Prod_Id cust;
if (last.Prod_Id and last.cust);
count= +1;
run;
data have
prod_id cust week store
1 A 7/29 ABC
1 A 8/5 ABC
1 A 8/12 ABC
1 A 8/19 ABC
1 B 7/29 ABC
1 B 8/5 ABC
1 B 8/12 ABC
1 B 8/19 ABC
1 B 8/26 ABC
1 C 7/29 XYZ
1 C 8/5 XYZ
1 F 7/29 XYZ
1 F 8/5 XYZ
2 A 7/29 ABC
2 A 8/5 ABC
2 A 8/12 ABC
2 A 8/19 ABC
2 C 7/29 EFG
2 C 8/5 EFG
2 C 8/12 EFG
2 C 8/19 EFG
2 C 8/26 EFG
what i want it to look like
prod_id store count
1 ABC 2
1 XYZ 2
2 ABC 1
2 EFG 2
Upvotes: 0
Views: 2601
Reputation: 1770
Firstly, read about if-statement. I've just edited your code to make it work:
proc sort data=have;
by prod_id store cust;
run;
data want(drop=cust week);
set have;
retain count;
by prod_id store cust;
if (last.cust) then count=count+1;
else if (first.prod_id or first.store) then count = 0;
if (last.prod_id or last.store) then output;
run;
If you will have questions, ask.
Upvotes: 2
Reputation: 51566
The only place where the result of the COUNT() aggregate function in SQL might be confusing is that it will not count missing values of the variable.
select prod_id
, store
, count(distinct cust) as count
, count(distinct cust)+max(missing(cust)) as count_plus_missing
from have
group by prod_id ,store
;
Upvotes: 2