Reputation: 95
I cannot count the number of times the variable Bucket (character variable) has a certain attribute.
data Bucket;
set Agreement8;
select;
when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
otherwise Bucket=">100";
end;
run;
Then I run:
proc sql;
select count(*) as No_Obs_1
from Summary_Bucket
where Bucket="0-25";
;
quit;
proc sql;
select count(*) as No_Obs_2
from Summary_Bucket
where Bucket="26-50";
quit;
proc sql;
select count(*) as No_Obs_3
from Summary_Bucket
where Bucket="51-75";
;
quit;
proc sql;
select count(*) as No_Obs_4
from Summary_Bucket
where Bucket="76-100";
;
quit;
proc sql;
select count(*) as No_Obs_5
from Summary_Bucket
where Bucket=">100";
;
quit;
This is my result:
But I obviously have more than 1 or zero of these attributes:
Upvotes: 0
Views: 72
Reputation: 1000
A couple of things worth checking:
1) Try specifying the length of bucket before you assign it a value, as the table you show only appears to display 4 characters. E.g try the following:
data Bucket;
set Agreement8;
length Bucket $ 5; /* <- try adding this line */
select;
when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
otherwise Bucket=">100";
end;
run;
2) Further, you create a table called Bucket
, but your SQL refers to table Summary_Bucket
, which does not appear consistent.
Upvotes: 2
Reputation: 3315
The problem is your values are getting truncated in bucket, I would just use proc format, so that everything is simple and can be easily as shown shown
data have;
input n;
datalines;
0
10
25
35
45
55
75
85
95
106
;
proc format;
value newval
0 - 25 = '0-25'
26 - 50 = '26-50'
51 - 75 = '51-75'
76 - high = ">100"
;
proc sql;
select put(n,newval.) as ranges, count(*)
from have
group by 1;
Upvotes: 1
Reputation: 1269803
You can just do the entire calculation using proc sql
:
proc sql;
select (case when ltv_max_on <= 0.25 then '0-25'
when ltv_max_on <= 0.50 then '26-50'
when ltv_max_on <= 0.75 then '51-75'
when ltv_max_on <= 1.00 then '51-75'
else '>100'
end) as bucket,
count(*)
from Agreement8
group by (case when ltv_max_on <= 0.25 then '0-25'
when ltv_max_on <= 0.50 then '26-50'
when ltv_max_on <= 0.75 then '51-75'
when ltv_max_on <= 1.00 then '51-75'
else '>100'
end);
run;
There is no need for multiple steps. That is one of the advantages of SQL.
Upvotes: 2