user3934760
user3934760

Reputation: 95

Problems counting a variable attributes

I cannot count the number of times the variable Bucket (character variable) has a certain attribute.

data Bucket;
set Agreement8;
select;
    when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
    when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
    when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
    when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
    otherwise Bucket=">100";
end;
run;

Then I run:

proc sql;
select count(*) as No_Obs_1
from Summary_Bucket
where Bucket="0-25";
;
quit;

proc sql;
select count(*) as No_Obs_2
from Summary_Bucket
where Bucket="26-50";
quit;

proc sql;
select count(*) as No_Obs_3
from Summary_Bucket
where Bucket="51-75";
;
quit;

proc sql;
select count(*) as No_Obs_4
from Summary_Bucket
where Bucket="76-100";
;
quit;

proc sql;
select count(*) as No_Obs_5
from Summary_Bucket
where Bucket=">100";
;
quit;

This is my result:

enter image description here

But I obviously have more than 1 or zero of these attributes:

enter image description here

Upvotes: 0

Views: 72

Answers (3)

Amir
Amir

Reputation: 1000

A couple of things worth checking:

1) Try specifying the length of bucket before you assign it a value, as the table you show only appears to display 4 characters. E.g try the following:

data Bucket;
set Agreement8;
length Bucket $ 5; /* <- try adding this line */
select;
    when (0.0 <= ltv_max_on <= 0.25) Bucket="0-25";
    when (0.26 <= ltv_max_on <= 0.50) Bucket="26-50";
    when (0.51 <= ltv_max_on <= 0.75) Bucket="51-75";
    when (0.76 <= ltv_max_on <= 0.100) Bucket="51-75";
    otherwise Bucket=">100";
end;
run;

2) Further, you create a table called Bucket, but your SQL refers to table Summary_Bucket, which does not appear consistent.

Upvotes: 2

Kiran
Kiran

Reputation: 3315

The problem is your values are getting truncated in bucket, I would just use proc format, so that everything is simple and can be easily as shown shown

data have;
input n;
datalines;
0
10
25
35
45
55
75
85
95
106
;


proc format;
value newval
0 -  25 = '0-25'
26 - 50 = '26-50'
51 -  75 = '51-75'
76 - high = ">100"
;

proc sql;
select put(n,newval.) as ranges, count(*)
from have
 group by  1;

Upvotes: 1

Gordon Linoff
Gordon Linoff

Reputation: 1269803

You can just do the entire calculation using proc sql:

proc sql;
    select (case when ltv_max_on <= 0.25 then '0-25'
                 when ltv_max_on <= 0.50 then '26-50'
                 when ltv_max_on <= 0.75 then '51-75'
                 when ltv_max_on <= 1.00 then '51-75'
                 else '>100'
            end) as bucket,
           count(*)
    from Agreement8
    group by (case when ltv_max_on <= 0.25 then '0-25'
                   when ltv_max_on <= 0.50 then '26-50'
                   when ltv_max_on <= 0.75 then '51-75'
                   when ltv_max_on <= 1.00 then '51-75'
                   else '>100'
              end);

run;

There is no need for multiple steps. That is one of the advantages of SQL.

Upvotes: 2

Related Questions