Reputation: 25
I have a large data set which has all states included, currently looking at NE. I am looking at each state individually and I want to count the number of rows that are included in each state. I am currently using:
record_num = _N_;
BUT This gives me the row count of the first row which NE is included. Example, row 49752 is the first record_num:
49752
49753
49754
49756
49757
49758
I want:
1
2
3
4
5
6
I want this to be equal to 1 instead. I could do:
record_num=_N_;
num = record_num-49751;
But I don't want to alter this code every time I change the state which would change the record_num.
Upvotes: 1
Views: 10437
Reputation: 63424
SAS by-group processing is what you need to be using.
For example:
data want;
set have;
by state;
if first.state then counter=0;
counter+1;
run;
What that does is when STATE changes, the last record before the change has a variable LAST.STATE
set to 1 (otherwise has a 0), and the next record (The first for the new value of state) has the variable FIRST.STATE
set to 1 (otherwise 0).
So we set counter to 0 each time a new state is encountered, and increment it (var+1;
increments the variable and retains it across rows).
If you want to only have the actual count, as vasja notes in comments, then add if last.state;
before the run (which will only give you one row per state, the last one).
Upvotes: 1
Reputation: 8513
Some more alternatives to @Joe's answer. You can use proc means
to calculate the number of rows for each state. I've used the sashelp.zipcode dataset as an example. The nway
option tells it to summarize by all of the variables in the class
statement. The n=
option on the output statement tells it just to calculate the number of observations for the class variables (you could also choose other aggregate functions such as mean, sum, etc.).
proc means data=sashelp.zipcode nway missing noprint;
class statecode;
output out=want(keep=statecode _freq_) n=;
quit;
There's also the SQL approach which is my personal favorite as it's a widely understood syntax that can be used in multiple programming languages:
proc sql noprint;
create table want as
select statecode,
count(*) as num_records
from sashelp.zipcode
group by 1
order by 1
;
quit;
Upvotes: 2