Reputation: 379
I am an R noob :) and this is my first post.
I have a dataset of 4k entries (data) describing mortality rates (data$mortality) by US state (data$State).
I want to loop through the mortality rates by state name, for instance through all mortality rates in "AK". Something like this:
tbl <- table(data$State)  ## frequency of entries for each state
How can I loop through all the occurrences of each state?
I don't want to specify the state name. I want to sort all states, then loop through them by name: "AK", "AL", etc.
For instance, my table would be:
State mortality
AL 14.3
AL 18.5
AL 18.1
AL NA
AL NA
AK NA
AK 17.7
AK 18
AK 15.9
AK NA
AK 19.6
AK 17.3
AZ 15
AZ 17.1
AZ 17.1
AZ NA
AZ 16.4
AZ 15.2
AZ 16.7
I could then loop through all rates in "AL", rank them, and choose the hospital name associated with each ranked mortality rate in "AL". I can write a piece of code for one state at a time, but imagine doing that for all states!
Upvotes: 2
Views: 714
Reputation: 66819
Here's a data.table solution, as suggested in a comment:
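library(data.table)
# Add a hospital ID column, then rank mortality within each state (NA rates keep an NA rank)
DT <- data.table(hospID = seq_len(nrow(data)), data)
DT[, r := rank(mortality, na.last = 'keep'), by = State]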
Then run DT to see the result:
hospID State mortality r
1: 1 AL 14.3 1.0
2: 2 AL 18.5 3.0
3: 3 AL 18.1 2.0
4: 4 AL NA NA
5: 5 AL NA NA
6: 6 AK NA NA
7: 7 AK 17.7 3.0
8: 8 AK 18.0 4.0
9: 9 AK 15.9 1.0
10: 10 AK NA NA
11: 11 AK 19.6 5.0
12: 12 AK 17.3 2.0
13: 13 AZ 15.0 1.0
14: 14 AZ 17.1 5.5
15: 15 AZ 17.1 5.5
16: 16 AZ NA NA
17: 17 AZ 16.4 3.0
18: 18 AZ 15.2 2.0
19: 19 AZ 16.7 4.0
Look at ?rank to see different ways of handling ties and NA values.
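As a rough illustration of those options, here is a small sketch using only the "AZ" rates from the question, which contain one tie and one NA. The variable names are made up for the example; ties.method and na.last are the arguments documented in ?rank.

```r
# Mortality rates for "AZ" from the question: one tie (17.1) and one NA
m <- c(15, 17.1, 17.1, NA, 16.4, 15.2, 16.7)

# Ties share the mean of the ranks they span (the default); NA stays NA
r_avg <- rank(m, ties.method = "average", na.last = "keep")

# Ties all get the lowest rank they span
r_min <- rank(m, ties.method = "min", na.last = "keep")

# Ties are broken by position in the vector
r_first <- rank(m, ties.method = "first", na.last = "keep")

# NA is given the last rank instead of being kept as NA
r_na_last <- rank(m, na.last = TRUE)

print(rbind(r_avg, r_min, r_first, r_na_last))
```

Which variant is appropriate depends on whether tied hospitals should share a position or be listed in a fixed order.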
If you want to sort on the rank, you can do that with DT[order(State,r)]. The data.table package also allows for a key: a vector of columns by which the data.table is sorted automatically. There are other benefits to setting a key as well that you can read about in a data.table tutorial or the FAQ.
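A minimal sketch of setting a key, assuming a small hand-typed subset of the question's data (the five rows here are picked only for illustration). setkey() sorts the table by reference, and a keyed table can then be subset by key value.

```r
library(data.table)

# Hypothetical five-row subset of the question's data
DT <- data.table(
  State     = c("AL", "AK", "AZ", "AK", "AL"),
  mortality = c(14.3, 17.7, 15.0, 15.9, 18.5)
)

# Sort by State, then mortality, and mark those columns as the key
setkey(DT, State, mortality)
print(key(DT))
print(DT)

# Keyed subset: all rows whose State is "AK"
ak <- DT[.("AK")]
print(ak)
```

After setkey() the rows stay in key order, so no separate order() call is needed before printing or subsetting.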
Upvotes: 2
Reputation: 7905
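To sort by column 'a':
# Example data frame: 10 random letters and 10 random numbers
x <- data.frame(a = sample(LETTERS, 10), b = runif(10))
# Reorder the rows by the values in column 'a'
x <- x[order(x[, 'a']), ]
print(x)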
a b
4 B 0.8030872
9 C 0.3754850
7 D 0.8670409
5 G 0.1278583
3 J 0.9161972
6 N 0.7159080
8 R 0.5340525
2 S 0.2903496
10 T 0.5466612
1 V 0.9187505
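The same order() idea can be combined with ave() to rank within each state in base R. This is only a sketch under assumptions: the data frame below is a hand-typed handful of rows from the question's table, and the column name r is made up for the example.

```r
# A few rows taken from the question's table
data <- data.frame(
  State     = c("AL", "AL", "AL", "AK", "AK", "AK", "AZ", "AZ"),
  mortality = c(14.3, 18.5, NA, 17.7, 18, 15.9, 15, 17.1)
)

# Rank mortality within each state; NA rates keep an NA rank
data$r <- ave(data$mortality, data$State,
              FUN = function(v) rank(v, na.last = "keep"))

# Sort by state name, then by rank within the state (NA ranks go last)
sorted <- data[order(data$State, data$r), ]
print(sorted)
```

This avoids writing a separate block of code per state: ave() applies the ranking function to each state's rows in turn.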
Upvotes: 0