Reputation:
I have a dataset with three columns:
Date State Count
1994-01-05 Alabama 408
1994-01-06 Alabama 784
1994-02-08 Alabama 552
1994-01-05 Alaska 1067
1994-01-06 Alaska 36
1994-02-08 Alaska 8571
1994-01-05 Arizona 385
1994-01-06 Arizona 1845
1994-02-08 Arizona 49
where there are counts for the same set of dates for each of the fifty states. The dates and states are ordered as shown above.
I want to get the data into a format with four columns*:
Date State Count mean
1994-01-05 Alabama 408 581.333
1994-01-06 Alabama 784 581.333
1994-02-08 Alabama 552 581.333
1994-01-05 Arizona 385 759.666
1994-01-06 Arizona 1845 759.666
1994-02-08 Arizona 49 759.666
1994-01-05 Alaska 1067 3224.666
1994-01-06 Alaska 36 3224.666
1994-02-08 Alaska 8571 3224.666
where, first, the mean of the counts for each state is computed and placed in the fourth column, and then the states are reordered from smallest to largest mean.
I was able to complete the first step of computing the mean for each state, using the command:
plyed = ddply(dataset,.(State), transform, mean= mean(Count))
However, this command only computes the mean for each state; it does not reorder the states by their mean, and gives the following:
Date State Count mean
1994-01-05 Alabama 408 581.333
1994-01-06 Alabama 784 581.333
1994-02-08 Alabama 552 581.333
1994-01-05 Alaska 1067 3224.666
1994-01-06 Alaska 36 3224.666
1994-02-08 Alaska 8571 3224.666
1994-01-05 Arizona 385 759.666
1994-01-06 Arizona 1845 759.666
1994-02-08 Arizona 49 759.666
I am unsure how to reorder the states by their mean to get my desired output*. I tried the reorder command, but every variant I tried returned output in a different, unwanted format. Here is one example of a command I tried with no success:
reorder(plyed$State, plyed$mean, order=is.ordered(plyed$State))
Upvotes: 1
Views: 2605
Reputation: 440
Try using the order() function. A good example can be found in the answers to the question "How to sort a dataframe by column(s)?".
new_df <- plyed[with(plyed, order(mean)),]
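For a version you can run on its own, here is a minimal sketch. The data frame is just the sample from the question rebuilt by hand, and `ave()` is my stand-in for the `ddply` step so that no package is needed; adding `State` and `Date` as extra sort keys is also my assumption, to keep each state's rows together if two states happen to share the same mean.

```r
# Sample data from the question, rebuilt by hand for illustration
dataset <- data.frame(
  Date  = as.Date(rep(c("1994-01-05", "1994-01-06", "1994-02-08"), times = 3)),
  State = rep(c("Alabama", "Alaska", "Arizona"), each = 3),
  Count = c(408, 784, 552, 1067, 36, 8571, 385, 1845, 49),
  stringsAsFactors = FALSE
)

# Per-state mean repeated on every row (base R stand-in for the ddply step)
dataset$mean <- ave(dataset$Count, dataset$State, FUN = mean)

# Sort rows by mean; State and Date break ties so each state stays together
new_df <- dataset[order(dataset$mean, dataset$State, dataset$Date), ]
rownames(new_df) <- NULL
new_df
```

With the sample data the states come out as Alabama, Arizona, Alaska, which matches the desired output in the question.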
Upvotes: 1
Reputation: 18437
You can use plyr::arrange
arrange(ddply(df, .(State), mutate, mean = mean(Count)), mean)
## Date State Count mean
## 1 1994-01-05 Alabama 408 581.33
## 2 1994-01-06 Alabama 784 581.33
## 3 1994-02-08 Alabama 552 581.33
## 4 1994-01-05 Arizona 385 759.67
## 5 1994-01-06 Arizona 1845 759.67
## 6 1994-02-08 Arizona 49 759.67
## 7 1994-01-05 Alaska 1067 3224.67
## 8 1994-01-06 Alaska 36 3224.67
## 9 1994-02-08 Alaska 8571 3224.67
Just for fun, I'll add the dplyr solution:
detach(package:plyr)
library(dplyr)
# %>% replaces the old %.% operator, which later dplyr releases removed
df %>%
  group_by(State) %>%
  mutate(mean = mean(Count)) %>%
  arrange(mean)
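As for the `reorder()` attempt in the question: as far as I can tell it did not help because `reorder()` returns a factor whose levels are ordered by the summary statistic, and it never moves any rows. If you do want to go that route, a rough sketch (again with the sample data rebuilt by hand, and `sorted` being just a name I picked) is to relevel the factor first and then sort by it:

```r
# Sample data from the question, rebuilt by hand for illustration
df <- data.frame(
  Date  = rep(c("1994-01-05", "1994-01-06", "1994-02-08"), times = 3),
  State = rep(c("Alabama", "Alaska", "Arizona"), each = 3),
  Count = c(408, 784, 552, 1067, 36, 8571, 385, 1845, 49)
)

# reorder() only changes the factor levels; the rows stay where they are
df$State <- reorder(factor(df$State), df$Count, FUN = mean)
levels(df$State)
# "Alabama" "Arizona" "Alaska"

# Sorting by the releveled factor is what actually moves the rows
sorted <- df[order(df$State), ]
sorted
```

This also has the side effect that plots and tables built from `State` will follow the same mean-based order.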
Upvotes: 0