Reputation: 15
I have hourly weather collected for hundreds of farms for a period of five weeks before a sampling event. I want to determine the average Air_Temp for the three weeks prior to the sampling event. Currently, my data are out of order. I want to group by each farm (denoted in File), and then have all of the data in ascending order by Date and Hour. In other words, I want each File to be in order. Here is an example of my data (a dataframe called Weather):
File Status Hour Air_Temp Dew_Temp Pressure Wind_Dir
1 results_1_farm-19 1 21 24.1 16.5 NA 190
2 results_1_farm-19 1 22 23.0 16.8 NA 0
3 results_1_farm-19 1 23 19.8 16.4 NA 0
4 results_1_farm-19 1 0 17.4 15.8 NA 0
5 results_1_farm-19 1 1 19.0 17.2 NA 170
Wind_Speed Sky Rain_1 Rain_6 Date
1 2.1 7 NA NA 2013-01-14
2 0.0 4 NA NA 2013-01-14
3 0.0 0 NA NA 2013-01-14
4 0.0 0 NA NA 2013-01-15
5 1.5 0 NA NA 2013-01-15
It looks like it's in order, but when you scroll through you'll see that the dates are out of order.
So, I'm trying to use dplyr to tell R to arrange the data by Date and Hour with this:
Weather1 <- Weather %>%
  group_by(File) %>%
  arrange(Date, Hour)
However, it seems like arrange has ignored the group_by function. In some cases I have data for two farms with the same Hour and Date. Instead of putting each farm in order, it has put the df in order of Date and Hour.
Am I misunderstanding what group_by will do? Thank you for any help.
Upvotes: 1
Views: 1432
Reputation: 686
In addition to my comments, you can also do the following:
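# Sort first, then group. Note that group_by() does not reorder rows,
# so they stay in Date/Hour order across all farms.
sorted <- Weather %>%
  arrange(Date, Hour) %>%
  group_by(File)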
Upvotes: 0
Reputation: 3379
group_by shouldn't be necessary for this; it's typically used when you want to compute some kind of aggregate on your data. Instead, pass File as the first sort key: arrange will sort first by File, then by Date within each File, then by Hour within each Date. This should give you the structure you're looking for.
Weather1 <- Weather %>%
  arrange(File, Date, Hour)
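As a quick check, here is a minimal sketch of the same call on a toy data frame. It assumes dplyr is installed; the farm names and values are invented for illustration and are not the asker's data.

```r
library(dplyr)

# Toy data (invented): two farms with overlapping dates and hours, shuffled
Weather <- data.frame(
  File = c("farm-b", "farm-a", "farm-b", "farm-a", "farm-a", "farm-b"),
  Date = as.Date(c("2013-01-15", "2013-01-14", "2013-01-14",
                   "2013-01-15", "2013-01-14", "2013-01-14")),
  Hour = c(0, 22, 23, 0, 21, 22),
  Air_Temp = c(17.4, 23.0, 19.8, 17.0, 24.1, 22.5),
  stringsAsFactors = FALSE
)

# Sort by farm first, then chronologically within each farm
Weather1 <- Weather %>%
  arrange(File, Date, Hour)

print(Weather1)
```

All rows for one farm end up together, and within each farm they run in Date and Hour order, even where two farms share the same Date and Hour.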
Upvotes: 1
Reputation: 17309
I am using version ‘0.5.0.9001’ of dplyr (a pre-release of 0.6.0). The new version will be released soon. For a grouped df, arrange will ignore grouping information by default:
## S3 method for class 'grouped_df'
arrange(.data, ..., .by_group = FALSE)
So you would have to manually set .by_group = TRUE in order to tell arrange that the df is grouped:
Weather1 <- Weather %>%
  group_by(File) %>%
  arrange(Date, Hour, .by_group = TRUE)
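To see the difference, here is a small sketch comparing the two behaviours. It assumes a dplyr version in which arrange accepts the .by_group argument; the toy data is invented for illustration.

```r
library(dplyr)

# Toy data (invented): two farms interleaved in time
Weather <- data.frame(
  File = c("farm-b", "farm-a", "farm-b", "farm-a"),
  Date = as.Date(c("2013-01-14", "2013-01-14", "2013-01-15", "2013-01-15")),
  Hour = c(22, 23, 0, 1),
  stringsAsFactors = FALSE
)

# Default: grouping is ignored, rows are ordered by Date and Hour only
ignored <- Weather %>%
  group_by(File) %>%
  arrange(Date, Hour)

# .by_group = TRUE: the grouping variable becomes the first sort key
by_group <- Weather %>%
  group_by(File) %>%
  arrange(Date, Hour, .by_group = TRUE)

print(ignored$File)
print(by_group$File)
```

With the default, the two farms stay interleaved. With .by_group = TRUE, each farm's rows are contiguous and ordered by Date and Hour, which is the same row order as arrange(File, Date, Hour) on the ungrouped data.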
Upvotes: 1