Reputation: 57
I would like to find the monthly usage of every aircraft (identified by tailnum). Say this is needed for a maintenance activity that has to be done after x number of trips.
At the moment I am doing it like this:
library(dplyr)
library(nycflights13)
# Monthly trip counts for a single aircraft
N14228 <- filter(flights, tailnum == "N14228")
by_month <- group_by(N14228, month)
usage <- summarise(by_month, freq = n())
freq_by_months <- arrange(usage, desc(freq))
This has to be done for every aircraft, and the approach above won't scale because there are 4044 distinct tailnums.
I went through the dplyr vignette and found an example that comes very close, but it is aimed at finding overall delays, as shown below:
flights %>%
  group_by(year, month, day) %>%
  select(arr_delay, dep_delay) %>%
  summarise(
    arr = mean(arr_delay, na.rm = TRUE),
    dep = mean(dep_delay, na.rm = TRUE)
  ) %>%
  filter(arr > 30 | dep > 30)
Apart from this, I tried using aggregate and apply but couldn't get the desired results.
Upvotes: 0
Views: 568
Reputation: 1642
Check out the data.table package.
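library(data.table)
library(nycflights13)  # provides the flights data used in the question
flt <- data.table(flights)
# Count rows (trips) for each tailnum/month combination
flt[, .N, by = c("tailnum", "month")]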
       tailnum month  N
    1:  N14228     1 15
    2:  N24211     1 14
    3:  N619AA     1  1
    4:  N804JB     1 29
    5:  N668DN     1  4
   ---
37984:  N225WN     9  1
37985:  N528AS     9  1
37986:  N3KRAA     9  1
37987:  N841MH     9  1
37988:  N924FJ     9  1
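Here, .N is a special data.table symbol that holds the number of rows in each group, so grouping by tailnum and month gives the number of trips per aircraft per month.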
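To see what .N does on a small scale, here is a minimal sketch on made-up data. The table `trips` and its values are hypothetical and only stand in for the real flights data; the grouping call has the same shape as the one above.

```r
library(data.table)

# Hypothetical toy data: aircraft "A" flies twice in month 1 and once in
# month 2; aircraft "B" flies once in month 1.
trips <- data.table(
  tailnum = c("A", "A", "B", "A"),
  month   = c(1, 1, 1, 2)
)

# .N evaluates to the number of rows in each (tailnum, month) group
counts <- trips[, .N, by = .(tailnum, month)]
print(counts)
```

The result has one row per tailnum/month pair and a column N holding the trip count for that pair.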
Not sure if this is exactly what you're looking for, but regardless, for these kinds of counts, it's hard to beat data.table for execution speed and syntactical simplicity.
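If you would rather stay with dplyr, as in the question, the same per-aircraft, per-month count can probably be written as below. This is a sketch rather than a tested answer to the maintenance problem: the column name `freq` is carried over from the question, and dropping rows with a missing tailnum is my own assumption about what you want.

```r
library(dplyr)
library(nycflights13)

usage <- flights %>%
  filter(!is.na(tailnum)) %>%                 # assumption: ignore flights with no tail number
  count(tailnum, month, name = "freq") %>%    # one row per aircraft per month
  arrange(tailnum, desc(freq))                # busiest month first within each aircraft

head(usage)
```

From there, something like filter(usage, freq >= x), with x being your trip threshold, would pick out the aircraft/month pairs that are due for maintenance.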
Upvotes: 2