vinay

Reputation: 57

Find monthly plane/aircraft usage from the nycflights13 data set

I would like to find the monthly usage of every aircraft (identified by tailnum). Let's say this is needed for a maintenance activity that has to be carried out after x trips.

At the moment I am doing it like this:

    library(nycflights13)
    library(dplyr)

    # Monthly flight counts for a single aircraft
    N14228 <- filter(flights, tailnum == "N14228")
    by_month <- group_by(N14228, month)
    usage <- summarise(by_month, freq = n())
    freq_by_months <- arrange(usage, desc(freq))

This has to be done for every aircraft, and the approach above won't scale, as there are 4044 distinct tailnums.

I went through the dplyr vignette and found an example that comes very close, but it is aimed at finding overall delays, as shown below:

    flights %>%
      group_by(year, month, day) %>%
      select(arr_delay, dep_delay) %>%
      summarise(
        arr = mean(arr_delay, na.rm = TRUE),
        dep = mean(dep_delay, na.rm = TRUE)
      ) %>%
      filter(arr > 30 | dep > 30)

Apart from this, I tried using aggregate and apply but couldn't get the desired results.

Upvotes: 0

Views: 568

Answers (1)

Alexey Shiklomanov

Reputation: 1642

Check out the data.table package.

    library(data.table)
    library(nycflights13)

    flt <- as.data.table(flights)
    # Count flights per aircraft per month
    flt[, .N, by = c("tailnum", "month")]
       tailnum month  N
    1:  N14228     1 15
    2:  N24211     1 14
    3:  N619AA     1  1
    4:  N804JB     1 29
    5:  N668DN     1  4
   ---                 
37984:  N225WN     9  1
37985:  N528AS     9  1
37986:  N3KRAA     9  1
37987:  N841MH     9  1
37988:  N924FJ     9  1

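Here, .N is a special data.table symbol that holds the number of rows in each group, so the N column is the number of flights per tailnum per month.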
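To make the grouping behaviour easier to see, here is a minimal sketch on a toy table. The tail numbers "A1" and "B2" and the object names toy and usage are made up for illustration and are not taken from nycflights13.

```r
library(data.table)

# Toy data: hypothetical tail numbers, one row per flight
toy <- data.table(
  tailnum = c("A1", "A1", "A1", "B2", "B2"),
  month   = c(1, 1, 2, 1, 1)
)

# .N is evaluated once per (tailnum, month) group and gives its row count
usage <- toy[, .N, by = .(tailnum, month)]

# A1 has 2 flights in month 1 and 1 flight in month 2; B2 has 2 flights in month 1
print(usage)
```

The by = .(tailnum, month) form is equivalent to passing the column names as a character vector.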

Not sure if this is exactly what you're looking for, but regardless, for these kinds of counts, it's hard to beat data.table for execution speed and syntactical simplicity.
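Since the question already uses dplyr, the same count could probably be sketched there as well. This assumes a dplyr version that supports the name argument of count(); the threshold x = 20 and the names monthly_usage and due are hypothetical, chosen only to illustrate the "maintenance after x trips" idea.

```r
library(dplyr)
library(nycflights13)

# Flights per aircraft per month; rows with a missing tailnum are dropped first
monthly_usage <- flights %>%
  filter(!is.na(tailnum)) %>%
  count(tailnum, month, name = "freq") %>%
  arrange(tailnum, month)

# Hypothetical maintenance threshold: aircraft-months with at least x trips
x <- 20
due <- filter(monthly_usage, freq >= x)
```

count(tailnum, month) is shorthand for group_by(tailnum, month) followed by summarise(freq = n()), so this is the single-aircraft code from the question without the filter on one tailnum.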

Upvotes: 2
