Krishangi Goswami

Reputation: 69

How to sort monthly data from dates using R?

Part of my dataset

This is an airline dataset covering 2014 to 2018, with several carriers flying on any given date.

From this, I want a count of CANCELLATION, a binary column where 0 means not canceled and 1 means canceled, grouped by OP_CARRIER and by month.

I am new to R. So far I can only do these operations separately, such as counting with table() and grouping by OP_CARRIER.

Any help will be much appreciated. Thank you.

Upvotes: 0

Views: 65

Answers (3)

jeffverboon

Reputation: 336

You need to create a month column first (I am assuming your date column is currently just a string):

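library(dplyr)

df %>%
  mutate(FL_DATE = as.POSIXct(FL_DATE)) %>%      # parse the date string
  mutate(month = format(FL_DATE, "%B")) %>%      # full month name, e.g. "January"
  group_by(month, OP_CARRIER) %>%
  summarise(cancelations = sum(CANCELLATION))    # summing 0/1 counts the cancellations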

This pools each month across all years, so if you want a separate count per year, add mutate(year = format(FL_DATE, "%Y")) to the pipeline and change the grouping to group_by(month, year, OP_CARRIER).
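The per-year variant described above could look roughly like the sketch below. The toy data frame is made up purely for illustration, and the dates are assumed to be in an unambiguous "YYYY-MM-DD" form. The `.groups = "drop"` argument assumes dplyr 1.0.0 or later.

```r
library(dplyr)

# Toy data standing in for the real dataset (values are invented for illustration)
df <- data.frame(
  FL_DATE      = c("2014-01-03", "2014-01-15", "2015-01-07", "2015-02-10"),
  OP_CARRIER   = c("AA", "AA", "AA", "DL"),
  CANCELLATION = c(1, 0, 1, 1)
)

monthly <- df %>%
  mutate(FL_DATE = as.Date(FL_DATE),           # assumes "YYYY-MM-DD" strings
         year    = format(FL_DATE, "%Y"),
         month   = format(FL_DATE, "%m")) %>%  # numeric month sorts chronologically
  group_by(year, month, OP_CARRIER) %>%
  summarise(cancelations = sum(CANCELLATION), .groups = "drop")

print(monthly)
```

Using "%m" instead of "%B" here is a design choice: month names sort alphabetically, while two-digit month numbers keep the rows in calendar order.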

Upvotes: 2

Hansel Palencia

Reputation: 1046

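Using dplyr, with month() from lubridate:

library(dplyr)
library(lubridate)  # provides month()

# Counts rows for each carrier, cancellation status (0/1), and month
df %>% 
  group_by(OP_CARRIER, CANCELLATION, month = month(as.Date(FL_DATE))) %>% 
  summarise(count = n())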

Upvotes: 1

akrun

Reputation: 887851

One option is rowsum in base R, since CANCELLATION is a binary variable and its sum per carrier is the number of cancellations:

rowsum(df1$CANCELLATION, group = df1$OP_CARRIER)

With dplyr, if we also need to group by month:

library(dplyr)
library(lubridate)
df1 %>%
    group_by(OP_CARRIER, month = month(as.Date(FL_DATE))) %>%
    summarise(CANCELLATION = sum(CANCELLATION))
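For comparison, the same monthly grouping can be sketched in base R with aggregate(), which avoids any package dependency. This is only an illustration: the toy data frame is invented, and the year-month key ("%Y-%m") is an assumption made here so that the same month in different years is not pooled.

```r
# Toy data standing in for the real dataset (values are invented for illustration)
df1 <- data.frame(
  FL_DATE      = c("2014-01-03", "2014-01-15", "2014-02-07", "2014-02-10"),
  OP_CARRIER   = c("AA", "AA", "AA", "DL"),
  CANCELLATION = c(1, 1, 0, 1)
)

# Year-month key, e.g. "2014-01"; assumes dates are "YYYY-MM-DD" strings
df1$month <- format(as.Date(df1$FL_DATE), "%Y-%m")

# Sum the 0/1 column within each carrier and month
res <- aggregate(CANCELLATION ~ OP_CARRIER + month, data = df1, FUN = sum)
print(res)
```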

Upvotes: 1
