Valentin Franke

Reputation: 95

Ordering dates in R with lubridate

I have the following character vector of month-year labels and want to sort it chronologically.

time <- c("Feb (2019)", "Apr (2019)", "Jun (2019)", "Aug (2019)", "Oct (2019)", "Dec (2019)", "Feb (2020)", "Apr (2020)", "Jun (2020)", "Aug (2020)", "Oct (2020)", "Dec (2020)", "Jan (2019)", "Mar (2019)", "May (2019)", "Jul (2019)", "Sep (2019)", "Nov (2019)", "Jan (2020)", "Mar (2020)", "May (2020)", "Jul (2020)", "Sep (2020)", "Nov (2020)", "Jan (2021)")

To do this, I use the following code:

library(lubridate)
library(dplyr)  # provides arrange()

time <- gsub("[()]", "", time)
time <- (my(time))
time <- as.data.frame(as.integer(gsub("[-]", "", time)))
arrange(time)

Unfortunately, I don't get the result I want:

1                           20190201
2                           20190401
3                           20190601
4                           20190801
5                           20191001
6                           20191201
7                           20200201
8                           20200401
9                           20200601
10                          20200801
11                          20201001
12                          20201201
13                          20190101
14                          20190301
15                          20190501
16                          20190701
17                          20190901
18                          20191101
19                          20200101
20                          20200301
21                          20200501
22                          20200701
23                          20200901
24                          20201101
25                          20210101

I tried various methods and nothing works. I would appreciate your advice!

Upvotes: 0

Views: 570

Answers (2)

Ronak Shah

Reputation: 389155

You can also use zoo::as.yearmon, which parses the original strings (parentheses included) directly:

time[order(zoo::as.yearmon(time, '%b (%Y)'))]

# [1] "Jan (2019)" "Feb (2019)" "Mar (2019)" "Apr (2019)" "May (2019)" "Jun (2019)"
# [7] "Jul (2019)" "Aug (2019)" "Sep (2019)" "Oct (2019)" "Nov (2019)" "Dec (2019)"
#[13] "Jan (2020)" "Feb (2020)" "Mar (2020)" "Apr (2020)" "May (2020)" "Jun (2020)"
#[19] "Jul (2020)" "Aug (2020)" "Sep (2020)" "Oct (2020)" "Nov (2020)" "Dec (2020)"
#[25] "Jan (2021)"
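If zoo is not available, the same ordering can be sketched in base R. This is only an illustration on a shortened sample vector, not part of the answer above: it assumes every label has the form "Mon (YYYY)" with an English three-letter month abbreviation, and the names month, year and sorted are made up for the example. Because it matches against the built-in month.abb constant rather than parsing with "%b", it does not depend on the session locale.

```r
# Shortened sample in the same "Mon (YYYY)" format as the question
time <- c("Feb (2019)", "Jan (2021)", "Dec (2019)", "Jan (2019)", "Mar (2020)")

# Month number: position of the first three characters in month.abb
month <- match(substr(time, 1, 3), month.abb)

# Year: the four digits between the parentheses
year <- as.integer(sub(".*\\(([0-9]{4})\\).*", "\\1", time))

# Order by year first, then by month within each year
sorted <- time[order(year, month)]
sorted
# [1] "Jan (2019)" "Feb (2019)" "Dec (2019)" "Mar (2020)" "Jan (2021)"
```

As in the zoo version, the original strings are kept and only the ordering is computed from the parsed parts.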

Upvotes: 2

wurli

Reputation: 2758

You can reorder the dates as follows:

library(dplyr)
library(readr)

time <- c("Feb (2019)", "Apr (2019)", "Jun (2019)", "Aug (2019)", "Oct (2019)", 
          "Dec (2019)", "Feb (2020)", "Apr (2020)", "Jun (2020)", "Aug (2020)", 
          "Oct (2020)", "Dec (2020)", "Jan (2019)", "Mar (2019)", "May (2019)", 
          "Jul (2019)", "Sep (2019)", "Nov (2019)", "Jan (2020)", "Mar (2020)", 
          "May (2020)", "Jul (2020)", "Sep (2020)", "Nov (2020)", "Jan (2021)")

tibble(time = time) %>% 
  arrange(parse_date(time, "%b (%Y)"))
#> # A tibble: 25 x 1
#>    time      
#>    <chr>     
#>  1 Jan (2019)
#>  2 Feb (2019)
#>  3 Mar (2019)
#>  4 Apr (2019)
#>  5 May (2019)
#>  6 Jun (2019)
#>  7 Jul (2019)
#>  8 Aug (2019)
#>  9 Sep (2019)
#> 10 Oct (2019)
#> # ... with 15 more rows

If you want to keep the parsed dates, you can add them as a column with mutate(), e.g.

tibble(time = time) %>% 
  mutate(parsed_time = parse_date(time, "%b (%Y)")) %>% 
  arrange(parsed_time)
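The code in the question probably returned the rows unchanged because arrange() was called without naming a column to sort by, in which case dplyr has nothing to order on. A minimal sketch of the same idea with lubridate, as the question asked for, might look like the following. It uses a shortened sample vector, the column name date is made up for the example, and it assumes an English locale, since my() reads month names according to the session locale.

```r
library(lubridate)
library(dplyr)

# Shortened sample in the same format as the question
time <- c("Feb (2019)", "Jan (2021)", "Dec (2019)", "Jan (2019)", "Mar (2020)")

df <- tibble(time = time) %>%
  # Strip the parentheses, then parse "Mon YYYY" as a Date (first of the month)
  mutate(date = my(gsub("[()]", "", time))) %>%
  # Name the column to sort by; arrange() with no columns leaves the rows as they are
  arrange(date)

df
```

Keeping the parsed value as a real Date column, rather than converting it to an integer such as 20190201, means it stays usable for later date arithmetic or plotting.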

Upvotes: 3
