Abdulaziz Alshamsi
Abdulaziz Alshamsi

Reputation: 1

How to split 2 dates that are in one column into 2 columns in R

I have a date column that contains two dates (e.g Start Date 1/1/13 to End Date 12/31/13")and some rows contain different format like (e.g.1/1/13 to 12/31/13), (October to May)

I want to unify the format to MM/DD/YYYY and separate the start and end date into two columns.

How can I get rid of the characters and separate the two dates and place them into two separate columns like the picture attached?

Can this be achieved in R?

Upvotes: 0

Views: 616

Answers (2)

Waldi
Waldi

Reputation: 41210

You could use regular expressions with stringr and lubridate :

df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15"))

df$from <- lubridate::mdy(stringr::str_extract(df$range,"^.*?(?=to)"))  
df$to <- lubridate::mdy(stringr::str_extract(df$range,"(?=to).*?$")) 
df
#>                range       from         to
#> 1 1/1/13 to 12/31/13 2013-01-01 2013-12-31
#> 2 5/5/15 to 10/27/15 2015-05-05 2015-10-27

Created on 2020-09-20 by the reprex package (v0.3.0)

Or without converting to date :

library(dplyr)

df <- data.frame(range = c("1/1/13 to 12/31/13",
                           "5/5/15 to 10/27/15",
                           "October to November"))

df %>% mutate(from = stringr::str_extract(range,"^.*?(?= to)"),
              to = stringr::str_extract(range,"(?<=to ).*?$"))
#>                 range    from       to
#> 1  1/1/13 to 12/31/13  1/1/13 12/31/13
#> 2  5/5/15 to 10/27/15  5/5/15 10/27/15
#> 3 October to November October November

Created on 2020-09-20 by the reprex package (v0.3.0)

Upvotes: 1

Dylan_Gomes
Dylan_Gomes

Reputation: 2232

You should be able to do it with just regular expressions and back referencing Backreference in R.

dat<-data.frame(datestring=c("November to March","11/1/2001 to 12/8/2001"))

dat$from <- gsub("(.*) to (.*)","\\1",dat$datestring) # creates a new column 'from' that takes the first thing (before the 'to') 
dat$to <- gsub("(.*) to (.*)","\\2",dat$datestring) # creates a new column 'to' that takes the second thing (after the 'to')

dat

              datestring      from        to
1      November to March  November     March
2 11/1/2001 to 12/8/2001 11/1/2001 12/8/2001

Upvotes: 1

Related Questions