Eric
Eric

Reputation: 1389

parsing date times in dplyr

I need to parse multiple dates and times. Here is an example:

library(dplyr)
library(tidyr)
library(lubridate)

date.called <- c("28-Feb","10/1/16","8/13/15")
call.times <- c("912","1513","922, 1810")
df<- data.frame(call.times,date.called,stringsAsFactors = F )
df <- df %>% 
separate(call.times, c("time.1", "time.2"), sep = "\\,", remove=FALSE) %>%
  mutate(time.1 = trimws(time.1), time.2 = trimws(time.2))
#parse each new date and time
df %>% mutate(time.1 = ifelse( !is.na(time.1) ,
                             parse_date_time(sprintf("%s %04s", date.called, 
                             gsub('^([0-9]+)([0-9]{2})$', '\\1:\\2', time.1)), 
                             c('%m/%d/%y %H:%M','%d-%b %H%M')), time.1),
    time.2 = ifelse( !is.na(time.2) ,
                             parse_date_time(sprintf("%s %04s", date.called, 
                             gsub('^([0-9]+)([0-9]{2})$', '\\1:\\2', time.2)), 
                             c('%m/%d/%y %H:%M','%d-%b %H%M')), time.2) )

I get the following output for df:

  call.times       time.1     time.2 date.called
1        912 -62162174880       <NA>      28-Feb
2       1513   1475334780       <NA>     10/1/16
3  922, 1810   1439457720 1439489400     8/13/15

Instead, I'm trying to generate this output:

  call.times              time.1              time.2 date.called
1        912 0000-02-28 09:12:00                <NA>      28-Feb
2       1513 2016-10-01 15:13:00                <NA>     10/1/16
3  922, 1810 2015-08-13 09:22:00 2015-08-13 18:10:00     8/13/15

Thanks for the help.

Upvotes: 2

Views: 1248

Answers (2)

h3rm4n
h3rm4n

Reputation: 4187

As LukeA said in the comments, ifelse strips the attributes and thus removes the date formatting. You can circumvent this behaviour by moving the parse_date_time function outside the ifelse. A modified version of your code gives then your desired result:

df %>% 
  mutate(time.1 = parse_date_time(ifelse(!is.na(time.2), 
                                         sprintf("%s %04s", date.called, gsub('^([0-9]+)([0-9]{2})$', '\\1:\\2', time.1)),
                                         NA), 
                                  c('%m/%d/%y %H:%M','%d-%b %H%M')),
         time.2 = parse_date_time(ifelse(!is.na(time.2), 
                                         sprintf("%s %04s", date.called, gsub('^([0-9]+)([0-9]{2})$', '\\1:\\2', time.2)),
                                         NA), 
                                  c('%m/%d/%y %H:%M','%d-%b %H%M'))
         )

the result:

  call.times              time.1              time.2 date.called
1        912 0000-02-28 09:12:00                <NA>      28-Feb
2       1513 2016-10-01 15:13:00                <NA>     10/1/16
3  922, 1810 2015-08-13 09:22:00 2015-08-13 18:10:00     8/13/15

Upvotes: 0

lukeA
lukeA

Reputation: 54237

I'd do

df %>% mutate_at(vars(starts_with("time")), function(x) 
  parse_date_time(
    sprintf("%s %04s", date.called, gsub('^([0-9]+)([0-9]{2})$', '\\1:\\2', x)), 
    c('%m/%d/%y %H:%M','%d-%b %H%M')
))
#   call.times              time.1              time.2 date.called
# 1        912 0000-02-28 09:12:00                <NA>      28-Feb
# 2       1513 2016-10-01 15:13:00                <NA>     10/1/16
# 3  922, 1810 2015-08-13 09:22:00 2015-08-13 18:10:00     8/13/15

Upvotes: 4

Related Questions