joker21
joker21

Reputation: 339

How to parse d character to different date format in R?

I have a data table as

 V1                V2                V3
1: 3 09/09/17 21:31:23 09/09/17 21:31:23
2: 3 09/09/17 22:06:47 09/09/17 22:06:47
3: 3 09/09/17 22:14:29 09/09/17 22:18:17

from this I want to tranform it to

 V1                V2                V3
1: 3 2017-09-09 21:31:23 2017-09-09 21:31:23
2: 3 2017-09-09 22:06:47 2017-09-09 22:06:47
3: 3 2017-09-09 22:14:29 2017-09-09 22:18:17

The V2 and V3 columns are characters.

I tried by

DT[,V2:=parse_date_time2(V2,orders="YmdHMS",tz="NA")]
DT[,V3:=parse_date_time2(V3,orders="YmdHMS",tz="NA")]
DT[,V2:=strftime(V2, format="%Y/%m/%d %H:%M:%S"]
DT[,V3:=strftime(V3, format="%Y/%m/%d %H:%M:%S"]

But did not work. Any help is appreciated.

Upvotes: 0

Views: 160

Answers (2)

Uwe
Uwe

Reputation: 42544

Using the wrong date time format is a very common error so I guess there must be a bunch of duplicate questions on SO.

However, The OP is asking for help for a combination of data.table and lubridate's parse_date_time2() function. parse_date_time2() is a fast C parser of numeric orders according to ?parse_date_time2.

library(data.table)
library(lubridate)
options(datatable.print.class = TRUE)

DT[, V2 := parse_date_time2(V2, orders = "mdyHMS")]
DT[, V3 := parse_date_time2(V3, orders = "mdyHMS")]
DT
      V1                  V2                  V3
   <int>              <POSc>              <POSc>
1:     3 2017-09-09 21:31:23 2017-09-09 21:31:23
2:     3 2017-09-09 22:06:47 2017-09-09 22:06:47
3:     3 2017-09-09 22:14:29 2017-09-09 22:18:17

The presumably correct format string "mdyHMS" is passed to orders while the OP used "YmdHMS". (Perhaps, the OP has specified the order of date formats in the desired output but not as they appear in the input?)

I say presumably correct format string because the date string 09/09/17 is ambiguous without additional information. The string can be interpreted as mdy, dmy, ymd, or myd.

The whole operation can be coded more concisely as

cols = c("V2", "V3")
DT[, (cols) := lapply(.SD, mdy_hms), .SDcols = cols]

Data

library(data.table)
DT <- fread("
3, 09/09/17 21:31:23, 09/09/17 21:31:23
3, 09/09/17 22:06:47, 09/09/17 22:06:47
3, 09/09/17 22:14:29, 09/09/17 22:18:17")

Upvotes: 1

Jake Kaupp
Jake Kaupp

Reputation: 8072

Using dplyr and lubridate:

library(lubridate)
library(dplyr)

df %>% 
  mutate_at(2:3, dmy_hms)

    # A tibble: 3 x 3
     V1                  V2                  V3
  <chr>              <dttm>              <dttm>
1  1: 3 2017-09-09 21:31:23 2017-09-09 21:31:23
2  2: 3 2017-09-09 22:06:47 2017-09-09 22:06:47
3  3: 3 2017-09-09 22:14:29 2017-09-09 22:18:17

Upvotes: 0

Related Questions