Reading date columns using the fread function of the data.table package

Question

I am reading some .csv files that contain dates in a non-standard format such as m/d/Y. With the fread function, the dates are read as character in the form m/d/y, losing the four-digit year. I tried using colClasses to read the date columns as Date, but they are still read as character with a two-digit year, along with a warning that the dates are not in a standard unambiguous format. See the example below.

library(data.table)
library(dplyr)

set.seed(1)

# Build sample data: five random month-start dates plus a two-digit-year string version
DT <- data.table(orig_beg_date = sample(seq.Date(from = as.Date("1900-01-01"),
                                                 to = Sys.Date(), by = "month"),
                                        5)) %>%
  mutate(beg_date = format(orig_beg_date, "%m/%d/%y"))

DT
   orig_beg_date beg_date
          <Date>   <char>
1:    1984-09-01 09/01/84
2:    1956-07-01 07/01/56
3:    1910-09-01 09/01/10
4:    1977-06-01 06/01/77
5:    1939-03-01 03/01/39

When I convert the beg_date column to Date, I get the following:

DT$beg_date %>% as.Date(., format = "%m/%d/%y")
[1] "1984-09-01" "2056-07-01" "2010-09-01" "1977-06-01" "2039-03-01"

The year of the second date changes from 1956 to 2056. The third date has the same problem: it changes from 1910 to 2010. The csv file shows these two dates as 7/1/1956 and 9/1/1910. How can I read the dates correctly using colClasses?
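
One possible workaround, sketched here under the assumption that the csv file itself stores four-digit years: read the column as character and convert it afterwards with %Y instead of %y. R's documentation for strptime states that, on input, %y maps 00 to 68 to 20xx and 69 to 99 to 19xx, which matches the century shift seen above. The csv_text string and the id column are hypothetical stand-ins for the real file.

```r
library(data.table)

# Hypothetical file content with four-digit years (stand-in for the real csv)
csv_text <- "id,beg_date\n1,9/1/1984\n2,7/1/1956\n3,9/1/1910"

# Read the date column as plain character so nothing is guessed about its format
DT2 <- fread(text = csv_text, colClasses = list(character = "beg_date"))

# Convert explicitly; %Y expects a four-digit year, so no century is inferred
DT2[, beg_date := as.Date(beg_date, format = "%m/%d/%Y")]

DT2$beg_date
```

With a real file, text = csv_text would be replaced by the file path. If the file only contains two-digit years, the century cannot be recovered from the format string alone and would need a separate rule.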
