cJc

Reputation: 863

Converting dates from imported CSV file

I'm importing time series data from a CSV file, and one of the columns contains dates in the format DD/MM/YYYY. The column's class is character, or factor if I set the "Strings as factors" option to TRUE. I convert the imported file to a data frame and then run the following:

 df$Date <- as.Date(df$Date , "%d/%m/%y")

I get no error message, but the dates come out wrong: they are shown in the format YYYY-MM-DD and every year is 2020.

Before:
10/09/2009
11/09/2009
14/09/2009

After:
2020-09-10
2020-09-11
2020-09-14

Upvotes: 0

Views: 2903

Answers (2)

Brant Mullinix

Reputation: 137

You are using %y when it should be %Y. From the documentation for strptime (see ?strptime):

%y Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.

%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see http://en.wikipedia.org/wiki/0_(year). Note that the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.

Re-import the data first, so that the data frame is not still modified by a previous attempt, and then use:

 df$Date <- as.Date(df$Date , "%d/%m/%Y")
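To see why every year became 2020: as far as I can tell, %y reads only the first two digits of "2009" (20, which is mapped to 2020) and the leftover "09" is silently ignored. Here is a minimal sketch using the three sample dates from the question. The names dates_chr, wrong, right and from_factor are made up for illustration.

```r
# Sample values copied from the question (DD/MM/YYYY as character strings)
dates_chr <- c("10/09/2009", "11/09/2009", "14/09/2009")

# %y expects a two-digit year, so only "20" is consumed from "2009"
wrong <- as.Date(dates_chr, format = "%d/%m/%y")

# %Y expects the full four-digit year
right <- as.Date(dates_chr, format = "%d/%m/%Y")

# If the column was imported as a factor, converting to character first
# makes sure the labels (not the integer codes) are parsed
from_factor <- as.Date(as.character(factor(dates_chr)), format = "%d/%m/%Y")

print(wrong)
print(right)
```

Note that a Date always prints as YYYY-MM-DD; that is only how it is displayed, and format(right, "%d/%m/%Y") gives back the original layout as a string if you need it.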

Upvotes: 1

Samy Geronymos

Reputation: 405

@Heroka is right.

If you ever need it, you could also use POSIXct objects (they store a time of day as well as the date, down to the second).

Try this:

df$Date.time <- as.POSIXct(df$Date , format="%d/%m/%Y")

If you want the date and time as strings, you can try the following:

df$Date.time <- format(as.POSIXct(df$Date , format="%d/%m/%Y"),format="%Y-%m-%d %H:%M")

or

df$Date <- format(as.POSIXct(df$Date , format="%d/%m/%Y"),format="%Y-%m-%d")
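One thing to keep in mind with the format() variants above is that the result is a character vector, not a date-time object any more. A small sketch of the difference, assuming the same sample dates as in the question; the variable names are invented for illustration, and tz = "UTC" is my own addition so that the result does not depend on the local time zone:

```r
# Sample values copied from the question
dates_chr <- c("10/09/2009", "11/09/2009", "14/09/2009")

# Parse to POSIXct; with no time in the input, the time of day is midnight
stamp <- as.POSIXct(dates_chr, format = "%d/%m/%Y", tz = "UTC")

# format() turns the date-times back into plain strings
stamp_chr <- format(stamp, format = "%Y-%m-%d %H:%M")

class(stamp)      # "POSIXct" "POSIXt"
class(stamp_chr)  # "character"
print(stamp_chr)
```

So if you still want to sort, subtract, or plot the dates, it is probably better to keep the POSIXct (or Date) column and only call format() when you display or export the values.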

Upvotes: 0
