Reputation: 17
I am working through the EDA Case Study and have run into a date conversion issue: the values change after I call as.Date.
dates <- pm0$Date
> str(dates)
int [1:1304287] 20120101 20120104 20120107 20120110 20120113 20120116 20120119 20120122 20120125 20120128 ...
dates <- as.Date(as.character(dates), "%y%m%d")
> str(dates)
Date[1:1304287], format: "2020-12-01" "2020-12-01" "2020-12-01" "2020-12-01" "2020-12-01" ...
## The date changes from 20120101 to 2020-12-01, and every result is the same.
## If I change the format to "%y-%m-%d", the result is NA.
Upvotes: 1
Views: 43
Reputation: 887951
%y specifies a 2-digit year; for a 4-digit year we need %Y.
as.Date("20120101", "%Y%m%d")
#[1] "2012-01-01"
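If we use %y, it matches only the first two digits, 20, as the year and prefixes them with the century 20 (the default for values 00 to 68), giving 2020. Then %m matches the next two digits, 12, and %d matches the following 01. The trailing 01 is left unparsed. Every value shown in the question begins with 201201, which is why they all come out as 2020-12-01.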
as.Date("20120101", "%y%m%d")
#[1] "2020-12-01"
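Applied to an integer column like the one in the question, the fix might look like the sketch below. The vector dates_int is a made-up stand-in for pm0$Date, built from the first few values shown in the question's str() output. The last call illustrates why the hyphenated format in the question returns NA: the input contains no "-" for the format to match.

```r
# Hypothetical stand-in for pm0$Date (first values from the question's str() output)
dates_int <- c(20120101L, 20120104L, 20120107L)

# %Y consumes all four year digits, so month and day line up correctly
dates <- as.Date(as.character(dates_int), format = "%Y%m%d")
dates
#[1] "2012-01-01" "2012-01-04" "2012-01-07"

# Round-trip check: formatting back should reproduce the original integers
all(as.integer(format(dates, "%Y%m%d")) == dates_int)
#[1] TRUE

# The hyphenated format from the question fails because the input has no "-"
as.Date("20120101", format = "%y-%m-%d")
#[1] NA
```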
This is also documented in ?strptime:
%y Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.
%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see https://en.wikipedia.org/wiki/0_(year). Note that the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.
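The century rule quoted for %y can be checked directly. A small sketch with made-up input strings on either side of the 68/69 cutoff:

```r
# Two-digit years 00-68 are read as 20xx, 69-99 as 19xx
pivot <- as.Date(c("680101", "690101"), format = "%y%m%d")
pivot
#[1] "2068-01-01" "1969-01-01"
```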
Although the format codes themselves are not documented in ?as.Date, its description of the format argument links to strptime:
format
character string. If not specified, it will try tryFormats one by one on the first non-NA element, and give an error if none works. Otherwise, the processing is via strptime.
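To see what that means in practice, here is a rough sketch of as.Date with no format supplied. It assumes R 3.5.0 or later, where the tryFormats argument is available with the defaults "%Y-%m-%d" and "%Y/%m/%d". The variable names are only for illustration.

```r
# With no format, as.Date tries the default tryFormats in turn
ok <- as.Date("2012-01-01")
ok
#[1] "2012-01-01"

# "20120101" matches neither default, so as.Date raises an error;
# tryCatch turns that error into a marker string here
res <- tryCatch(as.Date("20120101"), error = function(e) "error")
res
#[1] "error"

# Supplying the compact layout through tryFormats parses it
compact <- as.Date("20120101", tryFormats = "%Y%m%d")
compact
#[1] "2012-01-01"
```

For a column in a known layout, passing format = "%Y%m%d" explicitly is probably the safer choice, since tryFormats picks a format based on the first non-NA element only.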
Upvotes: 5