James Ho

Reputation: 17

R: as.Date returns NA or a different date

I am working through the EDA Case Study and have run into a date conversion issue: the values change after I call as.Date.

dates <- pm0$Date
> str(dates)
 int [1:1304287] 20120101 20120104 20120107 20120110 20120113 20120116 20120119 20120122 20120125 20120128 ...

dates <- as.Date(as.character(dates), "%y%m%d")
> str(dates)
 Date[1:1304287], format: "2020-12-01" "2020-12-01" "2020-12-01" "2020-12-01" "2020-12-01" ...

## The date changes from 20120101 to 2020-12-01, and all the results are the same.

## If I change the format to "%y-%m-%d", the result is NA.

Upvotes: 1

Views: 43

Answers (1)

akrun

Reputation: 887951

%y specifies a 2-digit year; a 4-digit year needs %Y.

as.Date("20120101", "%Y%m%d")
#[1] "2012-01-01"
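The same fix can be applied to an integer vector like the one in the question. This is a minimal sketch: the name `dates_int` and its two values are made up here to stand in for `pm0$Date`.

```r
# Hypothetical stand-in for pm0$Date: dates stored as integers in YYYYMMDD form.
dates_int <- c(20120101L, 20120104L)

# Convert to character first, then parse with a 4-digit year (%Y).
dates <- as.Date(as.character(dates_int), format = "%Y%m%d")

print(dates)
#[1] "2012-01-01" "2012-01-04"
```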

With %y, the parser takes only the first two digits, 20, as the year and, by default, prefixes them with the century 20, giving 2020. The month then matches the next two digits, 12, and the day matches the following 01. The trailing 01 is left unparsed.

as.Date("20120101", "%y%m%d")
#[1] "2020-12-01"

This behaviour is documented in ?strptime:

%y Year without century (00–99). On input, values 00 to 68 are prefixed by 20 and 69 to 99 by 19 – that is the behaviour specified by the 2004 and 2008 POSIX standards, but they do also say ‘it is expected that in a future version the default century inferred from a 2-digit year will change’.
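The century rule quoted above can be checked at its boundary. The snippet below is only an illustration; the input strings and the variable names `y68` and `y69` are made up for this example.

```r
# Two-digit years 00 to 68 are read as 20xx.
y68 <- as.Date("680101", format = "%y%m%d")

# Two-digit years 69 to 99 are read as 19xx.
y69 <- as.Date("690101", format = "%y%m%d")

print(c(y68, y69))
#[1] "2068-01-01" "1969-01-01"
```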

%Y Year with century. Note that whereas there was no zero in the original Gregorian calendar, ISO 8601:2004 defines it to be valid (interpreted as 1BC): see https://en.wikipedia.org/wiki/0_(year). Note that the standards also say that years before 1582 in its calendar should only be used with agreement of the parties involved.

Although the conversion specifications are not listed in ?as.Date, that page links to ?strptime for the format details:

format
character string. If not specified, it will try tryFormats one by one on the first non-NA element, and give an error if none works. Otherwise, the processing is via strptime.
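As a rough illustration of the tryFormats behaviour described above, an ISO-style string parses without an explicit format, while a compact string such as "20120101" matches none of the default formats and raises an error. The names `iso` and `compact` are made up for this sketch, and the error is caught with tryCatch so the script keeps running.

```r
# "2012-01-01" matches one of the default tryFormats, so no format is needed.
iso <- as.Date("2012-01-01")

# "20120101" matches none of the defaults, so as.Date() signals an error.
# tryCatch captures the error message as a character string instead.
compact <- tryCatch(
  as.Date("20120101"),
  error = function(e) conditionMessage(e)
)

print(iso)
print(compact)
```

Passing format = "%Y%m%d" explicitly avoids the guesswork for compact strings.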

Upvotes: 5
