Dan

Reputation: 143

Incorrect Conversion of Date as a Factor to a Date

I am having trouble doing date arithmetic on a date column imported from a .csv file. I want to take the date stored in the factor DateClosed and compute a new date offset by a number of days held in another field (a). For example, if a=203, I want the result to be the equivalent of DateClosed-203. However, the code below gives the wrong result.

DateClosed is a factor.

> head(DateClosed)
[1] 7/30/2007  12/12/2007 5/8/2009   6/24/2009  6/24/2009  2/29/2008 
165 Levels: 1/12/2010 1/15/2011 1/15/2013 1/17/2009 1/18/2008 1/19/2012 1/2/2013 1/21/2013 1/22/2010 1/24/2013 1/26/2014 ... 9/7/2010
> head(as.Date(DateClosed,format="%m/%d/%y"))
[1] "2020-07-30" "2020-12-12" "2020-05-08" "2020-06-24" "2020-06-24" "2020-02-29"

> head(as.Date(DateClosed,format="%m/%d/%y"))-203
[1] "2020-01-09" "2020-05-23" "2019-10-18" "2019-12-04" "2019-12-04" "2019-08-10"

It subtracts 203 days correctly, but for some reason every date is read with the year 2020 instead of its actual year.

Upvotes: 4

Views: 6989

Answers (2)

agstudy

Reputation: 121568

Manipulating dates becomes really easy using the lubridate package.

library(lubridate)
mdy(factor(c("7/30/2007","12/12/2007", "5/8/2009")))

"2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"

Or using parse_date_time from the same package:

parse_date_time(factor(c("7/30/2007","12/12/2007", "5/8/2009")),c('mdY'))
[1] "2007-07-30 UTC" "2007-12-12 UTC" "2009-05-08 UTC"

Upvotes: 0

BrodieG

Reputation: 52637

DateClosed <- factor(c("7/30/2007","12/12/2007", "5/8/2009"))
as.Date(DateClosed, format="%m/%d/%Y")

Produces:

[1] "2007-07-30" "2007-12-12" "2009-05-08"

Notice the capital "Y" in the format param. Lowercase "%y" matches a two-digit year, so as.Date reads only the first two digits of the year token ("20") and ignores the trailing "07". R then expands two-digit years by a fixed rule: values 00 to 68 are prefixed with "20" and 69 to 99 with "19". So "20" becomes 2020, and every date ends up in 2020.
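To tie this back to the original goal, here is a minimal sketch of the full calculation in base R. The sample dates are taken from the question, but the offset vector a is made up for illustration (the question only mentions a=203), and the as.character() call is just a way of making the factor-to-string step explicit.

```r
# Sample data standing in for the CSV columns; the values of `a` are assumed.
DateClosed <- factor(c("7/30/2007", "12/12/2007", "5/8/2009"))
a <- c(203, 30, 0)  # hypothetical per-row day offsets

# Parse with a four-digit year (%Y); as.character() makes the
# factor-to-string conversion explicit before parsing.
closed <- as.Date(as.character(DateClosed), format = "%m/%d/%Y")

# Subtracting a number from a Date subtracts that many days, element-wise.
result <- closed - a
print(result)

# For comparison: with %y only "20" is consumed as the year,
# the trailing "07" is ignored, and the parsed year is wrong.
wrong <- as.Date("7/30/2007", format = "%m/%d/%y")
print(format(wrong, "%Y"))
```

If the offsets live in a data frame column, the same subtraction should work on the two columns directly, since both vectors have one entry per row.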

Upvotes: 9
