Peter Chen
Peter Chen

Reputation: 1484

Character format cannot change to Date format

I have a data and there is a column called Date, I input the data into R.
Here is my data:

unique(data$Date)
 [1] ""           "2016/12/20" "2016/12/27" "2017/1/7"   "2017/1/27"  "2017/2/1"   "2017/2/2"   "2017/2/5"   "2017/2/6"   "2017/2/7"  
[11] "2017/2/8"   "2017/2/10"  "2017/2/11"  "2017/2/13"  "2017/2/14"  "2017/2/15"  "2017/2/17"  "2017/2/16"  "2017/2/24"  "2017/2/19" 
[21] "2017/2/21"  "2017/2/20"  "2017/2/26"  "2017/2/22"  "2017/3/2"   "2017/2/25"  "2017/2/28"  "2017/3/1"   "2017/3/4"   "2017/3/5"  
[31] "2017/3/6"   "2017/3/10"  "2017/3/8"   "2017/3/9"   "2017/3/11"  "2017/3/12"  "2017/3/13"  "2017/3/15"  "2017/3/29"  "2017/5/13" 
[41] "2015/10/5"  "2016/2/22"  "2015/3/6"   "2015/3/7"   "2015/10/15" "2015/3/9"   "2016/1/30"  "2015/10/29" "2015/10/24" "2015/10/17"
[51] "2016/1/8"   "2015/9/24"  "2016/2/15"  "2015/12/8"  "2015/12/10" "2016/2/6"   "2015/11/29" "2016/1/23"  "2015/10/11" "2016/2/16" 
[61] "2015/9/28"  "2016/1/29"  "2015/11/27" "2015/10/12" "2015/11/1"  "2015/11/16" "2015/10/10" "2015/11/30" "2016/1/2"   "2016/1/21" 
[71] "2016/4/22"  "2015/10/21" "2015/11/12" "2015/12/28" "2015/12/30" "2015/11/6"  "2015/10/8"  "2015/12/6"  "2016/1/24"  "2016/1/17" 
[81] "2016/2/26"  "2016/3/6"   "2016/2/17"  "2016/1/11"  "2015/12/3"  "2016/2/11"  "2015/11/22" "2015/10/2"  "2015/10/3"  "2015/11/4" 
[91] "2016/2/10"  "2015/12/9"  "2015/10/9"  "2015/12/1"  "2016/2/25"  "2016/1/19"  "2016/1/18"  "2015/12/13" "2016/2/14"  "2016/3/10" 

class(data$Date)
[1] "character"

I change this character to date format with as.Date():

data$Date <- as.Date(data$Date)
Error in charToDate(x) : 
character string is not in a standard unambiguous format

I don't know how to figure out. I think the problem is "" in the data. I also has another column called Date2 but this column doesn't contain "" in it.
Any suggestion?

Also, If I want to do as.Date with two columns at the same time and define specified format like `as.Date(x, "%Y/%m/%d"), how can I do?

data[,c("Date", "Date2") := lapply(.SD, as.Date), .SDcols = c("Date", "Date2")]

Upvotes: 1

Views: 913

Answers (2)

SymbolixAU
SymbolixAU

Reputation: 26248

As established in the comments, the answer is that you need to specify the format of the date you're converting. Which in your case is "%Y/%m/%d"

data$Date <- as.Date(data$Date, "%Y/%m/%d")

Explanation

The reason you need to do this is because the first entry in your vector is "", AND you haven't specified the format.

The as.Date function, when applied on a character, first checks to see if the format argument is missing. If it is, it tries to guess the format based on the first element of the vector.

It tests for both "%Y-%m-%d" AND "%Y/%m/%d" formats by doing

xx <- ""
strptime(xx, "%Y-%m-%d")
NA
strptime(xx, "%Y/%m/%d")
NA

More specifically, it uses the following test (where xx is the first element of your vector)

if(is.na(xx) || 
    !is.na(strptime(xx, f <- "%Y-%m-%d", tz = "GMT")) || 
    !is.na(strptime(xx, f <- "%Y/%m/%d", tz = "GMT"))){
  print("success!")     ## I added this print statement for illustration purposes
}else{
  stop("character string is not in a standard unambiguous format")
}

So as you can see, xx evaluates to FALSE in all the if conditions, so the function has to enter the stop method.

To demonstrate, see the results of these statements

as.Date(c("2015/10/5", ""))  
# [1] "2015-10-05" NA
## SUCCESS, because it can 'guess' the first entry's format

as.Date(c("", "2015/10/5"))  
## ERROR: can't 'guess' the first entry's format

as.Date(c("2015/10/5", ""), format = "%Y/%m/%d") 
# [1] "2015-10-05" NA
## SUCCESS, because you've specified the format

as.Date(c("2015-10/5", "")) 
## ERROR: you haven't specified the format, 
## AND it's not one of the 'guessed' options ("%Y-%m-%d", "%Y/%m/%d")

Upvotes: 2

jdb
jdb

Reputation: 147

You can use the ymd() function in the lubridate package to convert the dates, and the empty strings will be converted to NA. For example,

> library(lubridate)
> (newdates <- ymd(dates))
   [1] NA           "2016-12-20" "2016-12-27" "2017-01-07" "2017-01-27" "2017-02-01" "2017-02-02" "2017-02-05"
   [9] "2017-02-06" "2017-02-07" "2017-02-08" "2017-02-10" "2017-02-11" "2017-02-13" "2017-02-14" "2017-02-15"
   [17] "2017-02-17" "2017-02-16" "2017-02-24" "2017-02-19" "2017-02-21" "2017-02-20" "2017-02-26" "2017-02-22"
   [25] "2017-03-02" "2017-02-25" "2017-02-28" "2017-03-01" "2017-03-04" "2017-03-05" "2017-03-06" "2017-03-10"
   [33] "2017-03-08" "2017-03-09" "2017-03-11" "2017-03-12" "2017-03-13" "2017-03-15" "2017-03-29" "2017-05-13"
   [41] "2015-10-05" "2016-02-22" "2015-03-06" "2015-03-07" "2015-10-15" "2015-03-09" "2016-01-30" "2015-10-29"
   [49] "2015-10-24" "2015-10-17" "2016-01-08" "2015-09-24" "2016-02-15" "2015-12-08" "2015-12-10" "2016-02-06"
   [57] "2015-11-29" "2016-01-23" "2015-10-11" "2016-02-16" "2015-09-28" "2016-01-29" "2015-11-27" "2015-10-12"
   [65] "2015-11-01" "2015-11-16" "2015-10-10" "2015-11-30" "2016-01-02" "2016-01-21" "2016-04-22" "2015-10-21"
   [73] "2015-11-12" "2015-12-28" "2015-12-30" "2015-11-06" "2015-10-08" "2015-12-06" "2016-01-24" "2016-01-17"
   [81] "2016-02-26" "2016-03-06" "2016-02-17" "2016-01-11" "2015-12-03" "2016-02-11" "2015-11-22" "2015-10-02"
   [89] "2015-10-03" "2015-11-04" "2016-02-10" "2015-12-09" "2015-10-09" "2015-12-01" "2016-02-25" "2016-01-19"
   [97] "2016-01-18" "2015-12-13" "2016-02-14" "2016-03-10"
> is.Date(newdates)
 [1] TRUE    

Upvotes: 2

Related Questions