Reputation: 1484
I have a data and there is a column called Date
, I input the data into R.
Here is my data:
unique(data$Date)
[1] "" "2016/12/20" "2016/12/27" "2017/1/7" "2017/1/27" "2017/2/1" "2017/2/2" "2017/2/5" "2017/2/6" "2017/2/7"
[11] "2017/2/8" "2017/2/10" "2017/2/11" "2017/2/13" "2017/2/14" "2017/2/15" "2017/2/17" "2017/2/16" "2017/2/24" "2017/2/19"
[21] "2017/2/21" "2017/2/20" "2017/2/26" "2017/2/22" "2017/3/2" "2017/2/25" "2017/2/28" "2017/3/1" "2017/3/4" "2017/3/5"
[31] "2017/3/6" "2017/3/10" "2017/3/8" "2017/3/9" "2017/3/11" "2017/3/12" "2017/3/13" "2017/3/15" "2017/3/29" "2017/5/13"
[41] "2015/10/5" "2016/2/22" "2015/3/6" "2015/3/7" "2015/10/15" "2015/3/9" "2016/1/30" "2015/10/29" "2015/10/24" "2015/10/17"
[51] "2016/1/8" "2015/9/24" "2016/2/15" "2015/12/8" "2015/12/10" "2016/2/6" "2015/11/29" "2016/1/23" "2015/10/11" "2016/2/16"
[61] "2015/9/28" "2016/1/29" "2015/11/27" "2015/10/12" "2015/11/1" "2015/11/16" "2015/10/10" "2015/11/30" "2016/1/2" "2016/1/21"
[71] "2016/4/22" "2015/10/21" "2015/11/12" "2015/12/28" "2015/12/30" "2015/11/6" "2015/10/8" "2015/12/6" "2016/1/24" "2016/1/17"
[81] "2016/2/26" "2016/3/6" "2016/2/17" "2016/1/11" "2015/12/3" "2016/2/11" "2015/11/22" "2015/10/2" "2015/10/3" "2015/11/4"
[91] "2016/2/10" "2015/12/9" "2015/10/9" "2015/12/1" "2016/2/25" "2016/1/19" "2016/1/18" "2015/12/13" "2016/2/14" "2016/3/10"
class(data$Date)
[1] "character"
I change this character
to date
format with as.Date()
:
data$Date <- as.Date(data$Date)
Error in charToDate(x) :
character string is not in a standard unambiguous format
I don't know how to figure out. I think the problem is ""
in the data. I also has another column called Date2
but this column doesn't contain ""
in it.
Any suggestion?
Also, If I want to do as.Date
with two columns at the same time and define specified format like `as.Date(x, "%Y/%m/%d"), how can I do?
data[,c("Date", "Date2") := lapply(.SD, as.Date), .SDcols = c("Date", "Date2")]
Upvotes: 1
Views: 913
Reputation: 26248
As established in the comments, the answer is that you need to specify the format of the date you're converting. Which in your case is "%Y/%m/%d"
data$Date <- as.Date(data$Date, "%Y/%m/%d")
The reason you need to do this is because the first entry in your vector is ""
, AND you haven't specified the format.
The as.Date
function, when applied on a character, first checks to see if the format
argument is missing. If it is, it tries to guess the format based on the first element of the vector.
It tests for both "%Y-%m-%d"
AND "%Y/%m/%d"
formats by doing
xx <- ""
strptime(xx, "%Y-%m-%d")
NA
strptime(xx, "%Y/%m/%d")
NA
More specifically, it uses the following test (where xx
is the first element of your vector)
if(is.na(xx) ||
!is.na(strptime(xx, f <- "%Y-%m-%d", tz = "GMT")) ||
!is.na(strptime(xx, f <- "%Y/%m/%d", tz = "GMT"))){
print("success!") ## I added this print statement for illustration purposes
}else{
stop("character string is not in a standard unambiguous format")
}
So as you can see, xx
evaluates to FALSE
in all the if
conditions, so the function has to enter the stop
method.
To demonstrate, see the results of these statements
as.Date(c("2015/10/5", ""))
# [1] "2015-10-05" NA
## SUCCESS, because it can 'guess' the first entry's format
as.Date(c("", "2015/10/5"))
## ERROR: can't 'guess' the first entry's format
as.Date(c("2015/10/5", ""), format = "%Y/%m/%d")
# [1] "2015-10-05" NA
## SUCCESS, because you've specified the format
as.Date(c("2015-10/5", ""))
## ERROR: you haven't specified the format,
## AND it's not one of the 'guessed' options ("%Y-%m-%d", "%Y/%m/%d")
Upvotes: 2
Reputation: 147
You can use the ymd()
function in the lubridate
package to convert the dates, and the empty strings will be converted to NA. For example,
> library(lubridate)
> (newdates <- ymd(dates))
[1] NA "2016-12-20" "2016-12-27" "2017-01-07" "2017-01-27" "2017-02-01" "2017-02-02" "2017-02-05"
[9] "2017-02-06" "2017-02-07" "2017-02-08" "2017-02-10" "2017-02-11" "2017-02-13" "2017-02-14" "2017-02-15"
[17] "2017-02-17" "2017-02-16" "2017-02-24" "2017-02-19" "2017-02-21" "2017-02-20" "2017-02-26" "2017-02-22"
[25] "2017-03-02" "2017-02-25" "2017-02-28" "2017-03-01" "2017-03-04" "2017-03-05" "2017-03-06" "2017-03-10"
[33] "2017-03-08" "2017-03-09" "2017-03-11" "2017-03-12" "2017-03-13" "2017-03-15" "2017-03-29" "2017-05-13"
[41] "2015-10-05" "2016-02-22" "2015-03-06" "2015-03-07" "2015-10-15" "2015-03-09" "2016-01-30" "2015-10-29"
[49] "2015-10-24" "2015-10-17" "2016-01-08" "2015-09-24" "2016-02-15" "2015-12-08" "2015-12-10" "2016-02-06"
[57] "2015-11-29" "2016-01-23" "2015-10-11" "2016-02-16" "2015-09-28" "2016-01-29" "2015-11-27" "2015-10-12"
[65] "2015-11-01" "2015-11-16" "2015-10-10" "2015-11-30" "2016-01-02" "2016-01-21" "2016-04-22" "2015-10-21"
[73] "2015-11-12" "2015-12-28" "2015-12-30" "2015-11-06" "2015-10-08" "2015-12-06" "2016-01-24" "2016-01-17"
[81] "2016-02-26" "2016-03-06" "2016-02-17" "2016-01-11" "2015-12-03" "2016-02-11" "2015-11-22" "2015-10-02"
[89] "2015-10-03" "2015-11-04" "2016-02-10" "2015-12-09" "2015-10-09" "2015-12-01" "2016-02-25" "2016-01-19"
[97] "2016-01-18" "2015-12-13" "2016-02-14" "2016-03-10"
> is.Date(newdates)
[1] TRUE
Upvotes: 2