jeffrey

Reputation: 3354

Is there a way to check if a column is a Date in R?

This is a pseudo followup to this question: Why is ggplot graphing null percentage data points?

Let's say this is my dataset:

Date        AE      AA      AEF     Percent
1/1/2012    1211    1000    3556    0.03
1/2/2012    100     2000    3221    0.43
1/3/2012    3423    10000   2343    0.54
1/4/2012    10000   3000    332     0.43
1/5/2012    2342    500     4435    0.43
1/6/2012    2342    800     2342    0.23
1/7/2012    2342    1500    1231    0.12
1/8/2012    111     2300    333 
1/9/2012    1231    1313    3433    
1/10/2012   3453    5654    222 
1/11/2012   3453    3453    454 
1/12/2012   5654    7685    3452 

> str(data)
'data.frame':   12 obs. of  5 variables:
 $ Date   : Factor w/ 12 levels "10/11/2012","10/12/2012",..: 1 2 3 4 5 6 7 8 9 10 ...
 $ AE     : int  1211 100 3423 10000 2342 2342 2342 111 1231 3453 ...
 $ AA     : int  1000 2000 10000 3000 500 800 1500 2300 1313 5654 ...
 $ AEF    : int  3556 3221 2343 332 4435 2342 1231 333 3433 222 ...
 $ Percent: num  0.03 0.43 0.54 0.43 0.43 0.23 0.12 NA NA NA ...

I need something to tell that the 'Date' column is a Date type as opposed to a numeric or character type (this is because I have to convert the 'Date' column of the data input into an actual Date with as.Date(), assuming that I do not know the column names of the data set).

is.numeric(data[[1]]) returns FALSE
is.character(data[[1]]) returns FALSE

I made the 'Date' column in Excel, formatting the column in the 'Date' format, then saved the file as a csv. What type is this in R? I seek an expression similar to the above that returns TRUE.

Upvotes: 42

Views: 72472

Answers (9)

cs-

Reputation: 21

I know this is an old question and that I am not providing a self-developed answer, but perhaps some non-experts in R (like myself) could find it useful to use the skim function (from the skimr package) to check whether one or more variables of a data frame (df) are in the Date format.

The syntax is just skim(df), or skimr::skim(df) if one does not want to load the package just for this check.

The obtained output is a very detailed summary of the data frame in which variables are grouped by format (character, Date, numeric...) and additional info is provided (e.g. whether there are missing values, descriptive statistics, etc).
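If installing skimr is not an option, a rough base R stand-in for the "grouped by type" part could look like the sketch below. This is not skimr's output, only the first class of each column, and the data frame and its column names are made up for illustration.

```r
# Hypothetical example data frame; the column names are illustrative only.
df <- data.frame(
  when  = as.Date(c("2012-01-01", "2012-01-02")),
  count = c(1211L, 100L),
  label = c("a", "b"),
  stringsAsFactors = FALSE
)

# First class of every column, e.g. "Date", "integer", "character".
col_classes <- vapply(df, function(col) class(col)[1], character(1))

# Group the column names by class, similar in spirit to skim()'s grouping.
split(names(col_classes), col_classes)
```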

Upvotes: 0

MS Berends

Reputation: 5229

The OP clearly asks for just a check:

I need something to tell that the 'Date' column is a Date type

So how many date classes come with R? Exactly two: Date and POSIXt (excluding their derivatives like POSIXct and POSIXlt).

So we can just check on that, and make it more robust than the answers already given:

is.Date <- function(x) {
  inherits(x, c("Date", "POSIXt"))
}

As robust as it gets.

is.Date(as.Date("2020-02-02"))
#> [1] TRUE
is.Date(as.POSIXct("2020-02-02"))
#> [1] TRUE
is.Date(as.POSIXlt("2020-02-02"))
#> [1] TRUE

If you want to know whether a column could be successfully transformed/coerced to a Date type, then that's another question. This answers what was requested: 'to tell that [...] is a Date type'.
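Since the question mentions not knowing the column names, here is a small sketch of running the same class check over every column of a data frame. The helper name is_date_class and the example columns are my own, chosen only for illustration.

```r
# Same idea as the check above: TRUE for Date and for any POSIXt-derived class.
is_date_class <- function(x) inherits(x, c("Date", "POSIXt"))

# Hypothetical data frame; column names are illustrative only.
df <- data.frame(
  day   = as.Date(c("2012-01-01", "2012-01-02")),
  stamp = as.POSIXct(c("2012-01-01 10:00:00", "2012-01-02 11:30:00"), tz = "UTC"),
  raw   = c("1/1/2012", "1/2/2012"),
  n     = c(1211L, 100L),
  stringsAsFactors = FALSE
)

# Names of the columns whose class is already a date or date-time class.
date_cols <- names(df)[vapply(df, is_date_class, logical(1))]
date_cols
#> [1] "day"   "stamp"
```

Note that the raw column is not picked up: it holds date-like text, but its class is character, which is exactly the "another question" case.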

Upvotes: 8

Dimitrios Zacharatos

Reputation: 850

I will refer to a simple example and I hope it can be generalized. Say that you have a date

d1<-Sys.Date()
d1

"2020-02-12"

deparse(d1)

"structure(18304, class = \"Date\")"

Thus

any(grepl("Date",deparse(d1)))

TRUE

alternatively use

class(d1)

"Date"
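One thing to keep in mind when comparing class() output directly: date-time objects carry more than one class name, so an equality test returns more than one value. A short sketch of the difference, using inherits() as the single-answer alternative:

```r
d1 <- Sys.Date()   # class "Date"
p1 <- Sys.time()   # class "POSIXct" "POSIXt"

class(d1)
class(p1)

# Comparing with == yields one result per class name, which is awkward in if().
class(p1) == "POSIXct"

# inherits() always returns a single TRUE or FALSE.
inherits(d1, "Date")
inherits(p1, "POSIXt")
inherits("2020-02-12", "Date")   # a character string is not a Date
```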

Upvotes: 2

Sophia J

Reputation: 105

This is my way of doing it. It works most of the time but needs improvement.

MissLt <- function(x, ratio = 0.5){
  sum(is.na(x))/length(x) < ratio
}


IS.Date  <- function(x, addformat = NULL, exactformat = NULL){
  if (is.null(exactformat)){
    format = c("%m/%d/%Y", "%m-%d-%Y","%Y/%m/%d" ,"%Y-%m-%d", addformat) 
    y <- as.Date(as.character(x),format= format)
    MissLt(y,ratio = 1-(1/length(y)))}
  else{
    y <- as.Date(as.character(x),format= exactformat)
    MissLt(y,ratio = 1-(1/length(y)))}
}
sapply(data, IS.Date)
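A possible direction for that improvement: when format is a vector, as.Date() recycles it element by element against x rather than trying each format in turn. The sketch below instead tests one format at a time against the whole column. The name parses_as_date and the default format list are assumptions of mine, not part of the function above.

```r
# A format vector is paired with x element-wise (with recycling):
as.Date("2012-01-05", format = c("%m/%d/%Y", "%Y-%m-%d"))
# one result per format: NA for the first, 2012-01-05 for the second

# Hypothetical helper: TRUE if a single format parses every non-missing value.
parses_as_date <- function(x, formats = c("%m/%d/%Y", "%Y-%m-%d")) {
  x <- as.character(x)                 # also handles factor columns
  x <- x[!is.na(x) & nzchar(x)]        # ignore missing and empty entries
  if (length(x) == 0) return(FALSE)
  any(vapply(formats,
             function(f) !anyNA(as.Date(x, format = f)),
             logical(1)))
}

parses_as_date(c("1/1/2012", "1/12/2012"))
parses_as_date(c(1211, 100))
```

Be careful when adding formats: as.Date() ignores trailing characters, so a pattern like "%Y/%m/%d" can "succeed" on a string such as "10/11/2012" and return a nonsense date.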

Upvotes: -1

Kerry Jackson

Reputation: 1871

I know this question is old, but I did want to mention that the lubridate package now provides is.Date and also is.POSIXt.

library(lubridate)
sapply(list(as.Date('2000-01-01'), 123, 'ABC'), is.Date)
[1]  TRUE FALSE FALSE

Upvotes: 21

ImmoXZ

Reputation: 75

This is a function that I created based on the answers here and am using now:

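is.Date <- function(date) {
  # if() needs a single TRUE/FALSE (a longer condition is an error in R >= 4.2),
  # so the per-element results are collapsed with all() and returned directly.
  all(sapply(date, function(x)
    # an element counts as a date if at least one of the formats parses it
    !all(is.na(as.Date(
      as.character(x),
      format = c("%d/%m/%Y", "%d-%m-%Y", "%Y/%m/%d", "%Y-%m-%d")
    )))))
}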

Upvotes: 2

Fernando

Reputation: 161

To work with dates I use a function that identifies whether the strings are dates and, if they are, converts them to a predefined format (in this case I chose '%d/%m/%Y'):

standarDates <- function(string) {
  patterns = c('[0-9][0-9][0-9][0-9]/[0-9][0-9]/[0-9][0-9]','[0-9][0-9]/[0-9][0-9]/[0-9][0-9][0-9][0-9]','[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9]')
  formatdates = c('%Y/%m/%d','%d/%m/%Y','%Y-%m-%d')
  standardformat='%d/%m/%Y'
  for(i in 1:3){
    if(grepl(patterns[i], string)){
      aux=as.Date(string,format=formatdates[i])
      if(!is.na(aux)){
        return(format(aux, standardformat))
      }
    }
  }
  return(FALSE)
}

Suppose you have the vector

a=c("2018-24-16","1587/03/16","fhjfmk","9885/04/16")

> sapply(a,standarDates)
2018-24-16   1587/03/16       fhjfmk   9885/04/16 
  "FALSE"   "16/03/1587"      "FALSE" "16/04/9885"

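with the command

"FALSE"%in%sapply(a,standarDates)
[1] TRUE

you can figure out whether all the elements are dates: TRUE means that at least one element could not be parsed as a date, and FALSE means that all of them could.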

The advantage of this function is that you can add more patterns and date formats to suit the data you are working with, and end up with a single standard format for all those dates. (The disadvantage is that it isn't exactly what the question is asking for.)

I hope this helps.

Upvotes: 4

Eldar

Reputation: 5237

Use inherits() to detect whether the argument has class Date:

is.date <- function(x) inherits(x, 'Date')

sapply(list(as.Date('2000-01-01'), 123, 'ABC'), is.date)
#[1]  TRUE FALSE FALSE

If you want to check whether a character argument can be converted to a Date, use this:

is.convertible.to.date <- function(x) !is.na(as.Date(as.character(x), tz = 'UTC', format = '%Y-%m-%d'))

sapply(list('2000-01-01', 123, 'ABC'), is.convertible.to.date)
# [1]  TRUE FALSE FALSE

Upvotes: 43

thelatemail

Reputation: 93938

You could try to coerce every column with as.Date and see which ones succeed. You would need to specify the format you expect the dates to be in, though. E.g.:

data <- data.frame(
  Date=c("10/11/2012","10/12/2012"),
  AE=c(1211,100),
  Percent=c(0.03,0.43)
)

sapply(data, function(x) !all(is.na(as.Date(as.character(x),format="%d/%m/%Y"))))
#Date      AE Percent 
#TRUE   FALSE   FALSE 
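Once the date columns are identified this way, the conversion itself can be applied to just those columns without naming them. A minimal sketch, assuming the month/day/year layout shown in the question. The fmt value and the example data are assumptions and should be adjusted to the real input.

```r
# Hypothetical input mirroring the question: dates arrive as month/day/year text.
data <- data.frame(
  Date    = c("1/1/2012", "1/2/2012", "1/12/2012"),
  AE      = c(1211, 100, 5654),
  Percent = c(0.03, 0.43, NA),
  stringsAsFactors = TRUE   # mimics the factor column shown by str(data)
)

fmt <- "%m/%d/%Y"  # assumed format; adjust to the actual data

# Columns where at least one value parses under the assumed format.
is_date_col <- vapply(
  data,
  function(x) !all(is.na(as.Date(as.character(x), format = fmt))),
  logical(1)
)

# Convert only those columns, leaving the rest untouched.
data[is_date_col] <- lapply(
  data[is_date_col],
  function(x) as.Date(as.character(x), format = fmt)
)

str(data)
```

Going through as.character() first matters for factor columns, because the dates live in the factor's labels rather than in its underlying integer codes.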

Upvotes: 19
