Reputation: 3
I am working on a project that involves using R to clean up date entries from an excel file in order to upload them to a database. Within the excel file, some of the dates are already in a date format, (i.e. "8/18/2020"), while others are not (i.e. "8/18/2020 2027"). I want to import the column of dates into R so I can run code to reformat the "8/18/2020 2027"-like cells. I am currently using the read_excel() command from the readxl library to import the excel files, using similar syntax to this:
read_excel("filepath", col_types = c("date"))
The issue that I am running into is that if I define the column as a date column, readexcel() will not recognize the "8/18/2020 2027"-like cells as valid dates and won't import those cells.
On the other hand, if I define the column as a text column, readexcel() will instead simplify the "8/18/2020"-like cells into a different number (44063).
Is there a way to import the entire row into R with its entries untouched?
Upvotes: 0
Views: 341
Reputation: 123893
Using the answer to this question you can read the column as text and convert the numbers representing the date to a character like so:
# Read xl file. Read as text
# foo <- readxl::read_excel("test.xlsx", col_names = "date")
# dput of test.xlsx
foo <- structure(list(date = c("44061", "8/18/2020 2027")), row.names = c(NA,
-2L), class = c("tbl_df", "tbl", "data.frame"))
foo
#> date
#> 1 44061
#> 2 8/18/2020 2027
foo$date <- ifelse(grepl("^\\d+$", foo$date), format(as.Date(as.numeric(foo$date), origin = "1899-12-30"), "%m/%d/%Y"), foo$date)
#> Warning in as.Date(as.numeric(foo$date), origin = "1899-12-30"): NAs introduced
#> by coercion
foo
#> date
#> 1 08/18/2020
#> 2 8/18/2020 2027
Upvotes: 0