I have a data.frame with multiple columns that contain dates. At present they are recognised as "factor" class. I want to select all the columns that should be dates (there are 10 of them, they all have "Date" in their name, e.g. Date_Construc, Date_use, Comp_date...) and convert them from factor to date.
Here's what I've tried. First I want to select the relevant columns in a vector:
library(tidyselect)
date_vars <- vars_select(names(df1), contains("Date", ignore.case = TRUE))
then
library(lubridate)
date_vars <- dmy(date_vars)
Also tried
date_vars <- vars_select(names(df1), contains("Date", ignore.case = TRUE))
df1[date_vars] <- lapply(df1[date_vars], as.Date)
I get
Error in as.Date.numeric(X[[i]], ...) : 'origin' must be supplied
Also
date_vars <- vars_select(names(df1), contains("Date", ignore.case = TRUE))
df1[date_vars] <- dmy(as.character(df1[date_vars]))
with result,
Warning message:
All formats failed to parse. No formats found.
This is sample data in current format:
Date_Construct <- c("10/03/2018 00:00", "21/03/2015 00:00", "20/02/2012 00:00")
Date_use <- c("02/08/2007 00:00", "31/10/2007 00:00", "13/08/2008 00:00")
ID <- c("0001", "34560", "100041531")
Comp <- c("Revis", "Succ", "Revis")
dfq <- data.frame(ID, Date_Construct, Date_use, Comp)
         ID   Date_Construct         Date_use  Comp
1      0001 10/03/2018 00:00 02/08/2007 00:00 Revis
2     34560 21/03/2015 00:00 31/10/2007 00:00  Succ
3 100041531 20/02/2012 00:00 13/08/2008 00:00 Revis
Updated answer based on new data provided.
Try the following. There's no need to strip out the time component of the date-time string first. You can parse it with the lubridate function that matches the format of the data (in this case, dmy_hm()) and then disregard the time.
library(dplyr)
library(lubridate)
dfq_parsed <- dfq %>%
  mutate(across(contains("date", ignore.case = TRUE), dmy_hm))
This yields:
         ID Date_Construct   Date_use  Comp
1      0001     2018-03-10 2007-08-02 Revis
2     34560     2015-03-21 2007-10-31  Succ
3 100041531     2012-02-20 2008-08-13 Revis
The date columns come out as POSIXct rather than Date, but that's easy enough to work with. Here is the output of str(dfq_parsed):
'data.frame': 3 obs. of  4 variables:
 $ ID            : chr  "0001" "34560" "100041531"
 $ Date_Construct: POSIXct, format: "2018-03-10" "2015-03-21" "2012-02-20"
 $ Date_use      : POSIXct, format: "2007-08-02" "2007-10-31" "2008-08-13"
 $ Comp          : chr  "Revis" "Succ" "Revis"
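If you need true Date columns rather than POSIXct, or your columns really are factors as in the original question, one possible variation is sketched below. It assumes every matching column holds "dd/mm/yyyy HH:MM" strings; to_date is a hypothetical helper name introduced here for illustration, and stringsAsFactors = TRUE is set only to mimic the factor columns described in the question.

```r
library(dplyr)
library(lubridate)

# Sample data from the question, forced to factors to mimic the original problem
dfq <- data.frame(
  ID = c("0001", "34560", "100041531"),
  Date_Construct = c("10/03/2018 00:00", "21/03/2015 00:00", "20/02/2012 00:00"),
  Date_use = c("02/08/2007 00:00", "31/10/2007 00:00", "13/08/2008 00:00"),
  Comp = c("Revis", "Succ", "Revis"),
  stringsAsFactors = TRUE
)

# Hypothetical helper: factor or character "dd/mm/yyyy HH:MM" -> Date.
# as.character() turns factor levels back into their labels before parsing,
# and as_date() drops the time-of-day from the parsed POSIXct value.
to_date <- function(x) as_date(dmy_hm(as.character(x)))

dfq_dates <- dfq %>%
  mutate(across(contains("date", ignore.case = TRUE), to_date))

# Check the resulting class of each column
sapply(dfq_dates, function(col) class(col)[1])
```

Converting explicitly with as.character() makes the code independent of whether the columns arrive as factor or character, which can differ depending on the R version and on how the data was read in.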