Albin

Reputation: 912

Modify the date in a data frame in R

I recently stumbled over a problem: my date variable has not been recorded uniformly.

I have a data frame similar to the one shown below:

Variable1 <- c(10,20,30,40,50)
Variable2 <- c("a", "b", "c", "d", "d")
Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")
df <- data.frame(Variable1, Variable2, Date)
df

For my purposes I only need the date part, so I would like to create a new variable based on "Date".

The new variable should contain only the date. The time of day is irrelevant for my purpose and can be ignored.

My goal is to get the following data frame (taking today to be 31 October 2018):

Variable1 <- c(10,20,30,40,50)
Variable2 <- c("a", "b", "c", "d", "d")
Date <- c("31 october 2018", "30 october 2018", "28 october 2018", "28 october 2018", "27 october 2018")
df2 <- data.frame(Variable1, Variable2, Date)
df2

Preferably, the values of Date should also be stored as a proper date type (class Date) rather than as text.

Thank you in advance.

Upvotes: 1

Views: 725

Answers (3)

Rui Barradas

Reputation: 76661

Another solution, using indices.

Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")

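# Replace the relative words with ISO dates; as.character() makes the coercion explicit
Date <- sub("today", as.character(Sys.Date()), Date)
Date <- sub("yesterday", as.character(Sys.Date() - 1), Date)
# Entries that still contain letters hold a month name; the others are now ISO date-times.
# A logical index is used so that the code still works when no entry contains letters.
i <- grepl("[[:alpha:]]", Date)
Date[i] <- format(as.POSIXct(Date[i], format = "%d %B %Y %H:%M"), format = "%d %B %Y")
Date[!i] <- format(as.POSIXct(Date[!i]), format = "%d %B %Y")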

Date
#[1] "31 October 2018" "30 October 2018" "28 October 2018"
#[4] "28 October 2018" "27 October 2018"

Then I noticed the solution by user r2evans, which converts everything to lowercase. So, if necessary, end with

Date <- tolower(Date)
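
The result above is a character vector. Since the question would prefer a real date type, here is a minimal sketch that keeps the same index idea but returns class Date. The function name `to_date` and its `today` argument are illustrative assumptions, not part of the original answer; passing `today` explicitly only serves to make the result reproducible. It also assumes an English locale, because `%B` matches month names in the current locale.

```r
# Hypothetical helper: same steps as above, but the result stays of class Date.
to_date <- function(x, today = Sys.Date()) {
  # Replace the relative words with ISO dates
  x <- sub("today", as.character(today), x, fixed = TRUE)
  x <- sub("yesterday", as.character(today - 1), x, fixed = TRUE)

  # Entries that still contain letters carry a month name
  has_month_name <- grepl("[[:alpha:]]", x)

  out <- rep(as.Date(NA), length(x))
  # as.Date() ignores the trailing time once the format has been matched
  out[has_month_name]  <- as.Date(x[has_month_name],  format = "%d %B %Y")
  out[!has_month_name] <- as.Date(x[!has_month_name], format = "%Y-%m-%d")
  out
}

Date <- c("today 10:45", "yesterday 3:10", "28 october 2018 5:32",
          "28 october 2018 8:32", "27 october 2018 5:32")
to_date(Date, today = as.Date("2018-10-31"))
# [1] "2018-10-31" "2018-10-30" "2018-10-28" "2018-10-28" "2018-10-27"
```

If a display string is needed later, `format(to_date(Date), "%d %B %Y")` gives back the text form.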

Upvotes: 0

iod

Reputation: 7592

library(magrittr)  # provides the %>% pipe used below
df$NewDate[grepl("today",df$Date)]<-Sys.Date() # Convert today to date
df$NewDate[grepl("yesterday",df$Date)]<-Sys.Date()-1  # Convert yesterday to date
df$NewDate[is.na(df$NewDate)]<-df$Date[is.na(df$NewDate)] %>% as.Date(format="%d %b %Y")  # Convert explicit dates to date format
class(df$NewDate)<-"Date"  # Convert column to Date class

df
  Variable1 Variable2                 Date    NewDate
1        10         a          today 10:45 2018-10-31
2        20         b       yesterday 3:10 2018-10-30
3        30         c 28 october 2018 5:32 2018-10-28
4        40         d 28 october 2018 8:32 2018-10-28
5        50         d 27 october 2018 5:32 2018-10-27
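
One thing to be aware of: `%b` matches month names in the current locale, so the conversion above can return NA on a machine that is not set to English. A possible locale-independent variant is sketched below. It looks the month up in R's built-in `month.name` constant instead of relying on `strptime`. The function name `to_date_portable` and the `today` argument are hypothetical and only there for illustration.

```r
# Hypothetical helper: parse "today ...", "yesterday ..." and
# "<day> <english month> <year> ..." without depending on the locale.
to_date_portable <- function(x, today = Sys.Date()) {
  x <- trimws(as.character(x))            # as.character() also handles factor columns
  out <- rep(as.Date(NA), length(x))

  is_today     <- grepl("^today", x)
  is_yesterday <- grepl("^yesterday", x)
  out[is_today]     <- today
  out[is_yesterday] <- today - 1

  # Everything else is expected to start with day, month name, year
  explicit <- !(is_today | is_yesterday)
  parts <- strsplit(x[explicit], "\\s+")
  field <- function(k) vapply(parts, function(p) p[k], character(1))

  # Case-insensitive lookup in the built-in English month names
  month <- match(tolower(field(2)), tolower(month.name))
  iso <- sprintf("%s-%02d-%s", field(3), month, field(1))

  # Anything that does not parse ends up as NA instead of raising an error
  out[explicit] <- as.Date(iso, format = "%Y-%m-%d")
  out
}

to_date_portable(c("today 10:45", "yesterday 3:10", "28 october 2018 5:32"),
                 today = as.Date("2018-10-31"))
# [1] "2018-10-31" "2018-10-30" "2018-10-28"
```

With the data frame from the question this would be used as `df$NewDate <- to_date_portable(df$Date)`.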

Upvotes: 1

r2evans

Reputation: 160952

tolower(                                               # not strictly necessary, but for consistency
  gsub("yesterday", format(Sys.Date()-1, "%d %B %Y"),  # convert *day to dates
       gsub("today", format(Sys.Date(), "%d %B %Y"),
            gsub("\\s*[0-9:]*$", "",                   # remove the times
                 c("today 10:45", "yesterday 3:10", "28 october 2018 5:32", "28 october 2018 8:32", "27 october 2018 5:32")))))
# [1] "31 october 2018" "30 october 2018" "28 october 2018" "28 october 2018" "27 october 2018"
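
A caveat about the time-stripping pattern, under an assumption that goes beyond the sample data: `"\\s*[0-9:]*$"` removes any trailing run of digits and colons, so an entry recorded without a time would lose its year. If such entries can occur, a stricter pattern that only matches something shaped like `h:mm` or `hh:mm` may be safer. The sketch below compares both; the third test string is made up for illustration.

```r
# The last element has no time component (hypothetical input, not from the question)
x <- c("today 10:45", "28 october 2018 5:32", "28 october 2018")

# Pattern from the answer: strips any trailing digits/colons, including a bare year
loose <- gsub("\\s*[0-9:]*$", "", x)
loose
# [1] "today"           "28 october 2018" "28 october"

# Stricter pattern: only strips a trailing "h:mm" or "hh:mm"
strict <- gsub("\\s+[0-9]{1,2}:[0-9]{2}$", "", x)
strict
# [1] "today"           "28 october 2018" "28 october 2018"
```

For the data in the question both patterns give the same result, since every entry ends in a time.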

Upvotes: 1
