Reputation: 2267
I have a dataframe like this.
ID <- c("111","111","111","111",
"113","113","113","113")
ToolID <- c("CCP_A","CCP_B","CCP_B","CCQ_A",
"CCP_A","CCP_B","CCP_B","CCQ_A")
Step <- c("Step_A","Step_B","Step_B","Step_C",
"Step_A","Step_B","Step_B","Step_C")
Datetime <- c("2018-04-19 12:44:11","2018-04-17 12:44:11","2018-04-17 12:44:11","2018-04-16 21:11:09",
"2018-04-13 12:44:11","2018-04-11 12:44:11","2018-04-11 12:44:11","2018-04-10 21:11:09")
df <- data.frame(ID,ToolID,Step,Datetime)
df$Datetime <- as.POSIXct(df$Datetime,format="%Y-%m-%d %H:%M:%S")
I am trying to create a new column "New.Datetime" that is the MAX(Datetime) grouped by ID only when ToolID == "CCP_B" & Step=="Step_B"
Here is my desired output
ID ToolID Step Datetime New.Datetime
111 CCP_A Step_A 2018-04-19 12:44:11 2018-04-17 12:44:11
111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-17 12:44:11
111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-17 12:44:11
111 CCQ_A Step_C 2018-04-16 21:11:09 2018-04-17 12:44:11
113 CCP_A Step_A 2018-04-13 12:44:11 2018-04-11 12:44:11
113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-11 12:44:11
113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-11 12:44:11
113 CCQ_A Step_C 2018-04-10 21:11:09 2018-04-11 12:44:11
I am trying to do it this way but the new.datetime seems messy with numeric values.
library(dplyr)
df1 <- df %>%
group_by(ID) %>%
mutate(New.Datetime = ifelse((ToolID == "CCP_B" & Step=="Step_B"),
max(Datetime), Datetime))
Upvotes: 1
Views: 58
Reputation: 887991
We can use case_when
which will work as well
df %>%
group_by(ID) %>%
mutate(New.DateTime = case_when(ToolID == "CCP_B" &
Step == "Step_B" ~ max(Datetime),
TRUE ~ Datetime))
# A tibble: 8 x 5
# Groups: ID [2]
# ID ToolID Step Datetime New.DateTime
# <fct> <fct> <fct> <dttm> <dttm>
#1 111 CCP_A Step_A 2018-04-19 12:44:11 2018-04-19 12:44:11
#2 111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-19 12:44:11
#3 111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-19 12:44:11
#4 111 CCQ_A Step_C 2018-04-16 21:11:09 2018-04-16 21:11:09
#5 113 CCP_A Step_A 2018-04-13 12:44:11 2018-04-13 12:44:11
#6 113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-13 12:44:11
#7 113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-13 12:44:11
#8 113 CCQ_A Step_C 2018-04-10 21:11:09 2018-04-10 21:11:09
As the OP clarified in the comments to have the max
value based on the condition for ecah 'ID', we just neeed
df %>%
group_by(ID) %>%
mutate(New.DateTime = max(Datetime[ToolID == "CCP_B" & Step == "Step_B"]))
# A tibble: 8 x 5
# Groups: ID [2]
# ID ToolID Step Datetime New.DateTime
# <fct> <fct> <fct> <dttm> <dttm>
#1 111 CCP_A Step_A 2018-04-19 12:44:11 2018-04-17 12:44:11
#2 111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-17 12:44:11
#3 111 CCP_B Step_B 2018-04-17 12:44:11 2018-04-17 12:44:11
#4 111 CCQ_A Step_C 2018-04-16 21:11:09 2018-04-17 12:44:11
#5 113 CCP_A Step_A 2018-04-13 12:44:11 2018-04-11 12:44:11
#6 113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-11 12:44:11
#7 113 CCP_B Step_B 2018-04-11 12:44:11 2018-04-11 12:44:11
#8 113 CCQ_A Step_C 2018-04-10 21:11:09 2018-04-11 12:44:11
Upvotes: 3