Reputation: 939
I have df which looks like this:
Customer| Code |Visit|
A | A | 2018|
A | B | 2019|
B | C | 2017|
B | D | 2018|
How do I transform Column Code to reflect the most recent Code on a customer lvl?
Expected Result:
Customer| Code |Visit|
A | B | 2018|
A | B | 2019|
B | D | 2017|
B | D | 2018|
I tried this but it doesn't work:
df %>% group_by(Customer)%>% mutate(Code = max(Visit, Code))
Upvotes: 0
Views: 96
Reputation: 887981
An option with base R
df$Code <- with(df, Code[rep(which(ave(Visit, Customer, FUN = max)
== Visit), table(Customer))])
df$Code
#[1] "B" "B" "D" "D"
Upvotes: 0
Reputation: 4194
More efficient solution because of pre-sorting the data.frame (so we subsetting always the last observation):
library(data.table)
result <- setDT(df)[order(Customer, Visit)][,.SD[.N],by=Customer]
It might be important for bigger datasets.
Upvotes: 1
Reputation: 389335
You can use which.max
to get index of max value in Visit
and extract the corresponding Code
.
library(dplyr)
df %>% group_by(Customer) %>% mutate(Code = Code[which.max(Visit)])
# Customer Code Visit
# <chr> <chr> <int>
#1 A B 2018
#2 A B 2019
#3 B D 2017
#4 B D 2018
Upvotes: 1