spidermarn
spidermarn

Reputation: 939

R How to transform values to most recent value

I have df which looks like this:

Customer| Code  |Visit|
    A   |  A    | 2018|
    A   |  B    | 2019| 
    B   |  C    | 2017|
    B   |  D    | 2018|

How do I transform Column Code to reflect the most recent Code on a customer lvl?

Expected Result:

Customer| Code  |Visit|
    A   |  B    | 2018|
    A   |  B    | 2019| 
    B   |  D    | 2017|
    B   |  D    | 2018|

I tried this but it doesn't work:

df %>% group_by(Customer)%>% mutate(Code = max(Visit, Code))

Upvotes: 0

Views: 96

Answers (3)

akrun
akrun

Reputation: 887981

An option with base R

df$Code <-  with(df,  Code[rep(which(ave(Visit, Customer, FUN = max)
             == Visit), table(Customer))])
df$Code
#[1] "B" "B" "D" "D"

Upvotes: 0

polkas
polkas

Reputation: 4194

More efficient solution because of pre-sorting the data.frame (so we subsetting always the last observation):

library(data.table)

result <- setDT(df)[order(Customer, Visit)][,.SD[.N],by=Customer]

It might be important for bigger datasets.

Upvotes: 1

Ronak Shah
Ronak Shah

Reputation: 389335

You can use which.max to get index of max value in Visit and extract the corresponding Code.

library(dplyr)

df %>% group_by(Customer) %>% mutate(Code = Code[which.max(Visit)])

# Customer Code  Visit
#  <chr>    <chr> <int>
#1 A        B      2018
#2 A        B      2019
#3 B        D      2017
#4 B        D      2018

Upvotes: 1

Related Questions