Diego Gonzalez Avalos

Reputation: 138

Get the most recent observation using a date variable in R

I'm using R to create a table with data from another table, and I'm working with the following variables:

-PRODUCT ID 
-CLASIFICATION
-DATE

For example, my source table:

product id   Clasification   Date
10000567        B+         12-12-2020
10000123        C+         26-11-2020
10000567        A+         02-11-2020
10000222        A+         09-10-2020
10000123        B++        21-09-2020
10000222        A++        10-09-2020

I need to get the most recent classification for each product ID, because it is a dynamic field and can change at any time. The result should have one row per product ID.

Any help would be great.

Thanks!

Upvotes: 2

Views: 1746

Answers (2)

Darren Tsai

Reputation: 35554

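You can use slice_max() from dplyr, which supersedes top_n() as of version 1.0.0, to select the row with the most recent date in each group. Convert Date to a Date object first so that it is compared chronologically rather than as text.

library(dplyr)

df %>%
  mutate(Date = as.Date(Date, "%d-%m-%Y")) %>%
  group_by(product_id) %>%
  slice_max(Date, n = 1) %>%
  ungroup()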

# # A tibble: 3 x 3
#   product_id Clasification Date      
#        <int> <chr>         <date>    
# 1   10000123 C+            2020-11-26
# 2   10000222 A+            2020-10-09
# 3   10000567 B+            2020-12-12
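
One caveat worth noting: slice_max() keeps ties by default (with_ties = TRUE), so a product with two rows on the same latest date would come back twice. Below is a minimal sketch of forcing exactly one row per product. The data frame df_ties is made up for illustration and is not from the question; in it, product 10000567 has two rows dated 12-12-2020.

```r
library(dplyr)

# Hypothetical data (not from the question): product 10000567 has
# two rows that share its latest date.
df_ties <- data.frame(
  product_id    = c(10000567L, 10000567L, 10000123L),
  Clasification = c("B+", "A+", "C+"),
  Date          = c("12-12-2020", "12-12-2020", "26-11-2020")
)

latest <- df_ties %>%
  mutate(Date = as.Date(Date, "%d-%m-%Y")) %>%
  group_by(product_id) %>%
  # with_ties = FALSE keeps a single row per group even when dates tie
  slice_max(Date, n = 1, with_ties = FALSE) %>%
  ungroup()

nrow(latest)
```

If it matters which of the tied rows survives, sorting on a second tie-breaker column first is probably safer than relying on row order.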

Data

df <- structure(list(product_id = c(10000567L, 10000123L, 10000567L, 
10000222L, 10000123L, 10000222L), Clasification = c("B+", "C+", 
"A+", "A+", "B++", "A++"), Date = c("12-12-2020", "26-11-2020", 
"02-11-2020", "09-10-2020", "21-09-2020", "10-09-2020")), class = "data.frame", row.names = c(NA, -6L))

Upvotes: 3

Robert Wilson

Reputation: 3397

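Assuming your dates are not sorted, something like the following should work. Date has to be converted to a real date first, because dd-mm-yyyy strings sort alphabetically rather than chronologically:

library(dplyr)
df %>%
 mutate(Date = as.Date(Date, "%d-%m-%Y")) %>%  # parse the day-month-year strings
 arrange(desc(Date)) %>%                       # newest rows first
 group_by(product_id) %>%
 slice(1) %>%                                  # keep the newest row per product
 ungroup()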
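
The same sort-then-keep-first idea can also be sketched in base R, without dplyr. This assumes the column names from the sample data in the other answer (product_id, Clasification, Date) and day-month-year date strings; the names ord and latest are only illustrative.

```r
# Sample data copied from the question
df <- data.frame(
  product_id    = c(10000567L, 10000123L, 10000567L,
                    10000222L, 10000123L, 10000222L),
  Clasification = c("B+", "C+", "A+", "A+", "B++", "A++"),
  Date          = c("12-12-2020", "26-11-2020", "02-11-2020",
                    "09-10-2020", "21-09-2020", "10-09-2020")
)

# Parse the day-month-year strings so they sort chronologically
df$Date <- as.Date(df$Date, format = "%d-%m-%Y")

# Newest rows first, then keep the first row seen for each product
ord    <- df[order(df$Date, decreasing = TRUE), ]
latest <- ord[!duplicated(ord$product_id), ]

latest
```

Because duplicated() flags every repeat after the first occurrence, this always yields one row per product, even when two rows share the same latest date.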

Upvotes: -1
