Zachary McClean
Zachary McClean

Reputation: 11

Finding the mean of a group, based on the most recent date of each group

Sorry for the unclear title. The question I have is simple but difficult to articulate. If I have the sample data set:

Person Date (m/d/y) Weight
Person1 01/15/21 93
Person2 01/16/21 87
Person3 01/14/21 73
Person1 01/17/21 95
Person2 01/15/21 85
Person3 01/18/21 73.5

In R how do I find the average of Person1,2,3 weights. Keeping in mind I only their most recent weight is important.

Therefore, the correct answer should be:

Mean = 85.2

Upvotes: 0

Views: 37

Answers (2)

akrun
akrun

Reputation: 887203

An option is to do a group by slice on the last date and then take the mean

library(dplyr)
df1 %>%
  group_by(Person) %>%
  slice(which.max(as.Date(`Date (m/d/y)`, '%m/%d/%y'))) %>%
  ungroup %>%
  summarise(Weight = mean(Weight, na.rm = TRUE))

-output

# A tibble: 1 x 1
#  Weight
#   <dbl>
#1   85.2

data

df1 <- structure(list(Person = c("Person1", "Person2", "Person3", "Person1", 
"Person2", "Person3"), `Date (m/d/y)` = c("01/15/21", "01/16/21", 
"01/14/21", "01/17/21", "01/15/21", "01/18/21"), Weight = c(93, 
87, 73, 95, 85, 73.5)), class = "data.frame", row.names = c(NA, 
-6L))

Upvotes: 0

ThomasIsCoding
ThomasIsCoding

Reputation: 101638

Here is a data.table option

setDT(df)[
  ,
  Weight[which.max(as.Date(`Date (m/d/y)`, format = "%m/%d/%y"))],
  Person
][
  ,
  mean(V1)
]

gives

[1] 85.16667

Data

> dput(df)
structure(list(Person = c("Person1", "Person2", "Person3", "Person1",
"Person2", "Person3"), `Date (m/d/y)` = c("01/15/21", "01/16/21",
"01/14/21", "01/17/21", "01/15/21", "01/18/21"), Weight = c(93,
87, 73, 95, 85, 73.5)), class = "data.frame", row.names = c(NA,
-6L))

Upvotes: 0

Related Questions