Joep_S
Joep_S

Reputation: 537

Add original values for columns after group by

For the dataframe below I want to add the original values for Var_x after a group_by on ID and event and a max() on quest, but I cannot get my code right. Any suggestions? By the way, in my original dataframe more than 1 column needs to be added.

df <- data.frame(ID = c(1,1,1,1,1,1,2,2,2,3,3,3),
                 quest = c(1,1,2,2,3,3,1,2,3,1,2,3),
                 event = c("A","B","A","B","A",NA,"C","D","C","D","D",NA),
                 VAR_X = c(2,4,3,6,3,NA,6,4,5,7,5,NA))

Code:

df %>%
  group_by(ID,event) %>%
  summarise(quest = max(quest))

Desired output:

  ID quest event VAR_X
1  1     2     B     6
2  1     3     A     3
3  2     2     D     4
4  2     3     C     5
5  3     2     D     5

Upvotes: 2

Views: 65

Answers (2)

r.user.05apr
r.user.05apr

Reputation: 5456

df %>%
  drop_na() %>% # remove if necessary ..
  group_by(ID, event) %>%
  filter(quest == max(quest)) %>%
  ungroup()

# A tibble: 5 x 4
# ID quest event VAR_X
#<dbl> <dbl> <chr> <dbl>
# 1     1     2 B         6
# 2     1     3 A         3
# 3     2     2 D         4
# 4     2     3 C         5
# 5     3     2 D         5

Upvotes: 3

Rui Barradas
Rui Barradas

Reputation: 76402

Start by omiting the na values and in the end do an inner_join with the original data set.

df %>%
  na.omit() %>%
  group_by(ID, event) %>%
  summarise(quest = max(quest)) %>%
  inner_join(df, by = c("ID", "event", "quest"))
## A tibble: 5 x 4
## Groups:   ID [3]
#     ID event quest VAR_X
#  <dbl> <fct> <dbl> <dbl>
#1     1 A         3     3
#2     1 B         2     6
#3     2 C         3     5
#4     2 D         2     4
#5     3 D         2     5

Upvotes: 4

Related Questions