Vaibhav Singh
Vaibhav Singh

Reputation: 1209

Get Max value by category as a new column in R

I understand the questions wording is not clear but its a simple one, hopefully the image will convey more than words. I need a new column (New Col in image) which gets a value from column B corresponding to max value in column N (by group) in this case A.

Preferably tidyverse solution since I am piping a long command.

   df <-  structure(list(A = c("a", "a", "a", "a", "a", "b", "b", "b"), 
        B = c("b", "c", "d", "e", "f", "c", "d", "e"), N = c(1L, 
        2L, 3L, 4L, 5L, 5L, 4L, 3L), New.Col = c("f", "f", "f", "f", 
        "f", "c", "c", "c")), class = "data.frame", row.names = c(NA, 
    -8L))

Output

Upvotes: 2

Views: 1137

Answers (3)

TarJae
TarJae

Reputation: 78927

Here is a clean and neat version of my answer without ifelse:

library(dplyr)

df %>% 
  group_by(A) %>% 
  mutate(new_col = B[N==max(N)]) 

  A     B         N New.Col new_col
  <chr> <chr> <int> <chr>   <chr>  
1 a     b         1 f       f      
2 a     c         2 f       f      
3 a     d         3 f       f      
4 a     e         4 f       f      
5 a     f         5 f       f      
6 b     c         5 c       c      
7 b     d         4 c       c      
8 b     e         3 c       c  

First answer: We could use max in an ifelse statement:

library(dplyr)
df %>%
  group_by(A) %>% 
  mutate(NewCol = ifelse(N==max(N), B, B)) %>% 
  select(-NewCol)
  A     B         N New.Col
  <chr> <chr> <int> <chr>  
1 a     b         1 f      
2 a     c         2 f      
3 a     d         3 f      
4 a     e         4 f      
5 a     f         5 f      
6 b     c         5 c      
7 b     d         4 c      
8 b     e         3 c 

Upvotes: 3

akrun
akrun

Reputation: 887118

Using data.table

library(data.table)
 setDT(df)[, new_col := B[which.max(N)], A]
> df
   A B N New.Col new_col
1: a b 1       f       f
2: a c 2       f       f
3: a d 3       f       f
4: a e 4       f       f
5: a f 5       f       f
6: b c 5       c       c
7: b d 4       c       c
8: b e 3       c       c

Upvotes: 3

AEF
AEF

Reputation: 5650

You can just use normal group_by and mutate

library(tidyverse)

df <-  structure(list(A = c("a", "a", "a", "a", "a", "b", "b", "b"), 
                      B = c("b", "c", "d", "e", "f", "c", "d", "e"), N = c(1L, 
                                                                           2L, 3L, 4L, 5L, 5L, 4L, 3L), New.Col = c("f", "f", "f", "f", 
                                                                                                                    "f", "c", "c", "c")), class = "data.frame", row.names = c(NA, 
                                                                                                                                                                              -8L))

df %>% 
  group_by(A) %>% 
  mutate(new_col = B[which.max(N)]) %>% 
  ungroup()

Upvotes: 2

Related Questions