ghgh

Reputation: 21

How to calculate maximum value including string?

I want to keep the string column (ta) while applying max to the numeric columns through .SD.

library(data.table)

data <- data.table(id = c("a", "a", "b", "c"),
                   s1 = c(1, 3, 2, 2),
                   s2 = c(3, 1, 1, 0),
                   s3 = c(5, 3, 0, 2),
                   ta = c("ba", "bb", "cc", "dd"))

out_data <- data[, lapply(.SD, max), by=id]

Desired output:

   id s1 s2 s3 ta
1:  a  0  3  5 ba
2:  a  3  0  0 bb
3:  b  2  1  0 cc
4:  c  2  0  2 dd

How can I keep the ta information for each row within its id group?

Upvotes: 2

Views: 113

Answers (3)

zx8754

Reputation: 56179

Check whether each value equals the group maximum, or whether the column is a character column:

data[, lapply(.SD, function(x) ifelse(x == max(x) | is.character(x), x, 0)), by = id]
#    id s1 s2 s3 ta
# 1:  a  0  3  5 ba
# 2:  a  3  0  0 bb
# 3:  b  2  1  0 cc
# 4:  c  2  0  2 dd

Upvotes: 2

koolmees

Reputation: 2783

The best solution I can think of is this:

colList <- c("s1", "s2", "s3")
out_data <- data[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)), by=.(id)]

There is no reason to specify .SDcols in this construction. If you removed the (colList) := portion and simply used .SDcols, the ta column would be dropped from the result altogether.

Edit: as @zx8754 correctly points out, this also alters data itself, because := modifies the table by reference: R first evaluates what is to the right of the <- and only afterwards assigns the result to out_data. You can prevent this by working on a copy:

colList <- c("s1", "s2", "s3")
out_data <- copy(data)[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)), by=.(id)]
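
As a quick check of the copy() approach, here is a minimal sketch that rebuilds the sample table from the question and confirms that the original stays untouched. Passing .SDcols explicitly is a choice made for this sketch only, so that the function is applied to the numeric columns alone; it is not part of the answer above.

```r
library(data.table)

# Sample data from the question.
data <- data.table(id = c("a", "a", "b", "c"),
                   s1 = c(1, 3, 2, 2),
                   s2 = c(3, 1, 1, 0),
                   s3 = c(5, 3, 0, 2),
                   ta = c("ba", "bb", "cc", "dd"))
colList <- c("s1", "s2", "s3")

# copy() creates an independent table, so := only modifies the copy.
# .SDcols is an assumption of this sketch: it limits .SD to the numeric columns.
out_data <- copy(data)[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)),
                       by = id, .SDcols = colList]

data$s1      # original column, left as it was
out_data$s1  # values below the group maximum are replaced by 0
out_data$ta  # the string column is carried along unchanged
```

Without copy(), data and out_data would refer to the same modified table.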

Upvotes: 2

Maël

Reputation: 52049

A dplyr solution:

library(dplyr)

# `data` is the table from the question
data %>% 
  group_by(id) %>% 
  mutate(across(s1:s3, ~ ifelse(.x == max(.x), max(.x), 0)))

  id       s1    s2    s3 ta   
  <chr> <dbl> <dbl> <dbl> <chr>
1 a         0     3     5 ba   
2 a         3     0     0 bb   
3 b         2     1     0 cc   
4 c         2     0     2 dd 

Upvotes: 0
