Reputation: 21
I want to keep the string column when applying max over .SD by group.
library(data.table)
data <- data.table(id = c("a", "a", "b", "c"),
                   s1 = c(1, 3, 2, 2),
                   s2 = c(3, 1, 1, 0),
                   s3 = c(5, 3, 0, 2),
                   ta = c("ba", "bb", "cc", "dd"))
out_data <- data[, lapply(.SD, max), by = id]
Desired output:
   id s1 s2 s3 ta
1:  a  0  3  5 ba
2:  a  3  0  0 bb
3:  b  2  1  0 cc
4:  c  2  0  2 dd
How can I keep the ta value of each row within its id group?
Upvotes: 2
Views: 113
Reputation: 56179
Check whether the value equals the group maximum, or whether the column is a character vector:
data[, lapply(.SD, function(x) ifelse(x == max(x) | is.character(x), x, 0)), by = id]
#    id s1 s2 s3 ta
# 1:  a  0  3  5 ba
# 2:  a  3  0  0 bb
# 3:  b  2  1  0 cc
# 4:  c  2  0  2 dd
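As a sketch of the same idea, the type check can be done once per column instead of inside ifelse. The helper name keep_max is hypothetical (it is not part of data.table), and the toy data from the question is rebuilt so the snippet runs on its own:

```r
library(data.table)

# Toy data from the question
data <- data.table(id = c("a", "a", "b", "c"),
                   s1 = c(1, 3, 2, 2),
                   s2 = c(3, 1, 1, 0),
                   s3 = c(5, 3, 0, 2),
                   ta = c("ba", "bb", "cc", "dd"))

# Hypothetical helper: numeric columns keep only their group maximum
# (other values become 0); any other column type is returned unchanged.
keep_max <- function(x) {
  if (is.numeric(x)) ifelse(x == max(x), x, 0) else x
}

res <- data[, lapply(.SD, keep_max), by = id]
```

This leaves data itself unmodified, since no := is involved.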
Upvotes: 2
Reputation: 2783
The best solution I can think of is this:
colList <- c("s1", "s2", "s3")
out_data <- data[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)), by=.(id)]
There is no reason to specify .SDcols in this construction. If you were to remove the colList portion and simply use .SDcols, it would remove the ta column altogether.
Edit: as @zx8754 correctly points out, this also alters data itself, because R first evaluates what is to the right of the <- (where := updates data by reference) and only afterwards assigns the result to out_data. You can prevent this by working on a copy:
colList <- c("s1", "s2", "s3")
out_data <- copy(data)[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)), by=.(id)]
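A variant of the above, offered only as a sketch: passing .SDcols = colList restricts .SD to the listed columns, so the function is applied to s1, s2 and s3 only and never sees ta. The toy data from the question is rebuilt here so the snippet is self-contained:

```r
library(data.table)

# Toy data from the question
data <- data.table(id = c("a", "a", "b", "c"),
                   s1 = c(1, 3, 2, 2),
                   s2 = c(3, 1, 1, 0),
                   s3 = c(5, 3, 0, 2),
                   ta = c("ba", "bb", "cc", "dd"))

colList <- c("s1", "s2", "s3")

# copy() keeps `data` untouched; .SDcols limits .SD to the columns in colList
out_data <- copy(data)[, (colList) := lapply(.SD, function(x) ifelse(x == max(x), x, 0)),
                       by = .(id), .SDcols = colList]
```

Because the update is by reference on the copy, ta is carried along unchanged in out_data.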
Upvotes: 2
Reputation: 52049
A dplyr solution:
library(dplyr)
data %>%
  group_by(id) %>%
  mutate(across(s1:s3, ~ ifelse(.x == max(.x), max(.x), 0)))
  id       s1    s2    s3 ta
  <chr> <dbl> <dbl> <dbl> <chr>
1 a         0     3     5 ba
2 a         3     0     0 bb
3 b         2     1     0 cc
4 c         2     0     2 dd
Upvotes: 0