Determine the max values in one column based on another column

Question

I would like to limit the values in column value1 to a maximum given by the values in column value2:

df <- 'inf value1 value2
       xx1   20    30
       xx2   15    40
       xx3   25    20'
df <- read.table(text = df, header = TRUE)

My expected output would look like this:

out <- 'inf value1 value2
       xx1   20    30
       xx2   15    40
       xx3   20    20'
out <- read.table(text = out, header = TRUE)

In the third row, column value1 now holds 20 instead of 25, because 20 is the value in column value2 in the same row. I have a large dataset, so I would appreciate any ideas on how to deal with this.

David Arenburg · Accepted Answer

You could also find the minimum value per row using pmin:

with(df, pmin(value1, value2))
## [1] 20 15 20
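
To get the data frame shown as the expected output in the question, the pmin result can be written back into value1. This is a minimal sketch, assuming the df from the question; it is rebuilt here only so that the snippet runs on its own:

```r
# Rebuild the example data from the question
df <- read.table(text = "inf value1 value2
xx1 20 30
xx2 15 40
xx3 25 20", header = TRUE)

# Cap value1 at value2, row by row, and store the result in place
df$value1 <- pmin(df$value1, df$value2)
df
##   inf value1 value2
## 1 xx1     20     30
## 2 xx2     15     40
## 3 xx3     20     20
```

Only the third row changes, since it is the only one where value1 exceeds value2.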

Some benchmarks

set.seed(123)
test1 <- sample(1e3, 1e8, replace = TRUE)
test2 <- sample(1e3, 1e8, replace = TRUE)

### My solution
system.time(res1 <- pmin(test1, test2)) 
# user  system elapsed 
# 2.87    0.11    3.00 

### @Avinash
system.time(res2 <- ifelse(test1 < test2, test1, test2))
# user  system elapsed 
# 16.33    2.41   18.87 

### Contributed by @Colonel (note: overwrites test1 in place)
system.time({temp <- test1 > test2 ; test1[temp] <- test2[temp]}) 
# user  system elapsed 
# 2.34    0.29    2.63 

identical(res1, res2)
# [1] TRUE
identical(res1, test1)
# [1] TRUE
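
Note that the third approach overwrites test1 during the timing, which is why identical(res1, test1) is TRUE afterwards. The benchmark vectors hold 1e8 elements each and may not fit comfortably in memory on every machine, so here is a smaller sketch to check that the three approaches agree. The names x, y, by_pmin, by_ifelse, by_index and over are hypothetical and used only for illustration, and the indexing variant works on a copy so that the input stays untouched:

```r
set.seed(123)
# Smaller inputs than in the benchmark above
x <- sample(1e3, 1e5, replace = TRUE)
y <- sample(1e3, 1e5, replace = TRUE)

# Approach 1: element-wise minimum
by_pmin <- pmin(x, y)

# Approach 2: vectorised conditional
by_ifelse <- ifelse(x < y, x, y)

# Approach 3: logical indexing, applied to a copy of x
by_index <- x
over <- by_index > y
by_index[over] <- y[over]

# Stops with an error if any of the results differ
stopifnot(identical(by_pmin, by_ifelse),
          identical(by_pmin, by_index))
```

Timings will differ from the ones reported above, since they depend on vector length and hardware.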
