km5041

Reputation: 361

R: Build Apply function to find minimum of columns based on conditions in other (related) columns

With data like the example below, I'm trying to pick out, for each row, the time cols (time_A, etc.) whose corresponding test cols (test_A, etc.) are TRUE, and then find the minimum of those times.

     [ID] [time_A] [time_B] [time_C] [test_A] [test_B] [test_C] [min_true_time]
[1,]    1        2        3        4    FALSE     TRUE     FALSE          ?
[2,]    2       -4        5        6     TRUE     TRUE     FALSE          ?
[3,]    3        6        1       -2     TRUE     TRUE      TRUE          ?
[4,]    4       -2        3        4     TRUE    FALSE     FALSE          ?

My actual data set is quite large, so my attempts with if statements and for loops have failed miserably, and I can't make any progress on an apply function either.

A more negative time counts as smaller, so -2 would be considered the minimum for row 3.

Any suggestions are gladly welcomed.

Upvotes: 0

Views: 3107

Answers (1)

Roland

Reputation: 132989

You don't give much information, but I think this does what you need. No idea if it is efficient enough, since you don't say how big your dataset actually is.

#I assume your data is in a data.frame:
df <- read.table(text="ID time_A time_B time_C test_A test_B test_C 
1    1        2        3        4    FALSE     TRUE     FALSE
2    2       -4        5        6     TRUE     TRUE     FALSE
3    3        6        1       -2     TRUE     TRUE      TRUE
4    4       -2        3        4     TRUE    FALSE     FALSE")


#loop over all rows, keep the times (columns 2:4) whose test (columns 5:7) is TRUE, then take the min
df$min_true_time <- sapply(seq_len(nrow(df)), function(i) min(df[i,2:4][unlist(df[i,5:7])]))
df
#  ID time_A time_B time_C test_A test_B test_C min_true_time
#1  1      2      3      4  FALSE   TRUE  FALSE             3
#2  2     -4      5      6   TRUE   TRUE  FALSE            -4
#3  3      6      1     -2   TRUE   TRUE   TRUE            -2
#4  4     -2      3      4   TRUE  FALSE  FALSE            -2

Another way, which might be faster (I'm not in the mood for benchmarking):

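#copy the time columns into a numeric matrix
m <- as.matrix(df[,2:4])
#set every time whose test is FALSE to NA
m[!df[,5:7]] <- NA
#row-wise minimum over the remaining times
df$min_true_time <- apply(m,1,min,na.rm=TRUE)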
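A third option, offered only as a sketch and not benchmarked: mask the times column by column and let pmin() take the row-wise minimum, which avoids calling a function once per row. The helper names time_cols, test_cols and masked are my own, and the data frame is rebuilt with data.frame() purely so the snippet runs on its own; it assumes the time and test columns are listed in matching order.

```r
# Rebuild the example data from the question (same values as above).
df <- data.frame(
  ID     = 1:4,
  time_A = c(2, -4, 6, -2),
  time_B = c(3, 5, 1, 3),
  time_C = c(4, 6, -2, 4),
  test_A = c(FALSE, TRUE, TRUE, TRUE),
  test_B = c(TRUE, TRUE, TRUE, FALSE),
  test_C = c(FALSE, FALSE, TRUE, FALSE)
)

# Hypothetical helper names; the two vectors must be in matching order.
time_cols <- c("time_A", "time_B", "time_C")
test_cols <- c("test_A", "test_B", "test_C")

# For each time/test pair, keep the time where the test is TRUE, else NA.
masked <- Map(function(t, ok) ifelse(ok, t, NA), df[time_cols], df[test_cols])

# pmin() computes the element-wise minimum across the masked columns,
# i.e. one minimum per row. unname() keeps the list names out of the call.
df$min_true_time <- do.call(pmin, c(unname(masked), na.rm = TRUE))
df$min_true_time
```

One difference worth knowing: for a row where no test is TRUE, pmin(..., na.rm=TRUE) gives NA, whereas min(..., na.rm=TRUE) on an all-NA row returns Inf with a warning. Which of the two you prefer depends on how you want to treat such rows.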

Upvotes: 1
