Reputation: 197
I have a data frame df1 and a list l1 like this:
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0) , c2 = c(2.3, 1.8, 12.0 ) ,c3 = c(1.2, 3.2, 2.0 ) , c4 = c(2.2, 1.9, 0.9) )
l1 <- list(x1 = c(2,4) ,x2 = c(3) ,x3 = c(2))
Here l1 contains, for each row of df1, the column indices to ignore. For every row, I want to find the column indices of the top 2 (possibly more) elements after excluding the indices given in l1. The actual data has many more rows and columns. The expected output is:
[1,] 1 3
[2,] 2 4
[3,] 1 3
where df1 is:
   c1   c2  c3  c4
1 4.2  2.3 1.2 2.2
2 1.2  1.8 3.2 1.9
3 3.0 12.0 2.0 0.9
It would also be helpful if the indices could be returned in decreasing order of the values they point to. In that case the expected output would be:
[1,] 1 3
[2,] 4 2
[3,] 1 3
Upvotes: 2
Views: 54
Reputation: 887118
We can use rank():
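lapply(seq_len(nrow(df1)), function(i) {
  x1 <- unlist(df1[i, ])             # row i as a numeric vector
  i2 <- l1[[i]]                      # column indices to ignore for this row
  i3 <- seq_along(x1) %in% i2        # TRUE at the ignored positions
  which(rank(-x1 * NA^i3) %in% 1:2)  # NA^TRUE is NA, NA^FALSE is 1
})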
#[[1]]
#[1] 1 3
#[[2]]
#[1] 2 4
#[[3]]
#[1] 1 3
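The NA^i3 term is probably the least obvious part of the call above. In R, any value raised to the power 0 is 1, so NA^FALSE is 1 while NA^TRUE stays NA. Multiplying by it blanks out the excluded positions, and rank() keeps NA in place by default (na.last = "keep"). A small illustration for the first row only; the names mask, masked and r are introduced here just for the walk-through:

```r
x1 <- c(c1 = 4.2, c2 = 2.3, c3 = 1.2, c4 = 2.2)  # first row of df1
i3 <- seq_along(x1) %in% c(2, 4)                 # TRUE where the column is excluded

mask   <- NA^i3        # 1 where the column is kept, NA where it is excluded
masked <- -x1 * mask   # negated so that the largest kept value gets rank 1
r      <- rank(masked) # excluded positions keep rank NA

which(r %in% 1:2)      # positions of the two largest kept values
```

Note that with tied values rank() returns averaged ranks such as 1.5 by default, so the %in% 1:2 test may select fewer than two positions in that case.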
If we need the indices ordered by decreasing value:
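lapply(seq_len(nrow(df1)), function(i) {
  x1 <- unlist(df1[i, ])                   # row i as a numeric vector
  i2 <- l1[[i]]                            # column indices to ignore for this row
  i3 <- seq_along(x1) %in% i2              # TRUE at the ignored positions
  i4 <- which(rank(-x1 * NA^i3) %in% 1:2)  # positions of the top two kept values
  i4[order(-x1[i4])]                       # reorder them, largest value first
})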
#[[1]]
#[1] 1 3
#[[2]]
#[1] 4 2
#[[3]]
#[1] 1 3
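Since the question mentions that the number of top elements "can be higher", here is a minimal sketch that takes the count as a parameter. It swaps rank() for order(), which is a different technique from the code above, and the helper name top_idx and its argument n are hypothetical, chosen only for this illustration:

```r
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0),
                  c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))
l1 <- list(x1 = c(2, 4), x2 = c(3), x3 = c(2))

# top_idx is a hypothetical helper, not part of the original answer
top_idx <- function(x, excl, n = 2) {
  keep <- setdiff(seq_along(x), excl)             # candidate positions
  ord  <- keep[order(x[keep], decreasing = TRUE)] # largest value first
  head(ord, n)                                    # at most n positions
}

res <- lapply(seq_len(nrow(df1)),
              function(i) top_idx(unlist(df1[i, ]), l1[[i]]))
res
```

The result is ordered by decreasing value; wrapping the return value in sort() would give ascending indices instead. Because order() breaks ties by position, tied values still yield exactly n indices whenever at least n columns remain.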
Upvotes: 3
Reputation: 1099
Also using rank, but returning a matrix. The syntax is made a little ugly by t() converting the data.frame into a matrix.
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0) , c2 = c(2.3, 1.8, 12.0 ) ,c3 = c(1.2, 3.2, 2.0 ) , c4 = c(2.2, 1.9, 0.9) )
l1 <- list(x1 = c(2,4) ,x2 = c(3) ,x3 = c(2))
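# df is one row of df1 passed in as a vector, excl the indices to skip for that row
indexOrderSub <- function(df, excl, top = 2) {
  z <- seq_along(df)                               # all positions
  sel <- !(z %in% excl)                            # TRUE for positions that are kept
  rz <- z[sel]                                     # kept positions
  rz2 <- tail(rz[order(rank(df)[sel])], top)       # kept positions of the `top` largest values
  rz2[order(rz2)]                                  # return them in ascending index order
}
# t(df1) turns rows into columns so that mapply() walks over the rows of df1
t(mapply(indexOrderSub, as.data.frame(t(df1)), l1))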
Upvotes: 1
Reputation: 3879
I understand the question as follows: for each row i of df1, exclude the elements at the positions given in l1[[i]], and then return the indices of the two largest remaining elements.
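highest.two <- function(x){
  # x is a numeric vector; returns the positions of its largest and second-largest values
  first.highest_position <- which.max(x)
  x[first.highest_position] <- -Inf  # mask the maximum so that ties cannot yield extra positions
  second.highest_position <- which.max(x)
  return(unname(c(first.highest_position, second.highest_position)))
}
ret <- matrix(NA, nrow = nrow(df1), ncol = 2)
for(i in seq_len(nrow(df1))){
  tmp <- unlist(df1[i, ])  # row i as a plain numeric vector
  tmp[l1[[i]]] <- -Inf     # excluded columns can never be selected
  # indices come back ordered by decreasing value; use sort(highest.two(tmp)) for ascending indices
  ret[i, ] <- highest.two(tmp)
}
ret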
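Because the actual data is said to have many more rows and columns, the same masking idea can also be applied to the whole table at once rather than row by row. The following is only a sketch under that assumption: the names m, excl and ret2 are made up for the example, and it relies on base R matrix indexing with a two-column (row, column) matrix.

```r
df1 <- data.frame(c1 = c(4.2, 1.2, 3.0), c2 = c(2.3, 1.8, 12.0),
                  c3 = c(1.2, 3.2, 2.0), c4 = c(2.2, 1.9, 0.9))
l1 <- list(x1 = c(2, 4), x2 = c(3), x3 = c(2))

m <- as.matrix(df1)

# One (row, column) pair per excluded cell, built from l1
excl <- cbind(rep(seq_along(l1), lengths(l1)),
              unlist(l1, use.names = FALSE))
m[excl] <- -Inf  # mask all excluded cells in one step

# apply() returns one column per row of m, hence the t()
ret2 <- t(apply(m, 1, function(x) order(x, decreasing = TRUE)[1:2]))
ret2
```

Each row of ret2 holds the column indices ordered by decreasing value. If a row has fewer than two columns left after exclusion, a masked position would be returned, so that case would need separate handling.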
Upvotes: 2