Reputation: 1769
I have the following matrix
> mat<-rbind(c(9,6),c(10,6),c(11,7),c(12,7),c(12,8),c(12,9),c(12,10),c(12,11),c(12,12),c(13,12))
> mat
[,1] [,2]
[1,] 9 6
[2,] 10 6
[3,] 11 7
[4,] 12 7
[5,] 12 8
[6,] 12 9
[7,] 12 10
[8,] 12 11
[9,] 12 12
[10,] 13 12
I would like to remove rows with duplicate values in the first column, keeping for each value the row whose second-column entry is the maximum. E.g. for the example above, the desired outcome is
[,1] [,2]
[1,] 9 6
[2,] 10 6
[3,] 11 7
[4,] 12 12
[5,] 13 12
I tried with
> mat[!duplicated(mat[,1]),]
but I obtained
[,1] [,2]
[1,] 9 6
[2,] 10 6
[3,] 11 7
[4,] 12 7
[5,] 13 12
which differs from the desired outcome in entry [4,2]. Any suggestions?
Upvotes: 0
Views: 55
Reputation: 895
First sort, then keep only the first row for each duplicated value:
mat <- mat[order(mat[,1], mat[,2]),]
mat[!duplicated(mat[,1]),]
EDIT: Sorry, I thought the last output in your question was the desired result. OK, so you want the max value:
mat<-rbind(c(9,6),c(10,6),c(11,7),c(12,7),c(12,8),c(12,9),c(12,10),c(12,11),c(12,12),c(13,12))
#Reverse sort
mat <- mat[order(mat[,1], mat[,2], decreasing=TRUE),]
#Keep only the first row for each duplicate, this will give the largest values
mat <- mat[!duplicated(mat[,1]),]
#finally sort it
mat <- mat[order(mat[,1], mat[,2]),]
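To check the steps above on the example data, here is a minimal sketch that wraps them in a function. The name `keep_max` is hypothetical and only used for illustration, and `drop = FALSE` is added so the result stays a matrix even if only one row survives.

```r
# Hypothetical helper wrapping the three steps above (the name is illustrative only).
keep_max <- function(m) {
  # Sort by column 1, then column 2, both in decreasing order
  m <- m[order(m[, 1], m[, 2], decreasing = TRUE), , drop = FALSE]
  # The first row seen for each column 1 value now holds the largest column 2 value
  m <- m[!duplicated(m[, 1]), , drop = FALSE]
  # Restore ascending order
  m[order(m[, 1], m[, 2]), , drop = FALSE]
}

mat <- rbind(c(9,6), c(10,6), c(11,7), c(12,7), c(12,8), c(12,9),
             c(12,10), c(12,11), c(12,12), c(13,12))
res <- keep_max(mat)
res
```

On the example matrix this should keep the row (12, 12) rather than (12, 7) for the duplicated value 12.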
Upvotes: 1
Reputation: 8836
Like Joseph's solution, but if you add row names first you can keep the original order (which will be the same in this case).
rownames(mat) <- 1:nrow(mat)
mat <- mat[order(mat[,1], -mat[,2]),]
mat <- mat[!duplicated(mat[,1]),]
mat[order(as.numeric(rownames(mat))),]
# [,1] [,2]
# 1 9 6
# 2 10 6
# 3 11 7
# 9 12 12
# 10 13 12
Upvotes: 1
Reputation: 326
You can sort the matrix first, using ascending order for column 1 and descending order for column 2. Filtering with !duplicated(mat[,1]) then keeps only the first row for each column 1 value, which after this sort is the row with the maximum column 2 value.
mat <- mat[order(mat[,1],-mat[,2]),]
mat[!duplicated(mat[,1]),]
[,1] [,2]
[1,] 9 6
[2,] 10 6
[3,] 11 7
[4,] 12 12
[5,] 13 12
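If the matrix has only these two columns, a different technique from the sort-and-filter approach above is to compute the maximum per group directly with base R's `aggregate`. This is only a sketch under that two-column assumption: `aggregate` returns a data frame with the groups sorted, so any additional columns or the original row order would be lost. The names `agg` and `res` are illustrative.

```r
mat <- rbind(c(9,6), c(10,6), c(11,7), c(12,7), c(12,8), c(12,9),
             c(12,10), c(12,11), c(12,12), c(13,12))

# Group the column 2 values by column 1 and take the maximum of each group.
# The result is a data frame with columns Group.1 (the key) and x (the max).
agg <- aggregate(mat[, 2], by = list(mat[, 1]), FUN = max)

# Convert back to a plain numeric matrix without dimnames
res <- unname(as.matrix(agg))
res
```

The sort plus `!duplicated` approach is the better fit when whole rows with more than two columns need to be kept.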
Upvotes: 3