Jack

Reputation: 85

Deleting rows in a data frame based on the contents of the rows

Suppose I have code like the following:

x1 <- list(1,2,3,4,5,5)
x2 <- list(1,4,7,8)
x3 <- list(5,6)
x4 <- list(1,4,4,5,6,7)
x5 <- list(1,2,3,5,6,9)
x6 <- list(1,4, 6,7,8,7)

myList <- list(x1, x2, x3, x4,x5,x6)

df <- data.frame(t(sapply(myList, function(x) {
  c(x, rep(tail(x, 1), max(lengths(myList)) - length(x)))
})))

This gives a data frame like this:

  X1 X2 X3 X4 X5 X6
1  1  2  3  4  5  5
2  1  4  7  8  8  8
3  5  6  6  6  6  6
4  1  4  4  5  6  7
5  1  2  3  5  6  9
6  1  4  6  7  8  7

How can I delete the two rows with the highest values of X6 and the two rows with the lowest values of X6?

Upvotes: 3

Views: 74

Answers (3)

989

Reputation: 12935

Try this (I updated my answer based on your updated sample df):

o <- order(unlist(df[names(df)[ncol(df)]]))
df[-c(head(o, 2), tail(o, 2)),]

#  X1 X2 X3 X4 X5 X6
#4  1  4  4  5  6  7
#6  1  4  6  7  8  7

names(df)[ncol(df)] gives the name of the rightmost column in df.
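To make the intermediate values visible, here is a minimal sketch of the same steps. It assumes the data frame holds plain numeric columns, so it rebuilds the question's data directly rather than from the nested lists; the names last_col, drop and result are only illustrative.

```r
# Rebuild the question's data with plain numeric columns (an assumption;
# the question constructs it from nested lists instead).
df <- data.frame(
  X1 = c(1, 1, 5, 1, 1, 1),
  X2 = c(2, 4, 6, 4, 2, 4),
  X3 = c(3, 7, 6, 4, 3, 6),
  X4 = c(4, 8, 6, 5, 5, 7),
  X5 = c(5, 8, 6, 6, 6, 8),
  X6 = c(5, 8, 6, 7, 9, 7)
)

last_col <- names(df)[ncol(df)]    # name of the rightmost column: "X6"
o <- order(df[[last_col]])         # row indices from lowest to highest X6
drop <- c(head(o, 2), tail(o, 2))  # the two lowest and the two highest rows
result <- df[-drop, ]              # negative indices remove those rows

o       # 1 3 4 6 2 5
result  # rows 4 and 6 remain
```

Because order() returns row positions rather than values, exactly four rows are removed even when X6 contains ties.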

Upvotes: 3

LyzandeR

Reputation: 37889

In base R, using subsetting with [:

# sort() orders the X6 values; rows matching the two lowest or two highest values are dropped
# (unlist() is a guard in case the column is stored as a list)
mycol <- unlist(df[[rev(names(df))[1]]])
df[!mycol %in% c(sort(mycol)[1:2], rev(sort(mycol))[1:2]), ]
#  X1 X2 X3 X4 X5 X6
#4  1  4  4  5  6  7
#6  1  4  6  7  8  7
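One thing worth checking with a value-based filter like this is how it behaves when X6 contains ties. The sketch below uses made-up data (the tied data frame and its id column are hypothetical, not from the question) to compare matching on values with matching on positions.

```r
# Hypothetical data with a three-way tie at the minimum of X6.
tied <- data.frame(id = 1:7, X6 = c(5, 5, 5, 7, 9, 7, 8))
vals <- tied$X6

# Value-based removal: drops every row whose X6 equals one of the
# two smallest or two largest sorted values (here 5, 5, 9 and 8).
by_value <- tied[!vals %in% c(sort(vals)[1:2], rev(sort(vals))[1:2]), ]

# Position-based removal: drops exactly four rows, whatever the ties.
o <- order(vals)
by_position <- tied[-c(head(o, 2), tail(o, 2)), ]

nrow(by_value)     # 2 (all three rows with X6 == 5 are gone)
nrow(by_position)  # 3 (one row with X6 == 5 survives)
```

On the question's sample data both variants give the same two rows; they only diverge when the extreme values are repeated, so which one is right depends on what "the 2 rows with the lowest values" should mean in that case.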

Upvotes: 3

MKR

Reputation: 20095

In base R, a few simple steps produce the desired result.

# Data is:
#   X1 X2 X3 X4 X5 X6
#1  1  2  3  4  5  5
#2  1  4  7  8  8  8
#3  5  6  6  6  6  6
#4  1  4  4  5  6  7
#5  1  2  3  5  6  9
#6  1  4  6  7  8  7

# Order on X6 (unlist() is a guard in case X6 is stored as a list column)
df <- df[order(unlist(df$X6)), ]
# > df
#   X1 X2 X3 X4 X5 X6
# 1  1  2  3  4  5  5
# 3  5  6  6  6  6  6
# 4  1  4  4  5  6  7
# 6  1  4  6  7  8  7
# 2  1  4  7  8  8  8
# 5  1  2  3  5  6  9

# Remove the first 2 rows (lowest X6)
df <- tail(df, nrow(df) - 2)

# Remove the last 2 rows (highest X6)
df <- head(df, nrow(df) - 2)

# The result
# > df
#   X1 X2 X3 X4 X5 X6
# 4  1  4  4  5  6  7
# 6  1  4  6  7  8  7
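If this is needed in more than one place, the steps could be wrapped in a small function. The following is only a sketch: trim_rows() and its arguments col and n are hypothetical names, not part of base R, and the example data is a cut-down version of the question's with numeric columns assumed.

```r
# trim_rows() is a hypothetical helper, not a base R function.
# It drops the n rows with the lowest and the n rows with the highest
# values in `col` (by default the rightmost column).
trim_rows <- function(df, col = names(df)[ncol(df)], n = 2) {
  stopifnot(nrow(df) >= 2 * n)
  o <- order(unlist(df[[col]]))                # positions, lowest to highest
  keep <- o[seq_len(length(o) - 2 * n) + n]    # skip n at each end
  df[sort(keep), , drop = FALSE]               # keep the original row order
}

# Two columns of the question's data, assumed numeric.
d <- data.frame(X1 = c(1, 1, 5, 1, 1, 1), X6 = c(5, 8, 6, 7, 9, 7))

trimmed2 <- trim_rows(d)         # rows 4 and 6
trimmed1 <- trim_rows(d, n = 1)  # rows 2, 3, 4 and 6
```

Unlike the step-by-step version, this sketch leaves the surviving rows in their original order instead of sorted by X6; dropping the sort() call would give the sorted order.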

Upvotes: 2
