justinian482

Reputation: 1075

Filter data frame column by variable name

I have the following data frame:

df <- data.frame(
  gene = c("A","B","C","D","E","F","G","H","I","J"),
  pos.rank = c(1,2,3,4,5,6,7,8,9,10),
  neg.rank = c(10,9,8,7,6,5,4,3,2,1),
  stringsAsFactors=TRUE
)

I am trying to filter the data frame based on the values 1:3 in the pos.rank or neg.rank column, like this:

library(dplyr)
x <- "neg.rank"
y <- "pos.rank"

df.x <- df[df[x] %in% 1:3, ]
df.y <- df[df[y] %in% 1:3, ]

But both df.x and df.y are empty. When I run df[x] I get this output:

   neg.rank
1        10
2         9
3         8
4         7
5         6
6         5
7         4
8         3
9         2
10        1

What am I doing wrong?

Upvotes: 1

Views: 120

Answers (3)

GKi

Reputation: 39737

Add a comma when subsetting, i.e. use df[,x] instead of df[x], to get a vector rather than a data.frame; %in% can then compare it element by element.

df.x <- df[df[,x] %in% 1:3, ]
df.y <- df[df[,y] %in% 1:3, ]

df.x
#   gene pos.rank neg.rank
#8     H        8        3
#9     I        9        2
#10    J       10        1

df.y
#  gene pos.rank neg.rank
#1    A        1       10
#2    B        2        9
#3    C        3        8
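
As a rough check of the point above, the sketch below inspects what each subsetting form returns for the question's data frame. The variable names one_col_df, vec and kept are only illustrative. It assumes a plain base R data.frame; a tibble behaves differently, since its `[` does not drop to a vector.

```r
# Same data frame as in the question
df <- data.frame(
  gene = c("A","B","C","D","E","F","G","H","I","J"),
  pos.rank = c(1,2,3,4,5,6,7,8,9,10),
  neg.rank = c(10,9,8,7,6,5,4,3,2,1),
  stringsAsFactors = TRUE
)
x <- "neg.rank"

# No comma: the result is still a data frame (a list holding one column)
one_col_df <- df[x]
is.data.frame(one_col_df)
length(one_col_df)   # counts columns, not rows

# With the comma: a plain numeric vector with one element per row
vec <- df[, x]
is.numeric(vec)
length(vec)

# drop = FALSE keeps the data frame, which brings the original problem back
kept <- df[, x, drop = FALSE]
is.data.frame(kept)
```

Because %in% works on the elements of its left-hand side, it sees a single list element in the first case and ten numbers in the second.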

Upvotes: 1

Ronak Shah

Reputation: 389335

Subsetting with [ returns a data frame; you need [[, which returns a vector.

df[x] %in% 1:3
#[1] FALSE

df[[x]] %in% 1:3
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE  TRUE  TRUE  TRUE

To subset the data on either column:

x <- "neg.rank"
y <- "pos.rank"
df[df[[x]] %in% 1:3 | df[[y]] %in% 1:3, ]

#   gene pos.rank neg.rank
#1     A        1       10
#2     B        2        9
#3     C        3        8
#8     H        8        3
#9     I        9        2
#10    J       10        1

If you want separate data frames:

df.x <- df[df[[x]] %in% 1:3, ]
df.y <- df[df[[y]] %in% 1:3, ]
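
If the same filter is needed for several columns, the pattern can be wrapped in a small function. This is only a sketch: filter_by_column is a hypothetical helper name, not something provided by base R or dplyr, and it assumes the column name arrives as a single string.

```r
# Same data frame as in the question
df <- data.frame(
  gene = c("A","B","C","D","E","F","G","H","I","J"),
  pos.rank = c(1,2,3,4,5,6,7,8,9,10),
  neg.rank = c(10,9,8,7,6,5,4,3,2,1),
  stringsAsFactors = TRUE
)

# Hypothetical helper: keep the rows whose `col` value is in `values`
filter_by_column <- function(data, col, values) {
  # Fail early on a misspelled or missing column name
  stopifnot(is.character(col), length(col) == 1L, col %in% names(data))
  # [[ extracts the column as a vector, so %in% yields one logical per row
  data[data[[col]] %in% values, , drop = FALSE]
}

df.x <- filter_by_column(df, "neg.rank", 1:3)
df.y <- filter_by_column(df, "pos.rank", 1:3)
```

The explicit column check is a design choice: with [[, a misspelled name returns NULL, and the filter would then silently produce an empty data frame.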

Upvotes: 3

AnilGoyal

Reputation: 26238

You are applying the condition to a data.frame without selecting a column (vector) from it. These will work:

df.x <- df[df[x][,1] %in% 1:3, ]
df.y <- df[df[y][,1] %in% 1:3, ]

Upvotes: 1
