Reputation: 1075
I have the following data frame:
df <- data.frame(
gene = c("A","B","C","D","E","F","G","H","I","J"),
pos.rank = c(1,2,3,4,5,6,7,8,9,10),
neg.rank = c(10,9,8,7,6,5,4,3,2,1),
stringsAsFactors=TRUE
)
I am trying to filter the data frame based on the values 1:3 in the pos.rank or neg.rank column, like this:
library(dplyr)
x <- "neg.rank"
y <- "pos.rank"
df.x <- df[df[x] %in% 1:3, ]
df.y <- df[df[y] %in% 1:3, ]
But both df.x and df.y are empty.
When I run df[x] I get this output:
neg.rank
1 10
2 9
3 8
4 7
5 6
6 5
7 4
8 3
9 2
10 1
What am I doing wrong?
Upvotes: 1
Views: 120
Reputation: 39737
Add a comma when subsetting, i.e. use df[,x] instead of df[x]. With the comma you get a vector rather than a data.frame, and a vector is what %in% needs in order to compare element by element.
df.x <- df[df[,x] %in% 1:3,]
df.y <- df[df[,y] %in% 1:3, ]
df.x
# gene pos.rank neg.rank
#8 H 8 3
#9 I 9 2
#10 J 10 1
df.y
# gene pos.rank neg.rank
#1 A 1 10
#2 B 2 9
#3 C 3 8
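As a small illustration of why the comma matters, the sketch below compares the three forms on a hypothetical three-row stand-in for the question's data (the names one_bracket, with_comma and kept are made up for this example). It assumes a base data.frame; tibbles do not drop a single column to a vector, so df[,x] would not help there.

```r
# Hypothetical stand-in data, mirroring the question's columns
df <- data.frame(
  gene = c("A", "B", "C"),
  neg.rank = c(3, 2, 1)
)
x <- "neg.rank"

one_bracket <- df[x]                  # list-style subsetting: stays a data.frame
with_comma  <- df[, x]                # matrix-style: a single column drops to a vector
kept        <- df[, x, drop = FALSE]  # drop = FALSE keeps the data.frame

is.data.frame(one_bracket)  # TRUE
is.numeric(with_comma)      # TRUE
is.data.frame(kept)         # TRUE
```

The drop argument is why the comma form is sometimes avoided in functions: the return type depends on how many columns are selected.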
Upvotes: 1
Reputation: 389335
Subsetting with [ returns a data frame. You need [[ instead, which returns the column as a vector.
df[x] %in% 1:3
#[1] FALSE
df[[x]] %in% 1:3
#[1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE TRUE TRUE TRUE
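One way to see why the first call gives a single FALSE is to look at lengths: a data frame is a list of columns, so df[x] has one element, while df[[x]] has one element per row. A minimal sketch on hypothetical three-row data (n_single, n_double and hits are names invented here):

```r
# Hypothetical stand-in data, mirroring the question's columns
df <- data.frame(
  gene = c("A", "B", "C"),
  neg.rank = c(3, 2, 1)
)
x <- "neg.rank"

n_single <- length(df[x])     # 1: a one-column data frame is a list of length 1
n_double <- length(df[[x]])   # 3: the column itself, one entry per row

hits <- df[[x]] %in% 1:2      # FALSE TRUE TRUE, usable as a row index
df[hits, ]
```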
To subset the data, keeping rows where either column matches:
x <- "neg.rank"
y <- "pos.rank"
df[df[[x]] %in% 1:3 | df[[y]] %in% 1:3, ]
# gene pos.rank neg.rank
#1 A 1 10
#2 B 2 9
#3 C 3 8
#8 H 8 3
#9 I 9 2
#10 J 10 1
If you want separate data frames:
df.x <- df[df[[x]] %in% 1:3, ]
df.y <- df[df[[y]] %in% 1:3, ]
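Since the question loads dplyr, here is a rough dplyr equivalent, offered as a sketch rather than as part of the original answer. It assumes dplyr is installed and uses the .data pronoun to look up a column by the name stored in a string; the five-row data and the name df.either are hypothetical.

```r
library(dplyr)

# Hypothetical stand-in data, mirroring the question's columns
df <- data.frame(
  gene = c("A", "B", "C", "D", "E"),
  pos.rank = 1:5,
  neg.rank = 5:1
)
x <- "neg.rank"
y <- "pos.rank"

# .data[[x]] refers to the column whose name is held in x
df.x <- filter(df, .data[[x]] %in% 1:2)
df.y <- filter(df, .data[[y]] %in% 1:2)

# Rows where either column matches
df.either <- filter(df, .data[[x]] %in% 1:2 | .data[[y]] %in% 1:2)
```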
Upvotes: 3
Reputation: 26238
You are applying the condition to a whole data.frame rather than to a column (vector) inside it. These will work:
df.x <- df[df[x][,1] %in% 1:3, ]
df.y <- df[df[y][,1] %in% 1:3, ]
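For a base data.frame, df[x][,1] picks out the same vector as the other spellings of column extraction. A quick check on hypothetical three-row data (same_as_comma and same_as_double are names invented here):

```r
# Hypothetical stand-in data, mirroring the question's columns
df <- data.frame(
  gene = c("A", "B", "C"),
  neg.rank = c(3, 2, 1)
)
x <- "neg.rank"

# All three forms extract the neg.rank column as a plain numeric vector
same_as_comma  <- identical(df[x][, 1], df[, x])   # TRUE
same_as_double <- identical(df[x][, 1], df[[x]])   # TRUE
```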
Upvotes: 1