Reputation: 8019
I am looking for a function that returns the indices of the rows in a data frame mydata
mydata=data.frame(group1=c(rep("MALE",6),rep("FEMALE",6)),group2=c(rep("TREATED",3),rep("UNTREATED",3)))
mydata
   group1    group2
1    MALE   TREATED
2    MALE   TREATED
3    MALE   TREATED
4    MALE UNTREATED
5    MALE UNTREATED
6    MALE UNTREATED
7  FEMALE   TREATED
8  FEMALE   TREATED
9  FEMALE   TREATED
10 FEMALE UNTREATED
11 FEMALE UNTREATED
12 FEMALE UNTREATED
for which columns are equal to particular factor levels, specified as a list
selection=list(group1="MALE",group2="TREATED")
In this example, the function would return the vector of selected row indices:
c(1,2,3)
What would be the easiest and fastest way to do this without explicit loops?
PS: The list selection could be of any length, and my data frame could have any number of columns with any names.
(I know about subset, but it is not quite what I am looking for.)
EDIT: I just wrote the following function to do this, but it is not elegant, so I was wondering whether a built-in function already does what I want:
mydata=data.frame(group1=c(rep("MALE",6),rep("FEMALE",6)),group2=c(rep("TREATED",3),rep("UNTREATED",3)))
selection=list(group1="MALE",group2="TREATED")
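selrows=function(mydata,selection) {
  nms=names(selection)
  # one logical column per entry of selection, initialised to TRUE
  sel=data.frame(matrix(TRUE,nrow=nrow(mydata),ncol=length(nms)))
  # seq_along() avoids iterating over 1:0 when selection is empty
  for (i in seq_along(nms)) { sel[,i]=(mydata[,nms[[i]]]==selection[[nms[[i]]]]) }
  # a row is selected when the product of its 0/1 flags is 1, i.e. every condition holds
  which(apply(sel*1,1,prod)==1)
}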
selrows(mydata,selection)
1 2 3
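One loop-free way to express the same test, sketched here under the assumption that each entry of selection holds exactly one level and that no level contains the separator character: paste the selected columns into a single key per row and compare it with the key built from selection. The name selrows_key is made up for this illustration and is not a built-in function.

```r
mydata <- data.frame(group1 = c(rep("MALE", 6), rep("FEMALE", 6)),
                     group2 = c(rep("TREATED", 3), rep("UNTREATED", 3)))
selection <- list(group1 = "MALE", group2 = "TREATED")

# selrows_key is a hypothetical helper name, not part of base R
selrows_key <- function(mydata, selection, sep = "\r") {
  # one composite key per row, built only from the columns named in selection
  row_keys <- do.call(paste, c(mydata[names(selection)], sep = sep))
  # the single key that a matching row must have (assumes one level per entry)
  target <- do.call(paste, c(selection, sep = sep))
  which(row_keys == target)
}

selrows_key(mydata, selection)
# [1] 1 2 3
```

Because the columns are picked with names(selection), the order of the entries in selection does not matter, and columns that are not mentioned are ignored.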
Upvotes: 1
Views: 1124
Reputation: 23818
Maybe this helps:
which(mydata[,1] %in% unlist(selection) & mydata[,2] %in% unlist(selection))
#[1] 1 2 3
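Note that the line above hard-codes columns 1 and 2 and pools every level in selection, so each column is also matched against levels meant for the other columns. A possible sketch that checks each column named in selection only against its own levels, and therefore also accepts several levels per entry, could look like this (rows_matching is a name invented for this example):

```r
mydata <- data.frame(group1 = c(rep("MALE", 6), rep("FEMALE", 6)),
                     group2 = c(rep("TREATED", 3), rep("UNTREATED", 3)))

# rows_matching is a hypothetical helper name, not part of base R
rows_matching <- function(mydata, selection) {
  # one logical vector per selected column:
  # is the value among the levels accepted for that column?
  hits <- lapply(names(selection),
                 function(nm) mydata[[nm]] %in% selection[[nm]])
  # keep a row only if every column matches; the all-TRUE start value
  # means an empty selection keeps every row
  which(Reduce(`&`, hits, rep(TRUE, nrow(mydata))))
}

rows_matching(mydata, list(group1 = "MALE", group2 = "TREATED"))
# [1] 1 2 3
rows_matching(mydata, list(group1 = c("MALE", "FEMALE"), group2 = "UNTREATED"))
# [1]  4  5  6 10 11 12
```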
Upvotes: 1