Reputation: 341
How do I go about searching a data.frame based on multiple criteria? For instance, I have a data.frame with columns such as Date, Time, Item, and Value, and I then want to search the data.frame for rows where Date = 1/2/2010, Time = 5pm, Item = Car, and Value = 5. Is there a function that will allow me to do that? More importantly, how do I obtain the row index of the data frame which has these values?
For example, say all these values are in the third row of the data frame. Is there a function which will search the data frame row by row and then output that the index is 3?
Upvotes: 4
Views: 35718
Reputation: 1934
You can use which, as in the following code:
# Build the data frame directly; wrapping the columns in cbind() would coerce Value to character.
df <- data.frame(Date = '1/2/2010', Time = '5pm', Item = 'Car', Value = 50000)
new <- data.frame(Date = '1/3/2010', Time = '6am', Item = 'keys', Value = 100)
df <- rbind(df, new)
# Return the row indices of the global df where all four columns match.
searchIndex1 <- function(dd, tt, itm, val){
  which(df$Date == dd & df$Time == tt & df$Item == itm & df$Value == val)
}
searchIndex1(dd='1/3/2010', tt='6am', itm='keys', val=100)
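This returns the row index 2.
Alternatively, you can use filter from the dplyr package:
library(dplyr)
searchIndex2 <- function(dd, tt, itm, val){
  # Record the original row numbers before filtering.
  df.with.index <- mutate(df, IDX = row_number())
  # Return the indices directly; ending the function on an assignment would return them invisibly.
  filter(df.with.index, Date == dd & Time == tt & Item == itm & Value == val)$IDX
}
searchIndex2(dd='1/2/2010', tt='5pm', itm='Car', val=50000)
This returns the row index 1.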
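If the set of columns to match varies, the same which idea can be written once for any number of criteria. The following is only a sketch in base R: find_rows and example_df are hypothetical names introduced for illustration, and the helper assumes every name in the criteria list is a column of the data frame.

```r
# Hypothetical helper: row indices of `data` where every named criterion matches.
find_rows <- function(data, criteria) {
  # One logical vector per criterion, comparing the named column to the wanted value.
  matches <- Map(function(col, value) data[[col]] == value, names(criteria), criteria)
  # Combine the vectors with AND, then convert the TRUE positions to row indices.
  which(Reduce(`&`, matches))
}

# Hypothetical sample data mirroring the two rows used above.
example_df <- data.frame(
  Date  = c("1/2/2010", "1/3/2010"),
  Time  = c("5pm", "6am"),
  Item  = c("Car", "keys"),
  Value = c(50000, 100)
)

find_rows(example_df, list(Date = "1/3/2010", Time = "6am", Item = "keys", Value = 100))
```

With this sample data the call should give 2, matching the result of searchIndex1 above.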
Upvotes: 3
Reputation: 1994
Sounds like you have a question about performing a query. If you are familiar with the dplyr package, you'll find functions such as filter that can help. However, you should be able to accomplish what you need using only the base package.
For instance, given a data frame, you can extract the row indices that match your criteria by using the which function:
indices <- which(data$Date == "1/2/2010" & data$Time == "5pm" & data$Item =="Car" & data$Value == 5)
Then you can use those indices to subset the data frame:
data_subset <- data[indices, ]
I hope this hypothetical example helps you get to the answer you need.
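To make the two steps above concrete, here is a minimal self-contained sketch. The sample rows are made up for illustration and arranged so that the values from the question sit in the third row. It also shows what to expect when nothing matches.

```r
# Hypothetical sample data; only the third row matches all four criteria.
data <- data.frame(
  Date  = c("1/1/2010", "1/2/2010", "1/2/2010"),
  Time  = c("9am", "5pm", "5pm"),
  Item  = c("Car", "Bike", "Car"),
  Value = c(5, 5, 5),
  stringsAsFactors = FALSE
)

# Row indices where every condition holds.
indices <- which(data$Date == "1/2/2010" & data$Time == "5pm" &
                 data$Item == "Car" & data$Value == 5)
indices  # 3

# Use the indices to pull out the matching rows.
data_subset <- data[indices, ]

# When nothing matches, which() returns an empty integer vector.
no_match <- which(data$Item == "Boat")
length(no_match) == 0  # TRUE
```

Checking length(indices) before subsetting is a simple way to tell "no match" apart from a real result.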
Upvotes: 7