Reputation: 317

How to filter R dataframes by using columns' numbers instead of their labels?

Here is the toy example:

d1 <- data.frame(
inds = 1:8, A = c("Go", "Nah", "Nah", "Go", "Nah", "Nah", "Go", "Go"), 
B = rep("Nah", 8), C = c("Go", "Nah", "Nah", "Go", "Go", "Nah", "Nah", "Go")
)

library(dplyr)
filter(d1, A == "Go" & C == "Go")

So the desired output is like this:

inds      A        B    C
<int>   <chr>   <chr>   <chr>
1        Go      Nah    Go
4        Go      Nah    Go
8        Go      Nah    Go

As you can see, I used the label of each column to define the logical condition. I need to know how I can do the same thing using the index (or number) of each column instead.

Something in this spirit:

filter(d1, c(2, 4) == "Go")

I wish this would work, but it does not.

Tiny Update:

I got my answer, but I also needed something further, which the accepted answer helped me work out. I thought it might be useful to a few future viewers.

Say some of my columns end in "_QC", and I need the indices of those columns so that I can filter on their values.

That's how I did it:

c1 <- grep("_QC", colnames(d1), fixed = TRUE)

filter(d1, across(all_of(c1), ~. == "Go"))
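A possible variation on the snippet above, offered as a sketch rather than a definitive version. It assumes a dplyr release that provides if_all() (1.0.4 or later), and the data frame d2 with its column names is hypothetical, made up only so that the example has "_QC" columns to work with. Note that grep() with fixed = TRUE matches "_QC" anywhere in the name, so the sketch anchors the pattern with "$" to match only names that end in it.

```r
library(dplyr)

# Hypothetical data: two columns whose names end in "_QC"
d2 <- data.frame(
  inds    = 1:4,
  temp_QC = c("Go", "Nah", "Go", "Go"),
  hum_QC  = c("Go", "Go", "Nah", "Go")
)

# "$" anchors the regular expression at the end of the column name
qc_idx <- grep("_QC$", colnames(d2))

# all_of() marks qc_idx as an external vector of column positions;
# if_all() keeps rows where the condition holds in every selected column
by_index <- filter(d2, if_all(all_of(qc_idx), ~ .x == "Go"))

# Same selection by name pattern, without computing indices first
by_name <- filter(d2, if_all(ends_with("_QC"), ~ .x == "Go"))
```

Both calls should keep rows 1 and 4 of d2. Selecting with ends_with() avoids the intermediate index vector when the indices are not needed for anything else.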

Upvotes: 0

Answers (3)

ThomasIsCoding

Reputation: 101688

A data.table option

> library(data.table)
> v <- c(2, 4)
> setDT(d1)[, .SD[rowSums(.SD[, v, with = FALSE] == "Go") == length(v)]]
   inds  A   B  C
1:    1 Go Nah Go
2:    4 Go Nah Go
3:    8 Go Nah Go

Upvotes: 1

akrun

Reputation: 887213

We can also use Reduce from base R

subset(d1, Reduce(`&`, lapply(d1[c(2, 4)], `==`, 'Go')))
#  inds  A   B  C
#1    1 Go Nah Go
#4    4 Go Nah Go
#8    8 Go Nah Go
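To make the one-liner easier to follow, here is the same idea split into steps. This is only an illustrative sketch; the intermediate names cols, flags and keep are introduced here and are not part of the answer.

```r
d1 <- data.frame(
  inds = 1:8,
  A = c("Go", "Nah", "Nah", "Go", "Nah", "Nah", "Go", "Go"),
  B = rep("Nah", 8),
  C = c("Go", "Nah", "Nah", "Go", "Go", "Nah", "Nah", "Go")
)

cols <- c(2, 4)

# One logical vector per selected column: TRUE where the value is "Go"
flags <- lapply(d1[cols], `==`, "Go")

# Combine the vectors element-wise with AND, so a row is kept
# only if it is TRUE in every selected column
keep <- Reduce(`&`, flags)

d1[keep, ]
```

Swapping `&` for `|` in the Reduce() call would instead keep rows where at least one of the selected columns equals "Go".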

Upvotes: 1

Ronak Shah

Reputation: 389047

You can use across():

library(dplyr)
d1 %>% filter(across(c(2, 4), ~. == 'Go'))

#  inds  A   B  C
#1    1 Go Nah Go
#2    4 Go Nah Go
#3    8 Go Nah Go
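As far as I know, more recent dplyr releases deprecate across() inside filter() in favour of if_all() and if_any(). A minimal sketch of the same filter, assuming a dplyr version that provides if_all() (1.0.4 or later):

```r
library(dplyr)

d1 <- data.frame(
  inds = 1:8,
  A = c("Go", "Nah", "Nah", "Go", "Nah", "Nah", "Go", "Go"),
  B = rep("Nah", 8),
  C = c("Go", "Nah", "Nah", "Go", "Go", "Nah", "Nah", "Go")
)

# Keep rows where every selected column (positions 2 and 4) equals "Go"
res <- d1 %>% filter(if_all(c(2, 4), ~ .x == "Go"))
res
```

Replacing if_all() with if_any() would keep rows where at least one of the selected columns equals "Go".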

In base R, we can use rowSums():

cols <- c(2, 4)
d1[rowSums(d1[cols] == 'Go') == length(cols), ]
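The rowSums() line works because comparing a data frame with "Go" gives a logical matrix, and summing a row of that matrix counts its TRUE values. A step-by-step sketch (the names hits, n_go, all_go and any_go are introduced here only for illustration):

```r
d1 <- data.frame(
  inds = 1:8,
  A = c("Go", "Nah", "Nah", "Go", "Nah", "Nah", "Go", "Go"),
  B = rep("Nah", 8),
  C = c("Go", "Nah", "Nah", "Go", "Go", "Nah", "Nah", "Go")
)

cols <- c(2, 4)

# Logical matrix with one row per observation and one column per selected column
hits <- d1[cols] == "Go"

# Number of "Go" values in each row (TRUE counts as 1)
n_go <- rowSums(hits)

# Rows where all selected columns are "Go"
all_go <- d1[n_go == length(cols), ]

# Rows where at least one selected column is "Go"
any_go <- d1[n_go > 0, ]
```

Here all_go should hold the rows with inds 1, 4 and 8, matching the dplyr output above.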

Upvotes: 3
