wetcoaster

Reputation: 367

Filter value in Multiple Columns in R

I am trying to filter a data frame based on a vector value (that comes from a loop) in multiple columns at the same time.

As this takes place in a loop, here are the pertinent steps:

name.id = NamesList$`First Name` [i]

In the above, I identify the unique row (name) I want to pass in this loop.

Next, I want to filter my data frame (test.df in this example) on that value: across all the columns whose names start with 'x' (there will be more than 3 in the real application), I want the rows where the targeted name appears and the group is 'A'.

output.df = test.df %>% filter(grepl('A', Group) & (c(x1, x2, x3) %in% name.id))

For example, using the sample data below, for the first pass of the loop, 'JOE' will be the name identified and filtered. I know I could create a long list saying x1 %in% name.id, x2 %in% name.id, but there will be 5+ columns and I know there's a more elegant way to reference the columns to filter than this.

Sample data:

x1 <- c('JOE','JOE','JOE','JOE', 'JOE', 'JOE', 'JOE', 'JOE', 'JOE', 'JOE', '','','','', 'FRED','FRED','FRED','FRED', 'FRED','FRED','JOE','JOE', 'FRED','FRED','JOE','JOE')

x2 <- c('ERIC','ERIC','ERIC','ERIC', 'ERIC', 'ERIC', 'ERIC', 'ERIC', 'ERIC', 'ERIC', '','','','', 'JOE','JOE','JOE','RON', 'RON','RON','RON','RON', 'RON','RON','FRED','FRED')

x3 <- c('SARAH','SARAH','SARAH','SARAH', 'SARAH', 'SARAH', 'SARAH', 'JOE', 'JOE', 'JOE', 'JOE','','','', 'JAY','JAY','JAY','JAY', 'JAY','JAY','JAY','JAY','JAY','','RON','RON')

State <- c('1','1','1','1', '1', '1', '1', '1', '1', '1', '2','2','2','2', '2','2','2','2', '2','2','2','2', '2','1','1','1')

Group <- c('A','A','A','B', 'B', 'B', 'A', 'B', 'A', 'B', 'A','A','A','B', 'A','A','A','B', 'NA','B','B','B', 'B', 'A','B','A')

test.df=cbind.data.frame(x1, x2, x3, State, Group)

Upvotes: 0

Views: 252

Answers (1)

JonMinton

Reputation: 1279

Using tidyverse.

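library(tidyverse)

# Assumes State, Group, x1, x2, x3 and name.id exist as defined in the question
dta <- tibble(State, Group, x1, x2, x3)
dta %>% 
    # Stack x1:x3 into key-value pairs: one row per original row and x column
    gather(key = "key", value = "value", x1:x3) %>%
    # Keep the rows whose value matches the target name from the loop
    filter(value %in% name.id)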

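The gather function moves the three columns x1, x2, x3 into two columns comprising key-value pairs. You can then filter on the value column alone. Note that each original row becomes three rows in the gathered data, so a name that appears in more than one x column of the same row produces one output row per match.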
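If you would rather keep the data in its original wide shape, a possible alternative is dplyr's if_any() (available in dplyr 1.0.4 and later) combined with starts_with("x"). The sketch below is only an illustration under a few assumptions: the small test.df is a cut-down stand-in for the question's data, name.id is hard-coded instead of coming from the loop, and Group == "A" is used instead of grepl('A', Group) because the sample data contains the string 'NA', which grepl('A', ...) would also match.

```r
library(dplyr)

# Cut-down stand-in for the question's test.df (illustrative values only)
test.df <- data.frame(
  x1    = c("JOE", "FRED", "FRED", "JOE", ""),
  x2    = c("ERIC", "JOE", "RON", "RON", ""),
  x3    = c("SARAH", "JAY", "JAY", "JOE", "JOE"),
  State = c("1", "2", "2", "2", "2"),
  Group = c("A", "A", "A", "B", "A"),
  stringsAsFactors = FALSE
)

# In the question this value comes from NamesList inside the loop
name.id <- "JOE"

output.df <- test.df %>%
  filter(
    Group == "A",                                  # exact match on the group
    if_any(starts_with("x"), ~ .x %in% name.id)    # name in any x* column
  )

print(output.df)
```

Because the columns are selected by prefix, this should keep working when the real data has five or more x columns, and each matching row appears only once regardless of how many columns contain the name.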

Upvotes: 1
