Melanie Shebel

Reputation: 2913

Remove rows in csv if value in cell is not located in another csv

I am new to this so I hope this is not a dumb question and that it's clear. I have two csv files and each file has a column named "admin." If a particular admin does not appear in the other file, I would like to remove all rows that contain the name.

For example, if "Lisa Kennedy" appears in the admin column in the main file, but not the other file, I would like to remove all lines in the main file that have her listed as admin. I only need to edit the main file. "Lisa Kennedy" may appear as admin in multiple rows and I would like to remove all instances.

My thought was to create an array with all the admin names, iterate through the array (and the main file), and remove the rows that way, but that would be slow, and I'm wondering if there is a more sophisticated way to do this.

Upvotes: 0

Views: 260

Answers (1)

sconfluentus

Reputation: 4993

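You can use a filtering join, as suggested above (semi_join keeps the rows of the main file whose admin appears in the other file, while anti_join keeps the ones that do not), or you can build a list and filter as follows:

library(dplyr)

# build the list of admins that appear in the other file
admin_list <- unique(df_2$admin)

# keep only the rows of the main file whose admin is in that list
output <- df_1 %>%
  filter(admin %in% admin_list)
# %in% tests membership in the list; put ! in front of the condition to negate it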
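
For the full round trip described in the question (read both files, drop the unmatched admins, write the main file back), a minimal sketch might look like the following. The sample rows, the `site` column, and the temporary file paths are made up for illustration; only the `admin` column name comes from the question. It uses `semi_join`, which keeps the rows of the first data frame that have a match in the second.

```r
library(dplyr)

# Hypothetical sample data, written to temp files so the sketch runs as-is
main_path  <- tempfile(fileext = ".csv")
other_path <- tempfile(fileext = ".csv")
write.csv(data.frame(admin = c("Lisa Kennedy", "Ann Lee", "Lisa Kennedy", "Bob Ray"),
                     site  = c("a", "b", "c", "d")),
          main_path, row.names = FALSE)
write.csv(data.frame(admin = c("Ann Lee", "Bob Ray", "Ann Lee")),
          other_path, row.names = FALSE)

df_1 <- read.csv(main_path)   # main file (the one to edit)
df_2 <- read.csv(other_path)  # other file (only used for lookup)

# Keep rows of df_1 whose admin also appears in df_2.
# semi_join adds no columns and does not duplicate rows
# when an admin is repeated in df_2.
output <- semi_join(df_1, df_2, by = "admin")

# Overwrite the main file with the filtered rows
write.csv(output, main_path, row.names = FALSE)
```

With real data, `main_path` and `other_path` would simply be the paths to the two existing csv files. Both `semi_join` and `%in%` are vectorized, so neither needs an explicit loop over the admin names.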

Upvotes: 1
