Reputation: 541
I have DF1 like this:
ID Name Team
222717 Bob Badgers
321817 James Tigers
521917 Eric Possums
And DF2 like this:
Badgers Tigers Possums
222717 438283 521917
789423 978748 251233
I want to check if the ID in DF1 appears in the corresponding team name in DF2. For example, in the first row, Bob's ID does appear under his team name, "Badgers," in DF2. James' ID does not appear under his team name, "Tigers," in DF2. I was thinking of adding a column that marks whether it appears or not, but can't figure out how to reference the column in DF2. Here's what I tried.
test <- mutate(DF1,validID=ifelse(ID%in%DF2$DF1$Team,"Yes",NA))
The DF2$DF1$Team
part is where I'm stuck. How do I reference the column in DF2 that corresponds to the team listed in DF1? Also open to alternative suggestions on how to manipulate the data to achieve this task.
Upvotes: 0
Views: 536
Reputation: 263342
The %in%
function is a compact way to access the match
function. mapply
is the canonical method to supply multiple columns for evaluation of their corresponding values in sequence.
DF1$right2 <- mapply( function(a,b) {a %in% DF2[[b]]}, a=DF1$ID, b=as.character(DF1$Team) )
#============
> DF1
ID Name Team right2
1 222717 Bob Badgers TRUE
2 321817 James Tigers FALSE
3 521917 Eric Possums TRUE
Upvotes: 3
Reputation: 15072
Honestly I find mapply
hard to conceptualise, and in any case 42's answer seems to return FALSE for Eric, when it ought to return true. Most likely a typo, but for future reference it's helpful to give your sample data in a format that lets you just copy the code and create the right objects!
This is a quick way of doing it avoiding map
or apply
functions, with only tidyverse
tools (and a magrittr
alias, but you can sub that out). Here I split the "finding the right column" and "checking if ID is there" into two steps, but you could combine if you wanted.
library(tidyverse)
library(magrittr)
df1 <- tibble(ID = c(222717, 321817, 521917),
Name = c("Bob", "James", "Eric"),
Team = c("Badgers", "Tigers", "Possums")
)
df2 <- tibble(Badgers = c(222717, 789423),
Tigers = c(438283, 978748),
Possums = c(521917, 251233)
)
df1 %>%
mutate(team_col = colnames(df2) %>% equals(Team) %>% which()) %>%
mutate(id_exists_for_team = ID %in% as_vector(df2[team_col]))
#> # A tibble: 3 x 5
#> ID Name Team team_col id_exists_for_team
#> <dbl> <chr> <chr> <int> <lgl>
#> 1 222717 Bob Badgers 1 TRUE
#> 2 321817 James Tigers 2 FALSE
#> 3 521917 Eric Possums 3 TRUE
Upvotes: 0