orangeman51
orangeman51

Reputation: 541

Reference column in another data frame by row entry

I have DF1 like this:

ID      Name      Team
222717  Bob       Badgers
321817  James     Tigers
521917  Eric      Possums

And DF2 like this:

Badgers    Tigers    Possums
222717     438283    521917
789423     978748    251233

I want to check if the ID in DF1 appears in the corresponding team name in DF2. For example, in the first row, Bob's ID does appear under his team name, "Badgers," in DF2. James' ID does not appear under his team name, "Tigers," in DF2. I was thinking of adding a column that marks whether it appears or not, but can't figure out how to reference the column in DF2. Here's what I tried.

test <- mutate(DF1,validID=ifelse(ID%in%DF2$DF1$Team,"Yes",NA))

The DF2$DF1$Team part is where I'm stuck. How do I reference the column in DF2 that corresponds to the team listed in DF1? Also open to alternative suggestions on how to manipulate the data to achieve this task.

Upvotes: 0

Views: 536

Answers (2)

IRTFM
IRTFM

Reputation: 263342

The %in% function is a compact way to access the match function. mapply is the canonical method to supply multiple columns for evaluation of their corresponding values in sequence.

DF1$right2 <- mapply( function(a,b) {a %in% DF2[[b]]}, a=DF1$ID, b=as.character(DF1$Team) )
#============
> DF1
      ID  Name    Team right2
1 222717   Bob Badgers   TRUE
2 321817 James  Tigers  FALSE
3 521917  Eric Possums   TRUE

Upvotes: 3

Calum You
Calum You

Reputation: 15072

Honestly I find mapply hard to conceptualise, and in any case 42's answer seems to return FALSE for Eric, when it ought to return true. Most likely a typo, but for future reference it's helpful to give your sample data in a format that lets you just copy the code and create the right objects!

This is a quick way of doing it avoiding map or apply functions, with only tidyverse tools (and a magrittr alias, but you can sub that out). Here I split the "finding the right column" and "checking if ID is there" into two steps, but you could combine if you wanted.

library(tidyverse)
library(magrittr)
df1 <- tibble(ID = c(222717, 321817, 521917),
              Name = c("Bob", "James", "Eric"),
              Team = c("Badgers", "Tigers", "Possums")
              )
df2 <- tibble(Badgers = c(222717, 789423),
              Tigers = c(438283, 978748),
              Possums = c(521917, 251233)
              )
df1 %>%
  mutate(team_col = colnames(df2) %>% equals(Team) %>% which()) %>%
  mutate(id_exists_for_team = ID %in% as_vector(df2[team_col]))
#> # A tibble: 3 x 5
#>       ID  Name    Team team_col id_exists_for_team
#>    <dbl> <chr>   <chr>    <int>              <lgl>
#> 1 222717   Bob Badgers        1               TRUE
#> 2 321817 James  Tigers        2              FALSE
#> 3 521917  Eric Possums        3               TRUE

Upvotes: 0

Related Questions