user10072474
user10072474

Reputation:

Mapping multiple columns using dplyr or tidyverse in R

My data are similar to the following data. Note some two surnames or names which I did not mentioned here

df1<-read.table(text= "id Surname Name
1234	Hamidsar	Smith
139	Sandarscom	Vicky
234	Bates	May
100	Murphu	Beki
941	Jool	Susan
469	Sali	John
990	susai	Alison",header=TRUE)

The another data frame is as follows:

df2<-read.table(text= "id Surname Name
990	susai	Anis
81B	Rosak	Roy
340	Molipoor	Vicky
139	Bates	May
941	Ameri	David
990	susai	Alison
139	Bates	May
101	CICI	Beki
139	Sandarscom	Vicky
1234	Hamidsar	Smith",header=TRUE)

I want to map id , surname and name to get the following table

id	Surname	Name	Map
1234	Hamidsar	Smith	id,Surname,Name
139	Sandarscom	Vicky	id,Surname,Name
234	Bates	May	Surname, Name
100	Murphu	Beki	Name
941	Jool	Susan	id
469	Sali	John	NA
990	susai	Alison	id,Surname,Name

Is it possible to get it ? I read some links, but they did not help me

Upvotes: 0

Views: 254

Answers (1)

Phil
Phil

Reputation: 8107

  library(tidyverse)
  library(glue)

First, I'm setting stringsAsFactors=FALSE to your code to produce the data to avoid pain down the road.

  df1<-read.table(text= "id Surname Name
1234    Hamidsar    Smith
139 Sandarscom  Vicky
234 Bates   May
100 Murphu  Beki
941 Jool    Susan
469 Sali    John
990 susai   Alison",header=TRUE, stringsAsFactors=FALSE)

  df2<-read.table(text= "id Surname Name
990 susai   Anis
81B Rosak   Roy
340 Molipoor    Vicky
139 Bates   May
941 Ameri   David
990 susai   Alison
139 Bates   May
101 CICI    Beki
139 Sandarscom  Vicky
1234    Hamidsar    Smith",header=TRUE, stringsAsFactors=FALSE)

Below I ensure id is set to character type so that it's the same in df2 and then I'm just applying a lookup to say "if this value is in df2, show the name of the variable, otherwise nothing". Then I'm using glue to put them together into a Map variable.

  df1 %>% 
    mutate(id = as.character(id),
           Map_id = if_else(id %in% df2$id, "id", NA_character_),
           Map_Surname = if_else(Surname %in% df2$Surname, "Surname", NA_character_),
           Map_Name = if_else(Name %in% df2$Name, "Name", NA_character_),
           Map = glue("{Map_id} {Map_Surname} {Map_Name}", .na = "") %>% 
             str_trim() %>% 
             str_replace_all(" ", ", ")) %>% 
    select(id, Surname, Name, Map)

#>     id    Surname   Name               Map
#> 1 1234   Hamidsar  Smith id, Surname, Name
#> 2  139 Sandarscom  Vicky id, Surname, Name
#> 3  234      Bates    May     Surname, Name
#> 4  100     Murphu   Beki              Name
#> 5  941       Jool  Susan                id
#> 6  469       Sali   John                  
#> 7  990      susai Alison id, Surname, Name

Upvotes: 2

Related Questions