Vishnu Sundaram
Vishnu Sundaram

Reputation: 25

How to use mapply to apply a function to two different columns in two different data frames

I am trying to find duplicates between two columns across two different data frames. After identifying duplicates, I want to extract an observation from the same row the duplication was on, but from a different column, and insert it into the other data frame. Let me provide an example:

Table1:
tobecopied   B   Checkfordup   D
Copy1        2   dupchk1       5
Copy2        3   dupchk5       4
Copy3        4   dupchk4       K

Table2:
tobepastedinto   B   Checkfordup   D
                 5   dupchk1       L
                 6   dupchk2       M
                 7   dupchk4       3

So after the code runs, Table two will look like this:

Updated Table2:

tobepastedinto   B   Checkfordup   D
Copy1            5   dupchk1       L
                 6   dupchk2       M
Copy3            7   dupchk4       3

What I have tried to do is make a function that does this and use mapply across both the columns. Here is what the code looks like:

             checknum <- function(x,y){
               if(y=x){
                 gsub(x,y,Table2$tobepastedinto)
               }
               else{""}
             }
            mapply(checknum,Table2$Checkfordup,Table1$Checkfordup)

The function is taking forever to run in R, and I am pretty sure I'm doing it incorrectly. Does anyone have a better solution for what I'm trying to do? Or is there a better way to use mapply?

EDIT: Here is the small data set. The NASET has no numbers in it. I want to see if any of mobiles in the Numberset match with those in the NASET and then add the corresponding Number to the NASET, even if the names do not match:

 NASET:
 name     Number     mobile
 VAN                 678
 GEORGE              6564
 STEVEN              76787



Numberset:
 name     Number     mobile
 TEU      7          678
 GEGE     6          64
 VEN      5          87
 TETK     7          678

Updated NASET:
NASET:
 name     Number     mobile
 VAN      7          678
 GEORGE              6564
 STEVEN              76787

Upvotes: 0

Views: 224

Answers (1)

akrun
akrun

Reputation: 887991

You could try

df2$tobepasteinto <- df1$tobecopied[match(df2$Checkfordup, df1$Checkfordup)]
df2$tobepasteinto[is.na(df2$tobepasteinto)] <- ''

Or

df2$tobepasteinto <-  mapply(function(x,y,z) {indx <- match(x,y)
                          ifelse(is.na(indx), '', z[indx])},
               df2$Checkfordup, list(df1$Checkfordup),list(df1$tobecopied))

Update

  NASET$Number <- Numberset$Number[match(NASET$mobile, Numberset$mobile)]
  NASET$Number[is.na(NASET$Number)] <- ''
  NASET
  #    name Number mobile
  #1    VAN      7    678
  #2 GEORGE          6564
  #3 STEVEN         76787

Or

  NASET$Number <- mapply(function(x,y,z) {
                     indx <- match(x,y)
                   ifelse(is.na(indx), '', z[indx])},
             NASET$mobile, list(Numberset$mobile), list(Numberset$Number))

Or

  library(dplyr)
  left_join(NASET[,-2], unique(Numberset[2:3]), by='mobile')
  #   mobile   name Number
  #1    678    VAN      7
  #2   6564 GEORGE     NA
  #3  76787 STEVEN     NA

data

df1 <-  structure(list(tobecopied = c("Copy1", "Copy2", "Copy3"), B = 2:4, 
Checkfordup = c("dupchk1", "dupchk5", "dupchk4"), D = c("5", 
"4", "K")), .Names = c("tobecopied", "B", "Checkfordup", 
"D"), class = "data.frame", row.names = c(NA, -3L))

 df2 <-  structure(list(tobepastedinto = c("", "", "", ""), B = 5:8,
  Checkfordup = c("dupchk1", "dupchk2", "dupchk4", "dupchk4"), 
  D = c("L", "M", "3", "5")), .Names = c("tobepastedinto", 
 "B", "Checkfordup", "D"), row.names = c(NA, -4L), class = "data.frame")

newdata

  NASET <- structure(list(name = c("VAN", "GEORGE", "STEVEN"), Number = c(NA, 
  NA, NA), mobile = c(678L, 6564L, 76787L)), .Names = c("name", 
  "Number", "mobile"), class = "data.frame", row.names = c(NA, -3L))

 Numberset <- structure(list(name = c("TEU", "GEGE", "VEN", "TETK"),
 Number = c(7L, 6L, 5L, 7L), mobile = c(678L, 64L, 87L, 678L)), .Names =
  c("name", "Number", "mobile"), class = "data.frame", row.names = c(NA, 
 -4L))

Upvotes: 1

Related Questions